DeepMind Unveils AI Pointer for Hands-On Digital Interaction
Google DeepMind has developed an AI-enabled pointer that lets users interact with digital content by pointing at it, reducing the need for long text prompts.
The technology captures both the visual and semantic context around the pointer, so the computer can identify what the user is referring to and what they intend to do with it. The system supports natural, shorthand commands that combine pointing, speech, and on-screen context.
Key Functionality
- The AI interprets both where the user points and why, so a command does not need to spell out its target.
- Users can pair gestures with simple voice commands for faster, more natural interaction.
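To make the idea concrete, here is a minimal Python sketch of how a pointer position, the on-screen element under it, and a short spoken command might be bundled into a single request. DeepMind has not published the implementation, so every name here (PointerContext, build_request, the dictionary keys) and the sample recipe text are hypothetical and purely illustrative.

```python
from dataclasses import dataclass

# Hypothetical sketch: none of these names come from DeepMind or the Gemini API.


@dataclass
class PointerContext:
    x: int              # pointer position in screen pixels
    y: int
    element_kind: str   # kind of content under the pointer, e.g. "table" or "recipe"
    element_text: str   # text extracted from the element under the pointer


# Words that refer to "whatever is being pointed at" rather than naming it.
DEICTIC_WORDS = {"this", "that", "these", "those", "it"}


def build_request(context: PointerContext, spoken_command: str) -> dict:
    """Bundle a short spoken command with the content under the pointer."""
    words = [w.strip(".,?!") for w in spoken_command.lower().split()]
    needs_pointer_context = any(w in DEICTIC_WORDS for w in words)
    return {
        "command": spoken_command,
        "target_kind": context.element_kind,
        "target_text": context.element_text,
        "pointer": (context.x, context.y),
        "needs_pointer_context": needs_pointer_context,
    }


# Illustrative values only.
request = build_request(
    PointerContext(x=412, y=230, element_kind="recipe",
                   element_text="2 cups flour, 1 egg"),
    "Double this",
)
```

In this sketch, the spoken command "Double this" never names the recipe. The pointer context supplies the missing target, which is the kind of shorthand the feature is meant to support.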
Use Cases
- Pointing to a PDF to request a bullet-point summary ready to paste into an email.
- Hovering over a table of statistics to generate a pie chart.
- Highlighting a recipe to ask for the ingredients to be doubled.
- Pointing to a paused frame in a travel video to receive a direct booking link.
Availability
Demos are currently available in Google AI Studio. Support for using the pointer with Gemini in Chrome is expected to roll out soon.
Examples in Action
Users can select products on a page to compare their features, or point to a location to visualize how a new couch would look in that space.