In a market as hard fought as smartphones, sometimes it makes sense to watch the competition. That’s particularly true when it comes to Google’s Android and Apple’s iOS in the realm of smartphone AI.
Google has already announced several important AI-based features for Android phones, while Apple is widely recognized as having fallen behind in the “AI on the smartphone” race. Tech industry observers expect that situation to change in about a month when Apple is predicted to introduce a raft of new generative AI-powered features at its Worldwide Developers Conference (WWDC).
In the meantime, however, Google continues to pad its lead, unveiling multiple new capabilities for Android this week at its own Google I/O developer conference. The company kicked off with an expansion of its already impressive circle-to-search function, which it first unveiled along with Samsung at that company’s Galaxy Unpacked event earlier this year.
In case you haven’t seen it, circle-to-search provides a very intuitive, graphical way to look up anything currently showing on your phone’s screen. You long press on your Android phone’s home button and, as the name suggests, simply use your finger (or a stylus) to select an object or text with a circle or a scribble. Google then automatically runs a search on the highlighted item. You can even ask questions about the object to learn more about it.
It's a simple but very useful extension of your phone (or tablet) that leverages the AI processors in the latest devices to essentially “see” from the inside out what’s on your screen. More importantly, it’s the kind of experience that finally makes your smartphone feel, well, smart. After all, if you can see what’s on the screen, why shouldn’t it?
The latest extension to circle-to-search is a homework helper feature that, interestingly, seems to share a number of similarities with the latest additions to ChatGPT that OpenAI just introduced with its GPT-4o model. Google’s version can help with physics and math word problems that students are viewing on their device screens, explaining along the way how to solve them (and not just giving the answers). It’s a great example of how AI-powered features can bring incredibly useful new experiences to our phones and tablets.
Google also described how it is more deeply integrating its Gemini generative AI models throughout Android. Google provided examples of how Gemini will let you do things like drag and drop AI-generated images into documents, emails, and messages. In addition, thanks to Gemini’s summarization capabilities, you’ll be able to find the specific information you’re looking for in a video through a feature the company is calling “Ask this Video.”
One of the big trends that Apple is expected to highlight at WWDC is the ability to run large language models (LLMs), which power generative AI features, directly on iPhones. This will allow some applications to function on their own instead of having to go to the cloud. While that may not initially sound like a big deal, this approach offers several advantages, particularly with regard to privacy and even performance. To be clear, general-purpose searches will continue to need an external connection, but applications and experiences that leverage your own documents, emails, messages, etc., can be done solely on the device, preventing the possible exposure of private information.
Google recognizes those benefits as well and pointed out that, with its Gemini Nano model built into the next version of Android, Android will be the first mobile operating system with a built-in on-device AI model. More importantly, Google also said it will bring a multimodal version of Gemini Nano (meaning one that recognizes spoken language, sound and camera inputs in addition to text) to Android later this year. This should open up a dramatically enhanced set of experiences and enable the creation of powerful digital assistants that can understand and intelligently respond to your requests. In fact, Google teased its intriguing vision for what a digital assistant can be at the I/O event with its Project Astra.
Google also demonstrated some AI applications that go well beyond the things we’ve seen generative AI typically used for. The new TalkBack feature, for example, which will leverage the Gemini Nano multimodal features, can describe images to people who are sight impaired. The company also showed a Scam Detection feature that can listen to a phone conversation you’re having and warn you if it believes the call is a type of fraud. While some people may be understandably concerned about an AI-powered agent monitoring a conversation, the processing occurs only on the device. (It’s also a good example of why running certain applications only on the device is so important.)
For Android phone users, expect to see these kinds of AI-powered features rolled out to newer generation devices throughout the course of the year. For iPhone owners, Apple will be introducing its own set of AI-powered features, but there’s a good chance that many of them will be similar to what Google announced. In fact, it’s even rumored that Apple may be licensing some technologies from both Google and OpenAI to integrate into its next version of iOS.
Regardless, it’s clear that we are entering into an exciting new era of truly “smart” devices with AI-powered features that should make the experience of using them both more intuitive and more rewarding.
USA TODAY columnist Bob O'Donnell is the president and chief analyst of TECHnalysis Research, a market research and consulting firm. You can follow him on Twitter @bobodtech.