Google and OpenAI have announced significant updates for their AI models and features, creating more competition for Apple ahead of WWDC.
Apple will have a lot of catching up to do if it wants to compete with Google and OpenAI
On Monday, OpenAI announced its GPT-4o AI model and an all-new Mac app, while Google previewed major improvements to its Gemini software on Tuesday. The two companies showcased a variety of new features, making the market even more competitive.
While Apple has seemingly fallen far behind in its AI endeavors, a partnership with Google or OpenAI could prove to be an easy way of offering generative AI features to its user base. Rumors, at least, suggest that is a path Apple is willing to take.
OpenAI updates
OpenAI recently introduced GPT-4o, a new multi-modal version of the company's GPT AI model with enhanced capabilities for processing different input types.
Unlike its predecessors, GPT-4o will be able to use a single neural network to process audio, images and text, offering significant improvements over earlier models as a result. Increases in speed and language processing were also touted during the product announcement.
OpenAI's GPT-4o will be able to understand and convey emotions. During the company's recent event, team members demonstrated this by asking the model to analyze facial expressions and determine the specific emotions a user was expressing.
OpenAI's ChatGPT is now officially available on macOS
With the improved Voice Mode feature, which provides audio output in the form of speech, GPT-4o can adjust the tone of its voice, making it more robotic or more natural depending on the user's request.
The company has also launched a new desktop application for ChatGPT, which is available on macOS, and has introduced a new API for developers. GPT-4o will be available to users through a gradual rollout process.
Google's Gemini updates
At its I/O developer conference on Tuesday, Google revealed a multitude of enhancements to its Gemini model. The new-and-improved Google Gemini will be able to understand more complex user input and images while taking into account the context behind them.
Google Gemini is a generative AI tool
The AI software will feature new context-aware capabilities, meaning that it can see everything on screen, whether it's a PDF, a video, or a series of text messages. Gemini will be able to gather information and generate output, but only on select Android devices.
With its new Circle to Search option, for instance, users will be able to select individual objects within an image and instantly receive Google Search results about said object.
Another feature available exclusively on Android will provide users with the option to analyze YouTube videos and PDFs via Gemini Advanced. With the paid service, users will be able to ask specific questions, and will receive answers taken from the content of said video or PDF.
Google's updated Gemini will be able to summarize lengthy conversations and isolate key information from documents, images and videos, all of which should be greatly beneficial to its end-users. Apple is pursuing similar features via its own products.
What we know about Apple's AI strategy so far
Apple is noticeably behind the competition when it comes to its AI offerings, but that could all change very soon with the announcement of iOS 18 in early June.
For well over a year, Apple has been working on its in-house large language model (LLM) known as Ajax. With its generative AI software, the company aims to offer new features similar to those announced by Google and OpenAI in early May.
As part of its recent AI push, Apple is expected to introduce several AI-powered features across its new operating systems. Document and webpage analysis, text summarization, image captioning, and response generation are all in the works.
The company seeks to embed generative AI technology into its existing assortment of core system applications. As a result, apps like Notes, Safari, Messages, Mail, Siri and Spotlight Search are all expected to receive AI-enabled enhancements in one way or another.
Apple's Ajax LLM will improve Safari, Spotlight and Messages
In terms of actual functionality, however, there are limits to what Apple has been able to achieve. The on-device AI model in testing is only capable of rudimentary text analysis and basic on-device response generation.
More advanced features will seemingly necessitate cloud-based processing, which is why Apple is reportedly looking to establish a licensing arrangement with OpenAI. This would allow Apple to offer a variety of AI-related enhancements which its own on-device models cannot facilitate.
A separate rumor claims that Apple wants to create an "AI App Store" through which users could purchase AI-themed applications and products from other companies. This would, in theory, give users the option to use paid versions of products, such as Gemini Advanced.
We will gain a better understanding of Apple's AI endeavors soon enough, as the company is expected to debut its new generative AI features at its annual Worldwide Developers Conference on June 10.