Google’s annual developer conference, Google I/O 2024, has been a hub of major announcements, with Artificial Intelligence (AI) taking center stage. Here’s a roundup of the key revelations.
1. Gemini
Google unveiled Gemini 1.5 Flash, the fastest Gemini model served in the API. It’s a cost-efficient alternative to Gemini 1.5 Pro, yet highly capable. Gemini 1.5 Pro, launched in February, has been upgraded to provide better-quality responses in areas like translation, reasoning, coding, and more. Google is also previewing a two-million-token context window in both Gemini 1.5 Pro and Gemini 1.5 Flash.
Gemini Advanced and Gemini Nano
Gemini 1.5 Pro, with its 1-million-token context window, will be available to consumers in Gemini Advanced. This will allow users to get AI assistance on large bodies of work, such as 1,500-page PDFs. Gemini Nano, designed to run on smartphones, now handles images in addition to text.
Gemma 2 and PaliGemma
Gemma, the sister family of models to Gemini, is getting a major upgrade with the launch of Gemma 2. The next generation of Gemma has been optimized for TPUs and GPUs and is launching at 27B parameters. PaliGemma, Google’s first vision-language model, is also being added to the Gemma family of models.
2. Google Search
AI Overview Feature: Previously limited to Search Labs, this feature is now available to everyone in the U.S. It uses a new Gemini model to provide conversational, abridged answers to search queries.
AI-Organized Results Page: This feature uses AI to create unique headlines that better suit users’ search needs. It will initially roll out for English-language searches in the U.S.
New Search Features in Search Labs: Users will soon be able to adjust their AI overview to suit their preferences and use video for searches. Search can also plan meals and trips with users.
3. Veo (Text-to-Video Generator)
Google’s new model, Veo, can generate high-quality videos that closely represent the user’s vision. It’s currently available for select creators as a private preview inside VideoFX.
4. Imagen 3
Google’s next-generation text-to-image generator, Imagen 3, produces high-quality images with more detail and fewer artifacts. It’s available in private preview inside ImageFX for select creators.
5. SynthID Updates
Google is expanding SynthID, its technology for watermarking AI-generated images, to cover text and video as well.
6. Ask Photos
With Gemini, users can search Google Photos for images using conversational prompts. This feature will roll out later this summer.
7. Gemini Advanced Upgrades
Google is bringing Gemini 1.5 Pro to Gemini Advanced, which allows users to upload larger materials, and introducing Gemini Live, a new mobile experience where users can have full conversations with Gemini. Users will also be able to use their camera with Live, giving Gemini context about the world around them during conversations.
Gems for Gemini: Similar to OpenAI’s GPTs, users can create custom versions of Gemini to suit different purposes.
Planning Experience in Gemini Advanced: In the coming months, Gemini Advanced will include a new planning experience that produces detailed plans tailored to users’ own preferences.
8. AI in Android
Circle to Search: This feature allows users to perform a Google search by circling images, videos, and text on their phone screen. It can now assist students with homework by walking them through equations and math problems when they circle them. The feature will work with topics ranging from math to physics and will eventually be able to process complex problems like symbolic formulas, diagrams, and more.
Gemini as Default AI Assistant: Gemini will replace Google Assistant, becoming the default AI assistant across Android phones. It can be accessed with a long press of the power button. Eventually, Gemini will be overlaid across various services and apps, providing multimodal support when requested.
Gemini Nano’s Multimodal Capabilities: These will be leveraged through Android’s TalkBack feature, providing more descriptive responses for users who experience blindness or low vision.
Spam Call Detection: If a user accidentally picks up a spam call, Gemini Nano can listen in and detect suspicious conversation patterns. It will then prompt the user to choose either “Dismiss & continue” or “End call.” Users will be able to opt in to this feature later this year.
Read more: www.zdnet.com