OpenAI Unveils GPT-4o, Its Omni-capable AI Assistant

OpenAI has unveiled its latest generative AI model, GPT-4o. The ‘o’ in GPT-4o stands for ‘omni’, highlighting the model’s ability to handle text, speech, and video inputs.

GPT-4o is set to be rolled out iteratively across OpenAI’s developer and consumer-facing products in the coming weeks. The new model matches the intelligence of its predecessor, GPT-4, while improving on GPT-4’s capabilities across multiple modalities and media.

One of the key features of GPT-4o is its ability to reason across voice, text, and vision, a capability OpenAI positions as central to more natural interaction between humans and machines. For instance, GPT-4o can quickly answer questions about a photo or a desktop screen, from analyzing software code to identifying the brand of a shirt in a picture.
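For developers, this kind of image question-answering is accessible through OpenAI’s API. As a rough illustration only, the Python sketch below sends a photo URL alongside a text question to the gpt-4o model using the official openai SDK; the image URL and the question are placeholders, and the exact multimodal features available at launch may differ from this sketch.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Hypothetical image URL and question, for illustration only.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What brand is the shirt in this photo?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/shirt-photo.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```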

GPT-4o also brings significant improvements to OpenAI’s AI-powered chatbot, ChatGPT. Users can now interact with ChatGPT more like an assistant, even interrupting it mid-answer. The model delivers real-time responsiveness, picks up on nuances in a user’s voice, and generates responses in a range of emotive styles, including singing.

In terms of multilingual capabilities, GPT-4o shows enhanced performance in around 50 languages. It is also twice as fast as its predecessor, GPT-4 Turbo, at half the price, with higher rate limits.

Read more: techcrunch.com