OpenAI, the Microsoft-backed creator of the viral ChatGPT, has launched a new AI model, GPT-4o, that can interact with the world through audio, vision, and text in real time. The new flagship model aims to give users a more natural form of human-computer interaction.
During a presentation on Monday, OpenAI demonstrated that GPT-4o can respond to queries in under a third of a second, matching human conversational response times.
Utilizing a smartphone’s camera and microphone, GPT-4o can understand audio and visual inputs and respond with a personalized and natural voice.
OpenAI CEO Sam Altman described the new technology as “magical” and the best computer interface he has ever used.
“It feels like AI from the movies; and it’s still a bit surprising to me that it’s real,” he wrote.
“The original ChatGPT showed a hint of what was possible with language interfaces; this new thing feels viscerally different. It is fast, smart, fun, natural, and helpful.”
OpenAI plans to offer GPT-4o for free, making it available within the next few weeks. To prevent misuse or potential harm, the company conducted extensive testing covering cybersecurity and psychology, and worked with more than 70 external experts to identify and mitigate risks related to social psychology, bias, fairness, and misinformation.
“We tested both pre-safety-mitigation and post-safety-mitigation versions of the model, using custom fine-tuning and prompts, to better elicit model capabilities,” the company explained in a blog post introducing the product.
Despite its advancements, GPT-4o has limitations that OpenAI aims to address in future versions. Its mistakes include switching languages without being prompted, making translation errors, and mispronouncing names.
This announcement comes just before Google I/O, Google’s annual developer conference, which is expected to focus heavily on AI. According to Leo Gebbie, a principal analyst at CCS Insight, Google needs to clearly convey the benefits of AI to avoid consumer fatigue.