Discover how these powerhouse models stack up against each other. 🍎🥤
Last week, AI startup OpenAI and tech giant Google went head-to-head with their respective events: OpenAI’s Spring Update and Google’s I/O developer conference. Both companies revealed advancements in their generative AI models, focusing on enhancements in their interfaces and functionalities.
OpenAI introduced GPT-4o, its latest AI model capable of real-time reasoning across audio, vision, and text. This model is designed to streamline user interactions by reducing the number of tokens required for various languages, thereby improving efficiency and performance. Meanwhile, Google unveiled updates to its Gemini AI, aiming to enhance its capabilities and user experience.
Understanding the differences between these models, such as OpenAI’s GPT-4o and Google’s Gemini 1.5 Pro, can be complex due to the specialized terminology used, like tokens and parameters. To get a better grasp, CNET’s Imad Khan offers reviews comparing AI chatbots, detailing the strengths and weaknesses of each.
For a deeper insight, I consulted a tech executive with 30 years of experience (who preferred to remain anonymous). He helped clarify the distinctions and implications of these updates, providing a seasoned perspective on the evolving landscape of generative AI.
“In my head, it’s like Coke and Pepsi. You know what I mean?” he said.
Here’s what he means:
Coke and Pepsi are both colas, but their distinct formulas result in different tastes. Similarly, GPT-4o and Gemini 1.5 Pro are advanced language models crafted to understand and generate human-like text responses based on given prompts. However, the responses from ChatGPT will not be identical to those from Gemini.
Though similar, they have unique integrations: ChatGPT works seamlessly with Microsoft products and also operates independently, while Gemini is designed for Google’s products.
Both models offer free and subscription-based versions. ChatGPT Plus and Gemini Advanced each cost $20 per month, providing access to the latest models and additional features.
Since ChatGPT’s launch in late 2022, a competitive race in generative AI has ensued. Companies like Anthropic, alongside giants like Google and Microsoft, continuously update their chatbots and explore advancements in video, audio, and gaming.
Just as soda preferences vary, the choice between generative AI models depends on individual needs and preferences, influenced by each platform’s branding and marketing.
Comparing GPT-4o and Gemini 1.5 Pro, we see differences in context windows and parameters. The context window is the amount of text, measured in tokens, that a model can consider at once when generating a response. Google’s Gemini 1.5 Pro recently expanded to a 1 million token context window, aiming for 2 million tokens later this year. In contrast, GPT-4o and GPT-4 offer context windows of 128,000 tokens.
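To make those numbers concrete, here is a minimal Python sketch that checks whether a document of a given length would fit in each context window. The window sizes come from the figures above. The ratio of 0.75 English words per token is only a common rule of thumb that I am assuming here, and the helper names are made up for illustration; real token counts depend on each model’s tokenizer.

```python
# Context window sizes in tokens, as quoted in this article.
CONTEXT_WINDOWS = {
    "Gemini 1.5 Pro": 1_000_000,
    "GPT-4o": 128_000,
}

# Assumption: about 0.75 English words per token (a rough rule of thumb,
# not an official figure from OpenAI or Google).
WORDS_PER_TOKEN = 0.75


def estimate_tokens(word_count: int) -> int:
    """Roughly estimate how many tokens a text of word_count words uses."""
    return round(word_count / WORDS_PER_TOKEN)


def fits(word_count: int) -> dict[str, bool]:
    """Report, per model, whether the estimated token count fits its window."""
    tokens = estimate_tokens(word_count)
    return {model: tokens <= window for model, window in CONTEXT_WINDOWS.items()}


if __name__ == "__main__":
    for words in (75_000, 300_000):
        print(words, "words ->", estimate_tokens(words), "tokens:", fits(words))
```

Under these assumptions, a 75,000-word book fits in both windows, while a 300,000-word document fits only in the larger one. The quoted Gemini window is also about 7.8 times the size of GPT-4o’s (1,000,000 divided by 128,000).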
Regarding parameters, which determine the model’s processing ability, Google has not disclosed specifics for Gemini, with estimates ranging from 1.6 trillion to 175 trillion parameters. OpenAI’s GPT-4o is said to bring GPT-4-level intelligence, and GPT-4 reportedly uses 1.8 trillion parameters.
In terms of information access, Gemini’s internet connectivity gives it an edge over earlier models like GPT-3.5. However, GPT-4o benefits from recent data and partnerships, such as those with Reddit and News Corp, making the comparison less straightforward.
Language support differs too, with GPT-4o available in 50 languages and Gemini 1.5 Pro in 35, though Google’s extensive experience with Google Translate may enhance its multilingual capabilities.
Both models have introduced new conversational interfaces. GPT-4o allows for voice interaction and live video sharing, while Google’s Gemini Live also supports real-time conversation interruptions.