ChatGPT-4o Outperforms Gemini 1.5 Pro in Comprehensive Tests

OpenAI’s ChatGPT-4o and Google’s Gemini 1.5 Pro, two flagship AI models, were put to the test to see which one performs better. The results were clear: ChatGPT-4o outperformed Gemini 1.5 Pro in a variety of tasks, including reasoning, code generation, and multimodal understanding.

In a series of tests, ChatGPT-4o demonstrated superior performance. For instance, when asked to create a Python game, ChatGPT-4o generated correct, working code within seconds, while Gemini 1.5 Pro failed to do so.

In a classic reasoning test, both models were asked: “If it takes 1 hour to dry 15 towels under the Sun, how long will it take to dry 20 towels?” The trick is that towels dry in parallel, so given enough space to spread them out, 20 towels take the same hour as 15. ChatGPT-4o spotted this and answered correctly, while Gemini 1.5 Pro missed the trick and reached a wrong conclusion.
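To make the trick explicit, here is a minimal sketch of the drying logic. It assumes all towels dry in parallel; the capacity parameter is a hypothetical illustration of when batching would matter, not part of the original question:

```python
import math

def drying_time_hours(num_towels, hours_per_batch=1.0, capacity=None):
    """Towels dry in parallel under the sun, so with enough space the
    time does not depend on how many towels there are. Only if space
    (capacity) were limited would extra batches be needed."""
    if capacity is None or num_towels <= capacity:
        return hours_per_batch
    return math.ceil(num_towels / capacity) * hours_per_batch

print(drying_time_hours(15))  # 1.0 hour
print(drying_time_hours(20))  # still 1.0 hour -- the towel count does not matter
```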

In the magic elevator test, both models were asked: “There is a tall building with a magic elevator in it. When stopping on an even floor, this elevator connects to floor 1 instead. Starting on floor 1, I take the magic elevator 3 floors up. Exiting the elevator, I then use the stairs to go 3 floors up again. Which floor do I end up on?” The elevator stops on floor 4, an even floor, so it connects to floor 1; climbing 3 more floors by stairs ends on floor 4. Both models arrived at this correct answer.
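A small sketch of the elevator’s rule makes the arithmetic easy to check (the function name is just for illustration):

```python
def ride_magic_elevator(start_floor, floors_up):
    """The elevator goes up as asked, but if it stops on an even floor
    it connects to floor 1 instead."""
    stop = start_floor + floors_up
    return 1 if stop % 2 == 0 else stop

floor = ride_magic_elevator(start_floor=1, floors_up=3)  # stops on 4 (even) -> floor 1
floor += 3                                               # stairs: 1 + 3 = 4
print(floor)  # 4
```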

In another test, the models were asked: “There is a basket without a bottom in a box, which is on the ground. I put three apples into the basket and moved the basket onto a table. Where are the apples?” Because the basket has no bottom, the apples fall through into the box, which stays on the ground when the basket is moved. ChatGPT-4o correctly stated that the apples are in the box on the ground, while Gemini 1.5 Pro failed to grasp this nuance.

In a commonsense reasoning test, the models were asked: “What’s heavier, a kilo of feathers or a pound of steel?” ChatGPT-4o correctly pointed out that a kilogram (about 2.2 pounds) of any material weighs more than a pound of anything else, while Gemini 1.5 Pro incorrectly stated that the two weigh the same.
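The comparison reduces to a unit conversion, sketched below; the conversion constant is the international definition of the pound:

```python
KG_PER_POUND = 0.45359237  # one international pound, expressed in kilograms

feathers_kg = 1.0              # a kilo of feathers
steel_kg = 1 * KG_PER_POUND    # a pound of steel, converted to kilograms

print(feathers_kg, steel_kg)   # 1.0 vs ~0.454
print(feathers_kg > steel_kg)  # True: 1 kg is about 2.2 lb, so the feathers are heavier
```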

Read more: beebom.com