In a surprising turn of events, four competitors have emerged for OpenAI’s previously unchallenged GPT-4. These competitors include Gemini 1.5 by Google, Mistral Large by Mistral AI, Claude 3 by Anthropic, and Inflection-2.5 by Inflection AI. Each of these new models, which have surfaced over the past month, claims to rival the capabilities of GPT-4. Among them, Claude 3 stands out as the most promising contender according to researchers.
Bito, the developer of the Bito AI code-writing tool, compared Gemini 1.5 Pro with GPT-4 Turbo and found that Gemini 1.5 Pro outperformed GPT-4 Turbo in general inference and comprehension tasks, video understanding, and audio processing. On the other hand, GPT-4 Turbo excelled in solving complex mathematical problems, code generation, and image understanding compared to Gemini 1.5 Pro. The optimal choice between them depends on the specific requirements of the current task.
Following the release of Mistral Large by Mistral AI, discussions ignited on the OpenAI forum. Some researchers pointed out that Mistral Large falls short of GPT-4 in almost all benchmark tests, but its price is only 80% of GPT-4’s cost, raising questions about whether it’s worth considering a switch. Responses were divided, with one camp emphasizing that GPT-4 Turbo surpasses Mistral Large in both inference and complex problem-solving, generating more accurate answers, making the extra 20% well worth it. The other camp praised Mistral AI’s open-source policy.
KDnuggets, a content website focused on data science, machine learning, and AI, believes that Anthropic’s Claude 3 outperforms GPT-4 and Gemini Ultra in all LLM benchmark tests, establishing itself as a new leader in the field of AI. The most notable improvement in Claude 3.0 is its visual capabilities, enabling it to handle various visual formats such as photos, charts, images, and technical drawings.
However, despite its lead in benchmark tests, Claude 3 still lags behind GPT-4 and Gemini Ultra in terms of speed.
Discussions comparing GPT-4 Turbo and Claude 3 Opus on Reddit mostly lean in favor of Claude 3 Opus, citing its superior writing and article processing abilities compared to GPT-4 Turbo. Some argue that GPT-4 Turbo outperforms Claude 3 Opus in handling complex problems, while others claim that the code quality generated by Claude 3 Opus is on par with GPT-4 Turbo but more user-friendly.
Overall, many see a promising future for Claude 3 Opus.
Inflection AI introduced the Inflection-2.5 model for use in the Pi chatbot, a chatbot that emphasizes empathy. Inflection-2.5’s benchmark tests are close to GPT-4’s level but do not surpass it. However, the computational power required for training Inflection-2.5 is only 40% of that needed for GPT-4. Currently, there have been no specific comparisons between Inflection-2.5 and GPT-4.
Each of the aforementioned AI startups has a strong background, with Mistral AI’s founders having previously worked as AI researchers at Google DeepMind and Meta. Anthropic was founded by Dario Amodei, former Vice President of Research at OpenAI, and his sister Daniela Amodei, also a senior employee at OpenAI. Mustafa Suleyman, co-founder, and CEO of Inflection AI, is also a co-founder of DeepMind and previously worked at Google integrating AI into various Google products.
The emergence of multiple models capable of rivaling GPT-4 within a short period has surprised the AI community, showcasing the rapid growth of the AI field. It is believed that GPT-4, the current powerhouse, will soon become the average benchmark for large language models.