Anthropic's Claude 3 Opus large language model (LLM) has overtaken OpenAI's GPT-4 on the Chatbot Arena leaderboard for the first time.
“The king is dead,” software developer Nick Dobos wrote on X (Twitter) in a post comparing GPT-4 Turbo and Claude 3 Opus.
The king is dead
RIP GPT-4
Claude Opus #1 Elo
Haiku beats GPT-4 0613 & Mistral large
That’s insane for how cheap & fast it is https://t.co/XWmvTE6h75
Nick Dobos (@NickADobos), March 26, 2024
Chatbot Arena is an open, crowdsourced platform for evaluating large language models. Its ratings are compiled by applying the Elo rating system to a large number of human judgments of the models' output. In each test, a user enters a query and picks the best of the answers returned by different models. The leaderboard is then built and ranked from thousands of such user votes.
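To give a feel for how votes turn into a ranking, here is a minimal sketch of the textbook Elo update. It assumes each vote is a head-to-head comparison between two models, a K-factor of 32, and a starting rating of 1000; the model names and votes are made up for illustration, and the Arena's actual computation may differ in its details.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new ratings after one vote.

    score_a is 1.0 if A's answer was chosen, 0.0 if B's was, 0.5 for a tie.
    k (assumed to be 32 here) controls how far a single vote moves the ratings.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical models and votes, for illustration only.
ratings = {"model_x": 1000.0, "model_y": 1000.0}
votes = [
    ("model_x", "model_y", 1.0),  # user preferred model_x
    ("model_x", "model_y", 1.0),  # user preferred model_x again
    ("model_x", "model_y", 0.0),  # user preferred model_y
]

for a, b, score_a in votes:
    ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], score_a)

# Print the models from highest to lowest rating.
for name, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{name}: {rating:.1f}")
```

Each vote moves points from the loser to the winner, and an upset against a higher-rated model moves more points than an expected win, which is why a newcomer can climb quickly once it starts winning comparisons.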
The Chatbot Arena leaderboard launched on May 3, 2023, and GPT-4 was added to the rankings on May 10. Since then, various versions of GPT-4 have consistently held the top spot, so the emergence of a new leader is drawing attention. One of Anthropic's smaller models, Haiku, has also attracted notice for its performance on the leaderboard.
“For the first time, the best models available—Opus for complex tasks, Haiku for economy and efficiency—are available from a vendor that is not OpenAI,” said independent AI researcher Simon Willison. “It’s reassuring that we all benefit from the diversity of the industry’s leading suppliers. But GPT-4 has now been around for over a year, and it took this year for someone to catch up.”
Claude 3 Opus and two versions of GPT-4 are followed in the ranking by Google's Bard (Gemini Pro) model. However, while the gap in Elo points among the top three positions is negligible (2-3 points), Bard already trails third place by 45 points. All other competitors scored fewer than 1200 points.
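For a rough sense of what these gaps mean, the standard Elo formula converts a rating difference into an expected preference rate. The sketch below assumes the usual 400-point scale and ignores ties; `win_probability` is a hypothetical helper name, not something taken from the leaderboard itself.

```python
def win_probability(elo_gap: float) -> float:
    """Expected share of head-to-head votes won by the higher-rated model.

    Assumes the standard Elo logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** (-elo_gap / 400.0))


# Gaps mentioned in the article: about 3 points within the top three,
# and 45 points between third place and Bard.
for gap in (3, 45):
    print(f"{gap}-point lead: preferred in about {win_probability(gap):.1%} of matchups")
```

Under these assumptions a 3-point lead is close to a coin flip (about 50.4%), while a 45-point lead corresponds to being preferred in roughly 56% of comparisons.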
Source: Ars Technica