Alibaba Group Holding continues to push hard into the AI space. This week, the e-commerce giant released a family of large language models (LLMs) called Qwen2-Math, which are designed to solve complex mathematical problems and are claimed to outperform competing models from other companies.
In total, three large language models were presented, differing from one another in the number of parameters, which affects the accuracy of the model's answers. According to the developers, the largest of them, Qwen2-Math-72B-Instruct, outperforms many other AI models at solving mathematical problems, including GPT-4o from OpenAI, Claude 3.5 Sonnet from Anthropic, Gemini 1.5 Pro from Google, and Llama-3.1-405B from Meta Platforms.
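The article does not describe how to run the models, but a minimal sketch of querying an instruction-tuned Qwen model with the Hugging Face transformers library might look as follows. The repository id Qwen/Qwen2-Math-72B-Instruct is assumed from the model name in the article, and the example math question is illustrative; a 72-billion-parameter model also requires substantial multi-GPU hardware to load.

```python
# Sketch: load an instruction-tuned Qwen2-Math model and ask it a math question.
# The repo id below is assumed from the model name mentioned in the article.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-Math-72B-Instruct"  # assumed Hugging Face repository id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # half precision to reduce memory use
    device_map="auto",           # spread layers across available GPUs
)

# Build a chat-style prompt with an example math problem.
messages = [
    {"role": "system", "content": "You are a helpful math assistant."},
    {"role": "user", "content": "Solve for x: 3x + 7 = 22. Show your steps."},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Print only the newly generated tokens (the model's answer).
answer = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)
```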
“Over the past year, we’ve done a lot of work exploring and expanding the logical capabilities of large language models, with a particular focus on their ability to solve arithmetic and mathematical problems. We hope that Qwen2-Math will contribute to the community’s efforts to solve complex mathematical problems,” the developers said in a statement.
The Qwen2-Math language models were evaluated on various benchmarks, including GSM8K (a set of 8,500 diverse grade-school math word problems), OlympiadBench (a high-level bilingual multimodal scientific benchmark), and Gaokao (one of the most difficult university entrance math exams). The developers note that the new models are currently limited to English-only support; bilingual and multilingual LLMs are planned for the future.