New math benchmark reveals AI models confidently solve problems that have no solution
A new AI benchmark, SOOHAK, exposes the limitations of AI models in solving math problems, particularly in recognizing when a problem has no solution.

A consortium of 64 mathematicians has developed SOOHAK, a new benchmark that tests how well AI models solve mathematical problems. It comprises 439 handwritten tasks, of which 99 are deliberately unsolvable, leaving 340 that do have a solution. The design assesses not only whether a model can solve problems but also whether it can recognize when a problem has no solution.
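To make the two measurements concrete, here is a minimal Python sketch of how a benchmark that mixes solvable and unsolvable tasks could be scored on two separate axes. It is an illustration only, not SOOHAK's actual scoring method: the `Result` record, its fields, and the `score` function are hypothetical names, and the toy data is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One model response to one task (hypothetical record layout)."""
    solvable: bool         # does the task actually have a solution?
    refused: bool          # did the model state that no solution exists?
    correct: bool = False  # for solvable tasks: was the submitted answer right?


def score(results):
    """Return (solve_accuracy, unsolvable_detection_rate) as fractions."""
    solvable = [r for r in results if r.solvable]
    unsolvable = [r for r in results if not r.solvable]

    # A solvable task counts only if the model answered it and was right.
    solved = sum(1 for r in solvable if r.correct and not r.refused)
    # An unsolvable task counts only if the model said there is no solution.
    flagged = sum(1 for r in unsolvable if r.refused)

    solve_accuracy = solved / len(solvable) if solvable else 0.0
    detection_rate = flagged / len(unsolvable) if unsolvable else 0.0
    return solve_accuracy, detection_rate


# Toy data, invented for illustration: 3 solvable tasks, 2 unsolvable ones.
toy_results = [
    Result(solvable=True, refused=False, correct=True),
    Result(solvable=True, refused=False, correct=True),
    Result(solvable=True, refused=False, correct=False),
    Result(solvable=False, refused=True),   # correctly flagged as unsolvable
    Result(solvable=False, refused=False),  # confidently "solved" anyway
]

solve_accuracy, detection_rate = score(toy_results)
print(f"solve accuracy: {solve_accuracy:.2f}, detection rate: {detection_rate:.2f}")
```

Keeping the two rates apart is the point of the sketch: a model can raise the first number without moving the second, and a single blended score would hide that.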
The results are revealing. Google's Gemini 3 Pro model leads the pack on research-level problems, achieving a score of 30 percent. However, even the top-performing models struggle to identify problems with no solution, with none reaching the 50 percent mark on that measure. This disparity highlights a significant gap in current AI systems, which often excel at producing flashy results but lack the broad research skills needed for more complex tasks.
The SOOHAK benchmark also investigated how computational power affects model performance. The findings suggest that more compute improves models' ability to solve problems but does not improve their capacity to admit when a problem has no answer. This limitation underscores the need for evaluation methods that go beyond problem-solving ability alone.
The development of SOOHAK is a significant step toward understanding the limits of AI models in mathematical reasoning. By pinpointing the gap between solving problems and recognizing unsolvable ones, the researchers hope to drive innovation in AI development. Ultimately, the work aims to produce more robust and reliable AI systems capable of tackling complex mathematical challenges.
The implications extend beyond mathematics. As AI systems increasingly permeate research and industry, it is crucial that they can handle complex problems accurately, including recognizing when no answer exists. By exposing the limitations of current AI models, SOOHAK paves the way for future advances.
Source: The Decoder