Reflection Math Problems

Mathematicians devised novel problems to challenge advanced AIs' reasoning skills — and they failed almost every test

Current AI models struggle to solve research-level math problems, with the most advanced AI systems we have today solving just 2% of the hundreds of challenges faced.

The New York Times

OpenAI Unveils New ChatGPT That Can Reason Through Math and Science

Driven by new technology called OpenAI o1, the chatbot can test various strategies and try to identify mistakes as it tackles complex tasks. By Cade Metz, reporting from San Francisco. Online chatbots ...

Ars Technica

New secret math benchmark stumps AI models and PhDs alike

On Friday, research organization Epoch AI released FrontierMath, a new mathematics benchmark that has been turning heads in the AI world because it contains hundreds of expert-level problems that ...

TechCrunch

DeepMind claims its newest AI tool is a whiz at math and science problems

Google’s AI R&D lab DeepMind says it has developed a new AI system to tackle problems with “machine-gradable” solutions. In experiments, the system, called AlphaEvolve, could help optimize some of the ...

Ars Technica

New study shows why simulated reasoning AI models don’t yet live up to their billing

There’s a curious contradiction at the heart of today’s most capable AI models that purport to “reason”: They can solve routine math problems with accuracy, yet when faced with formulating deeper ...

Reuters

Google clinches milestone gold at global math competition, while OpenAI also claims win

AI models solved math problems by processing them using natural language. AI could soon tackle unsolved research problems, says math professor and former champion. OpenAI self-published results before ...
