No AI model, including ChatGPT, has yet passed the FrontierMath test
FrontierMath has proven a serious challenge for ChatGPT and Gemini.
It seems we are still a long way from the technological singularity. Researchers at the Epoch AI organization have presented a new mathematics benchmark, FrontierMath, that even the most advanced artificial intelligence models cannot yet cope with.
FrontierMath consists of exceptionally difficult mathematics problems. Claude 3.5 Sonnet, GPT-4o, o1-preview and Gemini 1.5 Pro each solve less than two percent of them, even though during testing the models have full access to a Python environment for calculations and debugging. By comparison, on older benchmarks such as GSM8K and MATH, models answer more than 90% of the problems correctly.
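Epoch AI's actual evaluation harness is not described here; as a rough illustration of what "full access to a Python environment" can mean in practice, the sketch below shows a minimal, hypothetical grading loop that executes model-written Python and counts a problem as solved only if the printed output matches the reference answer. All function names and details are assumptions, not the benchmark's real implementation.

```python
import subprocess
import sys
import tempfile

def run_model_code(code: str, timeout: int = 30) -> str:
    """Execute model-generated Python in a subprocess and capture stdout.

    A real harness would sandbox this far more strictly; the bare
    subprocess call here only illustrates the evaluation loop.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run(
        [sys.executable, path],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout.strip()

def grade(model_code: str, expected_answer: str) -> bool:
    """Score one problem: solved only if the script's final printed
    output exactly matches the reference answer."""
    try:
        return run_model_code(model_code) == expected_answer
    except (subprocess.TimeoutExpired, OSError):
        return False

# Toy example: a "problem" whose reference answer is Fibonacci(30) = 832040.
candidate = """
a, b = 0, 1
for _ in range(30):
    a, b = b, a + b
print(a)
"""
print(grade(candidate, "832040"))  # True
```

Grading against a single definite answer like this is what lets such a benchmark be scored automatically, without a human checking each proof.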
The main feature of FrontierMath is that its problems have never been published anywhere before, so neural networks could not have encountered them in their training data and memorized the solutions in advance.