The 2,500 questions that make up the exam are specifically designed to probe the outer limits of what today’s AI systems can do.
The preprint repository’s requirement could boost the use of large language models to translate scientific texts.