The Scientific Models

The next AI breakthrough won’t come from bigger models, but from better data

Just as with LLMs, success in other frontiers of AI will require access to large volumes of high-quality data. That will ...

Nature

Vision language models excel at perception but struggle with scientific reasoning

A benchmark — MaCBench — is developed for evaluating the scientific knowledge of vision language models (VLMs). Evaluation of leading VLMs reveals that they excel at basic scientific tasks such as ...

Nature

SciCUEval: A Comprehensive Dataset for Evaluating Scientific Context Understanding in Large Language Models

Table 1 Comparison of SciCUEval with existing benchmark datasets. By cohesively unifying breadth of domain coverage, diversity of data modalities, and depth of reasoning evaluation, SciCUEval offers a ...

The Conversation

5 forecasts early climate models got right – the evidence is all around you

Climate models are complex, just like the world they mirror. They simultaneously simulate the interacting, chaotic flow of Earth’s atmosphere and oceans, and they run on the world’s largest ...
