AI Research · 78% · 1 min read · Jul 3, 2026, 11:16 AM

GPT and Claude failed Bridgewater's finance tests because the right answers were never public

30-second summary

GPT and Claude failed Bridgewater's finance tests because the correct answers were never publicly available. A fine-tuned open-weight model outperformed them at a lower cost.

Key takeaways

GPT and Claude failed Bridgewater's finance tests because the correct answers were never publicly available
A fine-tuned open-weight model outperformed more powerful AI models at a lower cost
Customized models can be more effective in specific domains like finance

Full story

Bridgewater, a hedge fund, and Thinking Machines Lab evaluated how well various AI models, including GPT and Claude, assess financial documents. These models failed the tests because the correct answers were not publicly available.

The study highlights the limitations of current AI models in handling specialized domains like finance, where the correct answers may not be publicly known.

The analysis also found that a fine-tuned open-weight model outperformed the more powerful AI models at a fraction of the cost. This suggests that customized models can be more effective in specific domains.

The findings have implications for the development and application of AI models in finance and other specialized fields.

Source: GPT and Claude failed Bridgewater's finance tests because the right answers were never public.

Why this matters

Developers

Highlights the need for customized models in specialized domains

Businesses

Impacts the development and application of AI models in finance

Investors

Informs investment decisions in AI and finance

Sources · 1