CapitalBench

The benchmark for AI capital allocation

We give frontier AI models the same market brief, freeze their portfolios, and score them against real market returns.

Study how AI models behave in capital-allocation rounds, and which ones actually perform.

Read the CapitalBench Manifesto
Benchmark results

Model Performance

Current benchmarks rank models only within equal-run comparison sets. Monthly and weekly benchmarks are ranked separately.

All comparison sets

Current Monthly Benchmark

Shared resolved rounds

CapitalBench Score

Max possible = the best eligible asset in each included round. Every ranked model has the same included rounds.


Benchmark score (higher is better)

Claude Opus 4.8 (Anthropic), 3/3 scored rounds: 2.9
Grok 4.3 (xAI), 3/3 scored rounds: 1.0
Claude Opus 4.7 (Anthropic), 3/3 scored rounds: -5.9
GPT-5.5 (OpenAI), 3/3 scored rounds: -8.9
Gemini 3.1 Pro (Google), 3/3 scored rounds: -21.9
S&P 500 (benchmark), 3/3 scored rounds: -14.6
Max possible (hindsight best-performing eligible asset in each round, not a model portfolio): 100.0
3 shared resolved rounds · 5 equal-run models ranked · Qualified at 3+ shared rounds · Newest included round: CB-2026-06-01-1M
Return context

Average Return Details

Average portfolio return across the same finished rounds.

Return leader: Claude Opus 4.8, 0.45%

Claude Opus 4.8 (Anthropic): 0.45%
Grok 4.3 (xAI): 0.15%
Claude Opus 4.7 (Anthropic): -0.92%
GPT-5.5 (OpenAI): -1.37%
Gemini 3.1 Pro (Google): -3.38%
S&P 500 (benchmark): -2.25%
Max possible: 15.41%
Leader audit: Claude Opus 4.8 scores 2.9 = 1.36% total return / 46.23% oracle return × 100.
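The leader audit arithmetic can be reproduced in a few lines. This is a minimal sketch that assumes the score is simply the model's total return divided by the hindsight ("oracle") total return over the same rounds, times 100, as the audit line states. The function name `capitalbench_score` is hypothetical, and how the site aggregates per-round returns into a total is not stated on this page.

```python
def capitalbench_score(model_total_return_pct, oracle_total_return_pct):
    """Model total return as a percentage of the hindsight-best total return.

    Hypothetical helper: it mirrors the published leader audit line,
    not necessarily the site's actual implementation.
    """
    if oracle_total_return_pct <= 0:
        raise ValueError("oracle return must be positive for the ratio to be meaningful")
    return model_total_return_pct / oracle_total_return_pct * 100


# Figures taken from the leader audit line for Claude Opus 4.8.
leader_score = capitalbench_score(1.36, 46.23)
print(round(leader_score, 1))  # 2.9
```

Under this definition a portfolio that matched the best eligible asset in every round would score 100, and a portfolio with a negative total return scores below zero.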
Rounds included: CB-2026-05-28-1M, CB-2026-05-29-1M, CB-2026-06-01-1M.
Fairness rule: every ranked model completed every included round. A missed round is excluded from this set for everyone.
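One way to picture the fairness rule is as a set intersection: only rounds that every ranked model completed stay in the comparison set. The sketch below is an illustration under that assumption. The model names and the extra round ID are placeholders, and `shared_rounds` is a hypothetical helper rather than the site's code.

```python
MIN_SHARED_ROUNDS = 3  # the page states "Qualified at 3+ shared rounds"


def shared_rounds(completed_by_model):
    """Return the rounds completed by every model, sorted by ID."""
    round_sets = [set(rounds) for rounds in completed_by_model.values()]
    if not round_sets:
        return []
    return sorted(set.intersection(*round_sets))


# Placeholder completion records. "EXAMPLE-EXTRA-ROUND" is hypothetical
# and stands for a round that Model B missed.
completed_by_model = {
    "Model A": ["CB-2026-05-28-1M", "CB-2026-05-29-1M",
                "CB-2026-06-01-1M", "EXAMPLE-EXTRA-ROUND"],
    "Model B": ["CB-2026-05-28-1M", "CB-2026-05-29-1M",
                "CB-2026-06-01-1M"],
}

included = shared_rounds(completed_by_model)
qualifies = len(included) >= MIN_SHARED_ROUNDS
print(included, qualifies)
```

Because Model B missed the extra round, that round drops out for both models, which keeps every ranked score on the same denominator.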
Methodology: same report, same choices, real prices
Full methodology
  1. Same report

    Every model reads the same market report.

  2. Same choices

    Every model chooses from the same 70 assets.

  3. First portfolio locks

    Each model's saved portfolio is frozen before results are known.

  4. Fixed wait window

    The frozen portfolio sits untouched for 7 days or 1 month.

  5. Prices score it

    Real ending prices decide which model did best.
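The last two steps can be sketched as a weighted return calculation over the wait window. This is an illustration only: it assumes simple price returns with no rebalancing, and the tickers and prices are hypothetical placeholders, not benchmark data. `portfolio_return` is a hypothetical helper name.

```python
def portfolio_return(weights, entry_prices, ending_prices):
    """Weighted simple return of a frozen portfolio over the wait window."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(
        weight * (ending_prices[ticker] / entry_prices[ticker] - 1.0)
        for ticker, weight in weights.items()
    )


# Hypothetical two-asset portfolio and prices, for illustration only.
weights = {"AAA": 0.6, "BBB": 0.4}
entry_prices = {"AAA": 100.0, "BBB": 50.0}
ending_prices = {"AAA": 110.0, "BBB": 45.0}

result = portfolio_return(weights, entry_prices, ending_prices)
print(f"{result:.2%}")  # 2.00%
```

Here a 10% gain on the 60% position and a 10% loss on the 40% position net out to a 2% portfolio return.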

AI positioning

What AI Models Are Allocating To Now

Live rounds only. This is the current market condition as expressed by frozen model portfolios, before final scores are known.

Historical risk trend
AI risk appetite (as of June 29, 2026): 77.8/100, Risk-seeking / Broad risk seeking
Consensus allocation (as of June 29, 2026): Semiconductors (SMH), 28.5% average live weight
Risk shift (as of June 29, 2026): +8.2 change vs Jun 26 portfolios
Model agreement (as of June 29, 2026): Mixed, 9.5 point dispersion
Model behavior patterns

See Each AI Model's Allocation Personality

CapitalBench tracks whether each model behaves like a risk-seeker, concentrator, defensive allocator, consensus follower, or distinctive outlier across official frozen portfolios.

225 saved portfolios · 113 resolved results · Peer overlap, concentration, turnover, and risk appetite
Current allocation signal

AI Risk Appetite

Latest monthly and weekly portfolios, equal-weighted by track. This measures current model positioning, not market returns or a trading recommendation.

Historical trend and methodology
Combined pulse: 77.8/100, Risk-seeking
Current regime: Broad risk seeking (CB-2026-06-29-1M and CB-2026-06-29-1W)
Monthly strategic: 76.3/100, Risk-seeking
Weekly tactical: 79.3/100, Risk-seeking
Change: +8.2 vs Jun 26 portfolios
Model agreement: Mixed, 9.5 point dispersion
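The combined figures are consistent with a plain average of the monthly and weekly tracks, which is one reading of "equal-weighted by track". The sketch below assumes that reading; `equal_weight_by_track` is a hypothetical helper. The inputs are the published track scores and the published average Semiconductors (SMH) weights of 25% (monthly) and 32% (weekly).

```python
def equal_weight_by_track(monthly_value, weekly_value):
    """Average two track-level values with equal weight per track."""
    return (monthly_value + weekly_value) / 2


# Published track-level risk appetite scores.
combined_pulse = equal_weight_by_track(76.3, 79.3)

# Published average SMH weights in the latest monthly and weekly portfolios.
smh_consensus = equal_weight_by_track(25.0, 32.0)

print(round(combined_pulse, 1))  # 77.8
print(round(smh_consensus, 1))   # 28.5
```

Both results match the combined pulse and the consensus allocation shown on the page.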
Largest current allocations
Semiconductors (SMH) 28.5% · Healthcare Sector (XLV) 16.0% · Biotechnology (XBI) 8.5% · Nasdaq 100 (QQQ) 8.0% · Industrials Sector (XLI) 7.5% · US Small-Cap Value (IWN) 5.0%
Regime mix
Growth and technology 49.0% · Broad and cyclical equity 29.0% · Defensive equity 19.0% · Rates and credit 3.0%
Live AI positioning

Semiconductors (SMH) is the largest live allocation.

Semiconductors (SMH) holds 18.9% of live allocations, while US Equity accounts for 60.8% of open portfolios.

Largest asset: Semiconductors (SMH), 18.9%
Lead category: US Equity, 60.8%
Live rounds: 22 (all open)
Portfolios: 112, with 35 assets held
Top allocations: assets with the largest live model allocation
25 smaller live allocations: 30.4%
Live tests

Live Portfolio Returns

Live rounds marked to the latest available close. These are not final scores.

Priced live rounds: 20 of 22
Latest close: Jun 29
Next final score: Jun 30

Claude Fable 5 (Anthropic, 2 open): portfolio +0.49%, S&P 500 +0.27%, portfolio minus S&P 500 +0.22%
Grok 4.3 (xAI, 20 open): portfolio +0.37%, S&P 500 -0.02%, portfolio minus S&P 500 +0.39%
Claude Opus 4.7 (Anthropic, 20 open): portfolio +0.19%, S&P 500 -0.02%, portfolio minus S&P 500 +0.21%
Claude Opus 4.8 (Anthropic, 20 open): portfolio +0.08%, S&P 500 -0.02%, portfolio minus S&P 500 +0.10%
Gemini 3.1 Pro (Google, 20 open): portfolio -0.24%, S&P 500 -0.02%, portfolio minus S&P 500 -0.22%
GPT-5.5 (OpenAI, 20 open): portfolio -0.35%, S&P 500 -0.02%, portfolio minus S&P 500 -0.33%
S&P 500 (20 open tests): return -0.02%, close Jun 29

Interim returns use live rounds only. Completed rounds move to official scored results.

Marked to market from saved entry prices. Official results wait for the scheduled ending close.
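The "portfolio minus S&P 500" column is a simple difference between each model's interim return and the benchmark's return over that model's open rounds. A short check against the published interim figures, with `excess_return` as a hypothetical helper name:

```python
def excess_return(portfolio_pct, benchmark_pct):
    """Interim portfolio return minus benchmark return, in percentage points."""
    return round(portfolio_pct - benchmark_pct, 2)


# (portfolio %, S&P 500 %) pairs as published for the live rounds.
live_returns = {
    "Claude Fable 5": (0.49, 0.27),
    "Grok 4.3": (0.37, -0.02),
    "Claude Opus 4.7": (0.19, -0.02),
    "Claude Opus 4.8": (0.08, -0.02),
    "Gemini 3.1 Pro": (-0.24, -0.02),
    "GPT-5.5": (-0.35, -0.02),
}

excess = {model: excess_return(p, b) for model, (p, b) in live_returns.items()}
for model, value in excess.items():
    print(f"{model}: {value:+.2f}")
```

Claude Fable 5 is compared against a different S&P 500 figure, presumably because it has 2 open rounds rather than 20, so its excess return is not directly comparable with the others.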
Latest official results

Finished Benchmark Results

Monthly and weekly results are listed separately, with the completed rounds in each track ordered from newest to oldest.

All benchmark results
Monthly official results

Monthly results, newest official score first. Monthly result 1 of 6.

Monthly result scored Jun 29

Frozen model portfolios scored after the one-month window. Live rounds stay out until final prices are available.

Claude Opus 4.8 (Anthropic): 1.37%
Claude Opus 4.7 (Anthropic): 0.01%
Grok 4.3 (xAI): -0.19%
GPT-5.5 (OpenAI): -1.00%
Gemini 3.1 Pro (Google): -4.90%
S&P 500 (benchmark): -1.79%
Max possible (XBI): 15.93%
Portfolio context

Shows each model's saved portfolio weights.

Model portfolios

Ranked in the same order as the results above.

1. Claude Opus 4.8 (Anthropic): SMH 30%, XLK 25%, EWT 15%, ITA 15%, SPY 15%
2. Claude Opus 4.7 (Anthropic): SMH 30%, IAU 25%, ITA 15%, MTUM 15%, BIL 15%
3. Grok 4.3 (xAI): QQQ 30%, XLK 25%, SMH 20%, AIQ 15%, MTUM 10%
4. GPT-5.5 (OpenAI): AIQ 30%, SMH 25%, CIBR 20%, EWY 15%, XLK 10%
5. Gemini 3.1 Pro (Google): SMH 30%, EWY 30%, AIQ 20%, TAN 20%
Reference points

Not model portfolios.

S&P 500 Benchmark

Benchmark return over the same scoring window

Max possible XBI

100% Biotechnology (XBI) hindsight ceiling

Official scored round

Monthly result scored Jun 29

Audit ID: CB-2026-06-01-1M

Scored: Jun 29 · Window: May 29 to Jun 29 · Models: 5 · Asset choices: 70 · Leader: Claude Opus 4.8 · Horizon: Monthly
Benchmark universe

What Models Allocate From

Models get the same report, choose from the same assets, and wait for weekly or monthly scoring.

Models 6
Asset choices 70
Round lengths 2
Live rounds 22
Historical model style

Historical Risk Style By Model

Allocation-weighted from every official frozen portfolio, including live and completed rounds. It does not use future returns and is separate from the current AI Risk Appetite signal.

225 saved portfolios
Model portfolios

Current Frozen Model Portfolios

These are the saved model portfolios for the newest monthly and weekly rounds. They are waiting for final prices.

Monthly model portfolios

CB-2026-06-29-1M

2026-06-29 to 2026-07-29

Waiting for result
Claude Opus 4.7 (Anthropic): Healthcare Sector (XLV) 25%, Industrials Sector (XLI) 20%, Semiconductors (SMH) 20%, Equal-Weight S&P 500 (RSP) 20%, Long-Term US Treasury Bonds (TLT) 15%
Claude Opus 4.8 (Anthropic): Semiconductors (SMH) 25%, Industrials Sector (XLI) 20%, Healthcare Sector (XLV) 20%, Financials Sector (XLF) 20%, S&P 500 (SPY) 15%
Gemini 3.1 Pro (Google): Semiconductors (SMH) 40%, Nasdaq 100 (QQQ) 30%, Communication Services Sector (XLC) 15%, Consumer Discretionary Sector (XLY) 15%
GPT-5.5 (OpenAI): Semiconductors (SMH) 40%, Biotechnology (XBI) 20%, Regional Banks (KRE) 15%, Healthcare Sector (XLV) 15%, US Small-Cap Value (IWN) 10%
Grok 4.3 (xAI): Healthcare Sector (XLV) 30%, Biotechnology (XBI) 25%, Industrials Sector (XLI) 20%, US Small-Cap Value (IWN) 15%, US Low Volatility Equities (SPLV) 10%

Shared top pick: Semiconductors (SMH). Averages across 5 frozen model portfolios.
Top 3: 55% · Spread: 7.6 assets
Semiconductors (SMH): 25%
Healthcare Sector (XLV): 18%
Industrials Sector (XLI): 12%
Biotechnology (XBI): 9%
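The monthly consensus figures can be reproduced from the five frozen portfolios listed above. The sketch averages each asset's weight across models, counting a missing asset as 0%. The "Spread" calculation is an assumption: the page does not define it, but the published 7.6 is consistent with an effective number of assets (the inverse of the sum of squared average weights), so that is what the code uses.

```python
from collections import defaultdict

# Monthly frozen portfolios for CB-2026-06-29-1M, weights in percent.
portfolios = {
    "Claude Opus 4.7": {"XLV": 25, "XLI": 20, "SMH": 20, "RSP": 20, "TLT": 15},
    "Claude Opus 4.8": {"SMH": 25, "XLI": 20, "XLV": 20, "XLF": 20, "SPY": 15},
    "Gemini 3.1 Pro": {"SMH": 40, "QQQ": 30, "XLC": 15, "XLY": 15},
    "GPT-5.5": {"SMH": 40, "XBI": 20, "KRE": 15, "XLV": 15, "IWN": 10},
    "Grok 4.3": {"XLV": 30, "XBI": 25, "XLI": 20, "IWN": 15, "SPLV": 10},
}


def average_weights(portfolios):
    """Average each asset's weight across models; unheld assets count as 0."""
    totals = defaultdict(float)
    for weights in portfolios.values():
        for ticker, pct in weights.items():
            totals[ticker] += pct
    return {ticker: total / len(portfolios) for ticker, total in totals.items()}


avg = average_weights(portfolios)
ranked = sorted(avg.items(), key=lambda item: item[1], reverse=True)
top3_share = sum(pct for _, pct in ranked[:3])

# Assumed definition of "Spread": effective number of assets (inverse HHI).
effective_assets = 1 / sum((pct / 100) ** 2 for pct in avg.values())

print(ranked[:4])                  # SMH 25.0, XLV 18.0, XLI 12.0, XBI 9.0
print(top3_share)                  # 55.0
print(round(effective_assets, 1))  # 7.6
```

Applying the same steps to the weekly portfolios gives 32% for Semiconductors (SMH), a top-3 share of 56%, and roughly 6.5 effective assets, which also matches the published weekly summary.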
Weekly model portfolios

CB-2026-06-29-1W

2026-06-29 to 2026-07-06

Waiting for result
Claude Opus 4.7 (Anthropic): Healthcare Sector (XLV) 30%, US Mid-Cap Stocks (IJH) 20%, Equal-Weight S&P 500 (RSP) 20%, Semiconductors (SMH) 15%, Long-Term US Treasury Bonds (TLT) 15%
Claude Opus 4.8 (Anthropic): Healthcare Sector (XLV) 30%, Biotechnology (XBI) 20%, Industrials Sector (XLI) 15%, Semiconductors (SMH) 15%, US Low Volatility Equities (SPLV) 20%
Gemini 3.1 Pro (Google): Semiconductors (SMH) 50%, Communication Services Sector (XLC) 25%, S&P 500 (SPY) 25%
GPT-5.5 (OpenAI): Semiconductors (SMH) 45%, Biotechnology (XBI) 20%, Regional Banks (KRE) 15%, Nasdaq 100 (QQQ) 10%, Healthcare Sector (XLV) 10%
Grok 4.3 (xAI): Nasdaq 100 (QQQ) 40%, Semiconductors (SMH) 35%, US Small-Cap Value (IWN) 25%

Shared top pick: Semiconductors (SMH). Averages across 5 frozen model portfolios.
Top 3: 56% · Spread: 6.5 assets
Semiconductors (SMH): 32%
Healthcare Sector (XLV): 14%
Nasdaq 100 (QQQ): 10%
Biotechnology (XBI): 8%
1 Same report

Every model gets the same market report.

2 Same choices

Every model allocates from the same asset list.

3 Frozen portfolios

Model portfolios are locked before results are known.

4 Real prices score

After 7 days or 1 month, real prices decide the result.

Results

Monthly And Weekly Are Separate

A 1-month round and a 7-day round are different contests. They get separate scores and separate overall results.

  • Monthly: 1-month round
  • Weekly: 7-day round
  • No mixing: scores stay separate
Monthly track

Monthly Results

6 completed / 17 live
Current benchmark leader: Claude Opus 4.8, score 2.9 · 3 shared rounds
Latest scored: CB-2026-06-01-1M · Live round: CB-2026-06-29-1M · Next score: after Jul 29 close
  1. Locked
  2. Live
  3. Scores
Weekly track

Weekly Results

17 completed / 5 live
Current benchmark leader: Claude Opus 4.8, score -12.5 · 15 shared rounds
Latest scored: CB-2026-06-22-1W · Live round: CB-2026-06-29-1W · Next score: after Jul 6 close
  1. Locked
  2. Live
  3. Scores
Live benchmark tests

Live Benchmark Tests

These are the open tests you can inspect now. Models already submitted portfolios; official scores wait for final closing prices.

Monthly test

One-month test

Live now

Longer test of AI allocation over one month.

Portfolios locked; scoring pending.

Model portfolios: 5 · Eligible assets: 70 · Risk-taking score: 76.3/100 · Top consensus: Semiconductors (SMH), 25% average weight
Weekly test

One-week test

Live now

Short-term test of AI positioning over one market week.

Portfolios locked; scoring pending.

Model portfolios: 5 · Eligible assets: 70 · Risk-taking score: 79.3/100 · Top consensus: Semiconductors (SMH), 32% average weight

Internal IDs and full reproducibility files are inside each audit packet.

Scoring calendar

Current Scoring Calendar

Models have already picked portfolios. Official scores publish only after the market window ends and final closing prices are available.

45 official rounds recorded. Internal round and run IDs stay in the public audit trail.
View audit trail
Audit packet

Check The Public Audit Trail

Round pages show the report, prompt, model portfolios, starting prices, source reports, hashes, and result status behind each public benchmark round.
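As an illustration of how a hash can make a frozen portfolio tamper-evident, the sketch below fingerprints a portfolio record with SHA-256 over a canonical JSON encoding. CapitalBench does not describe its hashing scheme on this page, so the record layout, the field names, and the `portfolio_hash` helper are all assumptions. The round ID and weights are taken from the scored monthly round.

```python
import hashlib
import json


def portfolio_hash(record):
    """SHA-256 hex digest of a record serialized as canonical JSON.

    Assumed scheme for illustration; sorting keys makes the digest
    independent of the order in which fields were written.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical record layout using the published Claude Opus 4.8 weights.
record = {
    "round_id": "CB-2026-06-01-1M",
    "model": "Claude Opus 4.8",
    "weights": {"SMH": 30, "XLK": 25, "EWT": 15, "ITA": 15, "SPY": 15},
}

digest = portfolio_hash(record)

# Changing any weight after the fact yields a different digest.
tampered = {**record, "weights": {**record["weights"], "SMH": 35, "SPY": 10}}
print(digest != portfolio_hash(tampered))  # True
```

Publishing such a digest before the round resolves would let a reader confirm later that the scored portfolio is the one that was saved.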

Why it is fair

Simple Rules, Public Audit Trail

CapitalBench keeps the comparison narrow: same report, same asset list, frozen portfolios, and no final result before the round ends.

Same rules

One frozen portfolio per model

The public score uses the saved portfolio, not private retries or experiments.

Same choices

70 current assets

Each round keeps the exact asset list, report, model output, starting prices, and audit hashes.

No early winner

22 live rounds waiting for results

Final results appear only after ending prices are available.