CapitalBench

The benchmark for AI capital allocation

We give frontier AI models the same market brief, freeze their portfolios, and score them against real market returns.

Study how AI models behave in capital-allocation rounds, and which ones actually perform.


Model behavior patterns: Distinct allocation personalities

6 models

GPT-5.5 (OpenAI): Aggressive upside hunter. Highest risk-taking; lowest turnover; technology tilt; international tilt.

Grok 4.3 (xAI): High-conviction concentrator. Technology tilt; often different from peers; risk 81.6/100; top holding 37.3%.

Gemini 3.1 Pro (Google): High-conviction concentrator. Most concentrated; technology tilt; binary results; often different from peers.

Claude Opus 4.8 (Anthropic): Balanced allocator. Balanced profile; risk 74.1/100; top holding 30.0%; tech tilt 32.5%.

Claude Opus 4.7 (Anthropic): Risk-managed allocator. Most consensus-aligned; binary results; often different from peers; risk 73.7/100.

Claude Fable 5 (Anthropic): Early sample. Most defensive ballast; most distinctive; real-asset tilt.

Benchmark results

Model Performance

Current benchmarks rank models only inside equal-run comparison sets.

Shared resolved rounds

CapitalBench Score

Max possible = best eligible asset in each included round. Every ranked model has the same included rounds.

Benchmark score (higher is better)

Claude Opus 4.8 (Anthropic), 3/3 scored rounds: 2.9
Grok 4.3 (xAI), 3/3 scored rounds: 1.0
Claude Opus 4.7 (Anthropic), 3/3 scored rounds: -5.9
GPT-5.5 (OpenAI), 3/3 scored rounds: -8.9
Gemini 3.1 Pro (Google), 3/3 scored rounds: -21.9
S&P 500 (baseline), 3/3 scored rounds: -14.6
Max possible (hindsight best-performing eligible asset in each round, not a model portfolio): 100.0

3 shared resolved rounds. 5 equal-run models ranked. Qualified at 3+ shared rounds. Newest included round: CB-2026-06-01-1M.

Return context

Average Return Details

Average portfolio return across the same finished rounds.

Return leader: Claude Opus 4.8, 0.45%

Claude Opus 4.8: 0.45%
Grok 4.3: 0.15%
Claude Opus 4.7: -0.92%
GPT-5.5: -1.37%
Gemini 3.1 Pro: -3.38%
S&P 500: -2.25%
Max possible: 15.41%

Leader audit: Claude Opus 4.8 scores 2.9 = 1.36% total return / 46.23% oracle return × 100.
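The leader audit above can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration, not the site's implementation: the function name `capitalbench_score` and the guard against a non-positive oracle return are assumptions added for this example. Only the quoted figures (1.36% and 46.23%) come from the page.

```python
def capitalbench_score(total_return_pct: float, oracle_return_pct: float) -> float:
    """Oracle-relative score: 100 means matching the hindsight best asset.

    Both inputs are percentages summed over the same included rounds.
    The guard below is an assumption of this sketch, not a documented rule.
    """
    if oracle_return_pct <= 0:
        raise ValueError("oracle return must be positive for the ratio to be meaningful")
    return total_return_pct / oracle_return_pct * 100


# Figures quoted in the leader audit for Claude Opus 4.8.
leader_score = capitalbench_score(1.36, 46.23)
print(round(leader_score, 1))  # 2.9

# Dividing the same totals by the 3 included rounds gives the averages
# listed under Average Return Details (0.45% and 15.41%).
print(round(1.36 / 3, 2), round(46.23 / 3, 2))  # 0.45 15.41
```

Because the denominator is the hindsight ceiling, even a portfolio that beats the S&P 500 can land on a small or negative score.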

Rounds included: CB-2026-05-28-1M, CB-2026-05-29-1M, CB-2026-06-01-1M.

Fairness rule: every ranked model completed every included round. A missed round is excluded from this set for everyone.
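One way to read the fairness rule is as a set intersection: a round counts only if every ranked model completed it. The sketch below assumes that reading. The function name `equal_run_set`, the model labels, and the `EXTRA-ROUND` ID are hypothetical, and the real roster-selection logic may be more involved. The 3-round threshold mirrors the "Qualified at 3+ shared rounds" note.

```python
def equal_run_set(completed: dict[str, set[str]], min_rounds: int = 3) -> list[str]:
    """Return the rounds every model completed, or [] if below the threshold."""
    if not completed:
        return []
    shared = set.intersection(*completed.values())
    return sorted(shared) if len(shared) >= min_rounds else []


# Hypothetical completion records: model_a finished one extra round
# that model_b missed, so that round is dropped for everyone.
completed = {
    "model_a": {"CB-2026-05-28-1M", "CB-2026-05-29-1M", "CB-2026-06-01-1M", "EXTRA-ROUND"},
    "model_b": {"CB-2026-05-28-1M", "CB-2026-05-29-1M", "CB-2026-06-01-1M"},
}

shared_rounds = equal_run_set(completed)
print(shared_rounds)
```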

Evidence context

How Much Evidence Is Behind These Scores?

Generated from completed rounds, benchmark-set rules, protocols, and available baselines so every score carries its own context.

Monthly benchmark: More established

Evidence level: More established. Monthly evidence has enough completed rounds for stronger pattern reads, while still needing ongoing live validation.

Monthly evidence: 6 resolved rounds / 27 model results. Current threshold met at 3+ rounds.

Equal-run comparison: 5 models on the same 3 rounds. Ranked models are compared only on rounds every model in the roster completed.

Protocol: Mixed protocol. Completed history includes 5 portfolio, 1 single-pick, and 0 unlabelled rounds.

Score scale: Oracle-relative. 100 means matching the hindsight best asset in the same scored window.

Baselines shown: S&P 500, Cash, Oracle, AI consensus portfolio. Practical references are shown beside the impossible hindsight ceiling when available.

Use this as benchmark evidence, not an investable strategy result. More resolved rounds are needed before making strong performance claims.

Weekly benchmark: More established

Evidence level: More established. Weekly evidence has enough completed rounds for stronger pattern reads, while still needing ongoing live validation.

Weekly evidence: 17 resolved rounds / 86 model results. Current threshold met at 6+ rounds.

Equal-run comparison: 5 models on the same 15 rounds. Ranked models are compared only on rounds every model in the roster completed.

Protocol: Portfolio-only. Completed rounds use constrained multi-asset portfolios.

Score scale: Oracle-relative. 100 means matching the hindsight best asset in the same scored window.

Baselines shown: S&P 500, Cash, Oracle, AI consensus portfolio. Practical references are shown beside the impossible hindsight ceiling when available.

Use this as benchmark evidence, not an investable strategy result. More resolved rounds are needed before making strong performance claims.

Methodology: Same report, same choices, real prices

Step 1: Same report
Every model reads the same market report.
Step 2: Same choices
Every model chooses from the same 70 assets.
Step 3: First portfolio locks
Each model's saved portfolio is frozen before results are known.
Step 4: Fixed wait window
The frozen portfolio sits untouched for 7 days or 1 month.
Step 5: Prices score it
Real ending prices decide which model did best.
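The last two steps can be sketched as a buy-and-hold calculation: weights are fixed at the entry prices and the portfolio return is the weighted sum of each asset's price change over the window. This is an illustration under stated assumptions, not the published scoring code. It ignores dividends and fees, and the tickers `AAA` and `BBB` and all prices are hypothetical.

```python
def frozen_portfolio_return(
    weights: dict[str, float],
    entry_prices: dict[str, float],
    ending_prices: dict[str, float],
) -> float:
    """Return of a portfolio frozen at entry and left untouched until the end.

    Weights are fractions that must sum to 1. Prices are per-share closes.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(
        weight * (ending_prices[ticker] / entry_prices[ticker] - 1.0)
        for ticker, weight in weights.items()
    )


# Hypothetical two-asset portfolio: AAA gains 10%, BBB loses 10%.
weights = {"AAA": 0.6, "BBB": 0.4}
entry_prices = {"AAA": 100.0, "BBB": 50.0}
ending_prices = {"AAA": 110.0, "BBB": 45.0}

window_return = frozen_portfolio_return(weights, entry_prices, ending_prices)
print(f"{window_return:.2%}")  # 2.00%
```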


AI positioning

What AI Models Are Allocating To Now

Live rounds only. This is the current market condition as expressed by frozen model portfolios, before final scores are known.

AI risk appetite (as of June 29, 2026): 77.8/100, risk-seeking / broad risk seeking

Consensus allocation (as of June 29, 2026): Semiconductors (SMH), 28.5% average live weight

Risk shift (as of June 29, 2026): +8.2 change vs Jun 26 portfolios

Model agreement (as of June 29, 2026): Mixed, 9.5 point dispersion

Benchmark insights

What The Latest AI Decisions Suggest

A compact readout from the insight engine, focused on current positioning, live marks, model agreement, and latest official results.

Current Positioning (as of Jun 29)

Latest live portfolios: 2 live rounds, 10 models

Live AI portfolios are concentrated in Semiconductors (SMH)

Across the newest live weekly and monthly portfolios, Semiconductors (SMH) is the largest aggregate allocation at +28.50%.

Aggregate allocation averages the newest live model portfolios before final scores are known.

High confidence. Math: deterministic. Data through Jun 29, 2026.

Aggregate Live Allocation: +28.5%

Risk Regime (as of Jun 29)

Latest live portfolios: 2 live rounds, 10 models

Live AI risk posture is risk-seeking

The newest live portfolios have a deterministic risk-taking score of 77.8 out of 100.

Risk-taking score is allocation-based, not performance-based: higher means more weight in growth, momentum, cyclical, and higher-risk assets.

High confidence. Math: deterministic. Data through Jun 29, 2026.

Live Risk Taking Score: 77.8/100

Horizon Agreement (as of Jun 29)

Latest live portfolios: 2 live rounds, 10 models

Weekly and monthly AI portfolios both favor growth and technology

The newest weekly portfolios allocate +55.00% to growth and technology, while the newest monthly portfolios allocate +43.00%.

Horizon agreement compares the newest weekly and monthly live portfolios to see whether short- and longer-window model stances line up.

High confidence. Math: deterministic. Data through Jun 29, 2026.

Weekly Top Regime Allocation: +55.0%
Monthly Top Regime Allocation: +43.0%

Current allocation signal

AI Risk Appetite

Latest monthly and weekly portfolios, equal-weighted by track. This measures current model positioning, not market returns or a trading recommendation.

Combined pulse: 77.8/100, risk-seeking

Current regime: Broad risk seeking (CB-2026-06-29-1M and CB-2026-06-29-1W)

Monthly strategic: 76.3/100, risk-seeking

Weekly tactical: 79.3/100, risk-seeking

Change: +8.2 vs Jun 26 portfolios

Model agreement: Mixed, 9.5 point dispersion
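The combined pulse is described as equal-weighted by track, which reads as a plain average of the monthly and weekly scores. The sketch below checks that reading against the displayed figures. The function name `equal_weighted_by_track` is an assumption for this example, and the per-track risk scores themselves are taken as given because the page does not state how they are derived.

```python
def equal_weighted_by_track(track_values: list[float]) -> float:
    """Average one value per track, giving each track the same weight."""
    if not track_values:
        raise ValueError("need at least one track value")
    return sum(track_values) / len(track_values)


# Risk-taking scores shown for the monthly and weekly tracks.
combined_pulse = equal_weighted_by_track([76.3, 79.3])
print(round(combined_pulse, 1))  # 77.8

# The same averaging applied to the growth-and-technology weights
# (43.0% monthly, 55.0% weekly) gives the 49.0% shown in the regime mix.
growth_and_technology = equal_weighted_by_track([43.0, 55.0])
print(round(growth_and_technology, 1))  # 49.0
```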

Largest current allocations

Semiconductors (SMH): 28.5%
Healthcare Sector (XLV): 16.0%
Biotechnology (XBI): 8.5%
Nasdaq 100 (QQQ): 8.0%
Industrials Sector (XLI): 7.5%
US Small-Cap Value (IWN): 5.0%

Regime mix

Growth and technology: 49.0%
Broad and cyclical equity: 29.0%
Defensive equity: 19.0%
Rates and credit: 3.0%

Live AI positioning

Semiconductors (SMH) is the largest live allocation.

Semiconductors (SMH) holds 18.9% of live allocations, while US Equity accounts for 60.8% of open portfolios.

Largest asset: Semiconductors (SMH), 18.9%

Lead category: US Equity, 60.8%

Live rounds: 22, all open

Portfolios: 112, with 35 assets held

Top allocations (assets with the largest live model allocation): 35 assets, of which the 25 smaller live allocations total 30.4%.

Live tests

Live Portfolio Returns

Live rounds marked to the latest available close. These are not final scores.

Priced live rounds: 20 of 22

Latest close: Jun 29

Next final score: Jun 30

Claude Fable 5 (Anthropic, 2 open): portfolio +0.49%, S&P 500 +0.27%, portfolio minus S&P 500 +0.22%
Grok 4.3 (xAI, 20 open): portfolio +0.37%, S&P 500 -0.02%, portfolio minus S&P 500 +0.39%
Claude Opus 4.7 (Anthropic, 20 open): portfolio +0.19%, S&P 500 -0.02%, portfolio minus S&P 500 +0.21%
Claude Opus 4.8 (Anthropic, 20 open): portfolio +0.08%, S&P 500 -0.02%, portfolio minus S&P 500 +0.10%
Gemini 3.1 Pro (Google, 20 open): portfolio -0.24%, S&P 500 -0.02%, portfolio minus S&P 500 -0.22%
GPT-5.5 (OpenAI, 20 open): portfolio -0.35%, S&P 500 -0.02%, portfolio minus S&P 500 -0.33%
S&P 500 (20 open tests): return -0.02%, close Jun 29

Interim returns use live rounds only. Completed rounds move to official scored results.

Marked to market from saved entry prices. Official results wait for the scheduled ending close.
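The "portfolio minus S&P 500" column is a simple difference between each model's interim return and the S&P 500 return over that model's own open rounds. The sketch below recomputes it from the figures shown above and ranks models by that difference. The variable names and the ranking step are assumptions for illustration, not part of the site's output.

```python
# (portfolio return %, S&P 500 return % over the same open rounds), as displayed.
live_returns = {
    "Claude Fable 5": (0.49, 0.27),
    "Grok 4.3": (0.37, -0.02),
    "Claude Opus 4.7": (0.19, -0.02),
    "Claude Opus 4.8": (0.08, -0.02),
    "Gemini 3.1 Pro": (-0.24, -0.02),
    "GPT-5.5": (-0.35, -0.02),
}

# Excess return in percentage points, rounded to the page's two decimals.
excess = {
    model: round(portfolio - benchmark, 2)
    for model, (portfolio, benchmark) in live_returns.items()
}

# Rank by excess return rather than raw return.
ranked_by_excess = sorted(excess, key=excess.get, reverse=True)

for model in ranked_by_excess:
    print(f"{model}: {excess[model]:+.2f}")
```

Claude Fable 5 has the highest raw interim return, but its S&P 500 comparison covers only 2 open rounds, so on an excess-return basis Grok 4.3 ranks first.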

Latest official results

Finished Benchmark Results

Monthly official results, newest official score first

Monthly result 1 of 6, scored Jun 29

Frozen model portfolios scored after the one-month window. Live rounds stay out until final prices are available.

Claude Opus 4.8 (Anthropic): 1.37%
Claude Opus 4.7 (Anthropic): 0.01%
Grok 4.3 (xAI): -0.19%
GPT-5.5 (OpenAI): -1.00%
Gemini 3.1 Pro (Google): -4.90%
S&P 500 (benchmark): -1.79%
Max possible (XBI): 15.93%

Portfolio context

Shows each model's saved portfolio weights.

Model portfolios

Ranked in the same order as the chart.

Claude Opus 4.8 (Anthropic): SMH 30%, XLK 25%, EWT 15%, ITA 15%, SPY 15%
Claude Opus 4.7 (Anthropic): SMH 30%, IAU 25%, ITA 15%, MTUM 15%, BIL 15%
Grok 4.3 (xAI): QQQ 30%, XLK 25%, SMH 20%, AIQ 15%, MTUM 10%
GPT-5.5 (OpenAI): AIQ 30%, SMH 25%, CIBR 20%, EWY 15%, XLK 10%
Gemini 3.1 Pro (Google): SMH 30%, EWY 30%, AIQ 20%, TAN 20%

Reference points

Not model portfolios.

S&P 500 Benchmark

Benchmark return over the same scoring window

Max possible XBI

100% Biotechnology (XBI) hindsight ceiling

Official scored round

Audit ID: CB-2026-06-01-1M

Scored: Jun 29. Window: May 29 to Jun 29. Models: 5. Asset choices: 70. Leader: Claude Opus 4.8. Horizon: Monthly.

Benchmark universe

What Models Allocate From

Models get the same report, choose from the same assets, and wait for weekly or monthly scoring.

Models 6

Asset choices 70

Round lengths 2

Live rounds 22

Models in the benchmark 6 AI models


Claude Opus 4.7 Anthropic

Claude Opus 4.8 Anthropic

Claude Fable 5 Anthropic

Gemini 3.1 Pro Google

GPT-5.5 OpenAI

Grok 4.3 xAI

Live rounds waiting for results: latest monthly and weekly rounds

22 live total

70 asset choices. Weekly: 7 days. Monthly: 1 month.

Monthly CB-2026-06-29-1M: results after Jul 29 close
Weekly CB-2026-06-29-1W: results after Jul 6 close

Historical model style

Historical Risk Style By Model

Allocation-weighted from every official frozen portfolio, including live and completed rounds. It does not use future returns and is separate from the current AI Risk Appetite signal.

225 saved portfolios

Scores are on a 1-5 scale; the focused chart range is 3.70-4.74.

Claude Fable 5 (Anthropic): 3.84 / 5, Growth. 64.0% 20.0% 21.0%
Claude Opus 4.8 (Anthropic): 3.89 / 5, Growth. 62.3% 32.5% 12.8%
Claude Opus 4.7 (Anthropic): 4.00 / 5, Growth. 67.2% 36.3% 18.7%
Gemini 3.1 Pro (Google): 4.07 / 5, Growth. 71.0% 47.8% 9.4%
Grok 4.3 (xAI): 4.17 / 5, Aggressive. 83.1% 45.0% 4.3%
GPT-5.5 (OpenAI): 4.61 / 5, Aggressive. 92.9% 48.6% 1.2%

Model portfolios

Current Frozen Model Portfolios

These are the saved model portfolios for the newest monthly and weekly rounds. They are waiting for final prices.

Monthly model portfolios

CB-2026-06-29-1M

2026-06-29 to 2026-07-29

Waiting for result

Anthropic Claude Opus 4.7

Healthcare Sector (XLV) 25% Industrials Sector (XLI) 20% Semiconductors (SMH) 20% Equal-Weight S&P 500 (RSP) 20% Long-Term US Treasury Bonds (TLT) 15%

Anthropic Claude Opus 4.8

Semiconductors (SMH) 25% Industrials Sector (XLI) 20% Healthcare Sector (XLV) 20% Financials Sector (XLF) 20% S&P 500 (SPY) 15%

Google Gemini 3.1 Pro

Semiconductors (SMH) 40% Nasdaq 100 (QQQ) 30% Communication Services Sector (XLC) 15% Consumer Discretionary Sector (XLY) 15%

OpenAI GPT-5.5

Semiconductors (SMH) 40% Biotechnology (XBI) 20% Regional Banks (KRE) 15% Healthcare Sector (XLV) 15% US Small-Cap Value (IWN) 10%

xAI Grok 4.3

Healthcare Sector (XLV) 30% Biotechnology (XBI) 25% Industrials Sector (XLI) 20% US Small-Cap Value (IWN) 15% US Low Volatility Equities (SPLV) 10%

Shared top pick: Semiconductors (SMH). Averages across 5 frozen model portfolios.

Top 3: 55%. Spread: 7.6 assets.

Semiconductors (SMH) 25%

Healthcare Sector (XLV) 18%

Industrials Sector (XLI) 12%

Biotechnology (XBI) 9%
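The consensus figures for CB-2026-06-29-1M can be recomputed from the five monthly portfolios listed above by averaging each asset's weight across models. The sketch below does that with tickers only. The page does not define "Spread", so the effective-asset-count formula (one over the sum of squared average weights) is an assumption of this example, though it does reproduce the displayed 7.6.

```python
# Monthly frozen portfolios for CB-2026-06-29-1M, weights in percent.
portfolios = {
    "Claude Opus 4.7": {"XLV": 25, "XLI": 20, "SMH": 20, "RSP": 20, "TLT": 15},
    "Claude Opus 4.8": {"SMH": 25, "XLI": 20, "XLV": 20, "XLF": 20, "SPY": 15},
    "Gemini 3.1 Pro": {"SMH": 40, "QQQ": 30, "XLC": 15, "XLY": 15},
    "GPT-5.5": {"SMH": 40, "XBI": 20, "KRE": 15, "XLV": 15, "IWN": 10},
    "Grok 4.3": {"XLV": 30, "XBI": 25, "XLI": 20, "IWN": 15, "SPLV": 10},
}


def average_weights(portfolios: dict[str, dict[str, float]]) -> dict[str, float]:
    """Average each asset's weight across models; an unheld asset counts as 0."""
    totals: dict[str, float] = {}
    for weights in portfolios.values():
        for ticker, weight in weights.items():
            totals[ticker] = totals.get(ticker, 0.0) + weight
    return {ticker: total / len(portfolios) for ticker, total in totals.items()}


consensus = average_weights(portfolios)
ranked = sorted(consensus.items(), key=lambda item: item[1], reverse=True)

top3_share = sum(weight for _, weight in ranked[:3])

# Assumed definition of "Spread": inverse sum of squared weight fractions.
effective_assets = 1.0 / sum((weight / 100.0) ** 2 for weight in consensus.values())

print(ranked[:4])                  # SMH 25, XLV 18, XLI 12, XBI 9
print(top3_share)                  # 55.0
print(round(effective_assets, 1))  # 7.6
```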


Weekly model portfolios

CB-2026-06-29-1W

2026-06-29 to 2026-07-06

Waiting for result

Anthropic Claude Opus 4.7

Healthcare Sector (XLV) 30% US Mid-Cap Stocks (IJH) 20% Equal-Weight S&P 500 (RSP) 20% Semiconductors (SMH) 15% Long-Term US Treasury Bonds (TLT) 15%

Anthropic Claude Opus 4.8

Healthcare Sector (XLV) 30% Biotechnology (XBI) 20% Industrials Sector (XLI) 15% Semiconductors (SMH) 15% US Low Volatility Equities (SPLV) 20%

Google Gemini 3.1 Pro

Semiconductors (SMH) 50% Communication Services Sector (XLC) 25% S&P 500 (SPY) 25%

OpenAI GPT-5.5

Semiconductors (SMH) 45% Biotechnology (XBI) 20% Regional Banks (KRE) 15% Nasdaq 100 (QQQ) 10% Healthcare Sector (XLV) 10%

xAI Grok 4.3

Nasdaq 100 (QQQ) 40% Semiconductors (SMH) 35% US Small-Cap Value (IWN) 25%

Shared top pick: Semiconductors (SMH). Averages across 5 frozen model portfolios.

Top 3: 56%. Spread: 6.5 assets.

Semiconductors (SMH) 32%

Healthcare Sector (XLV) 14%

Nasdaq 100 (QQQ) 10%

Biotechnology (XBI) 8%



Results

Monthly And Weekly Are Separate

A 1-month round and a 7-day round are different contests. They get separate scores and separate overall results.

Monthly 1-month round
Weekly 7-day round
No mixing Scores stay separate

Current score state

Monthly: 6 completed / 17 open. Latest: CB-2026-06-01-1M
Weekly: 17 completed / 5 open. Latest: CB-2026-06-22-1W

Monthly track

Monthly Results

6 completed / 17 live

Current benchmark leader: Claude Opus 4.8, 2.9 score over 3 shared rounds

Latest scored: CB-2026-06-01-1M. Live round: CB-2026-06-29-1M. Next score: after Jul 29 close.

Weekly track

Weekly Results

17 completed / 5 live

Current benchmark leader: Claude Opus 4.8, -12.5 score over 15 shared rounds

Latest scored: CB-2026-06-22-1W. Live round: CB-2026-06-29-1W. Next score: after Jul 6 close.

Live benchmark tests

Live Benchmark Tests

These are the open tests you can inspect now. Models already submitted portfolios; official scores wait for final closing prices.

Monthly test

One-month test

Live now

Longer test of AI allocation over one month.

Portfolios locked; scoring pending.

Model portfolios: 5. Eligible assets: 70. Risk-taking score: 76.3/100. Top consensus: Semiconductors (SMH), 25% average weight.

Weekly test

One-week test

Live now

Short-term test of AI positioning over one market week.

Portfolios locked; scoring pending.

Model portfolios: 5. Eligible assets: 70. Risk-taking score: 79.3/100. Top consensus: Semiconductors (SMH), 32% average weight.

Internal IDs and full reproducibility files are inside each audit packet.

Scoring calendar

Current Scoring Calendar

Models have already picked portfolios. Official scores publish only after the market window ends and final closing prices are available.

Monthly test One-month test

Live now

5 model portfolios locked and waiting for official scoring.

Locked: Jun 30
Market window: Jun 29 close to Jul 29 close
Official score: After Jul 29 close


Weekly test One-week test

Live now

5 model portfolios locked and waiting for official scoring.

Locked: Jun 30
Market window: Jun 29 close to Jul 6 close
Official score: After Jul 6 close

45 official rounds recorded. Internal round and run IDs stay in the public audit trail.

Audit packet

Check The Public Audit Trail

Round pages show the report, prompt, model portfolios, starting prices, source reports, hashes, and result status behind each public benchmark round.


Why it is fair

Simple Rules, Public Audit Trail

CapitalBench keeps the comparison narrow: same report, same asset list, frozen portfolios, and no final result before the round ends.

Same rules

One frozen portfolio per model

The public score uses the saved portfolio, not private retries or experiments.

Same choices

70 current assets

Each round keeps the exact asset list, report, model output, starting prices, and audit hashes.

No early winner

22 live rounds waiting for results

Final results appear only after ending prices are available.
