BasedAGI

Model Profile

anthropic/claude-sonnet-4-5-20250929-thinking

External Benchmark Shadow · external_benchmark_shadow · public
4,096 ctx

Use this page to decide where this model is a strong fit. The rankings below are backed by benchmarks and listed by use case, with explicit confidence and contributor metrics.

Identity

ID: external/anthropic/claude-sonnet-4-5-20250929-thinking

Author: anthropic

Origin: external_benchmark_shadow

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 26.6%

Evidence points: 211

Raw rows: 465

Weighted rows: 20

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 0

Intelligence Profile

IQ: 73%; EQ, Accuracy, Creativity, and Based: not scored

Dimension Breakdown

IQ: 24 benchmarks, 73.1%

EQ: 0 benchmarks (no EQ benchmarks found; insufficient data)

Accuracy: 0 benchmarks (no accuracy benchmarks found; insufficient data)

Creativity: 0 benchmarks (no creativity benchmarks found; insufficient data)

Based: 0 benchmarks (no based benchmarks found; insufficient data)

1/5 dimensions scored · Last updated Apr 19, 2026

Benchmark Signals

Some fit rows have limited benchmark evidence.

5 of 12 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

actively scored

Use-Case Scores: 120

Total Measurements: 465

Weighted Measurements: 20

Weighted Sources: 14

Raw Source Coverage

vals_mmlu_pro 60 · vals_mgsm 48 · vals_finance_agent 40 · vals_multimodal_index 32 · corpfin_taxeval_public 28 · vals_medqa 28

Weighted Source Coverage

vals_finance_agent 5 · vals_corp_fin_v2 3 · hle_leaderboard 1 · vals_case_law_v2 1 · vals_gpqa 1 · vals_lcb 1
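The coverage figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the numbers are copied from this page, the variable names are hypothetical, and the reading that each coverage list shows only its largest sources is an assumption, not something the page states.

```python
# Figures copied from the profile; names are illustrative, not a BasedAGI API.
total_measurements = 465     # "Total Measurements" / "Raw rows"
weighted_measurements = 20   # "Weighted Measurements" / "Weighted rows"
weighted_sources = 14        # "Weighted Sources"

raw_listed = {
    "vals_mmlu_pro": 60,
    "vals_mgsm": 48,
    "vals_finance_agent": 40,
    "vals_multimodal_index": 32,
    "corpfin_taxeval_public": 28,
    "vals_medqa": 28,
}
weighted_listed = {
    "vals_finance_agent": 5,
    "vals_corp_fin_v2": 3,
    "hle_leaderboard": 1,
    "vals_case_law_v2": 1,
    "vals_gpqa": 1,
    "vals_lcb": 1,
}

# Fraction of raw rows that end up carrying weight in the scores.
weighted_share = weighted_measurements / total_measurements

# Rows accounted for by the sources actually listed on the page.
raw_listed_total = sum(raw_listed.values())
weighted_listed_total = sum(weighted_listed.values())

# Rows and sources not shown in the lists (assumed to be truncated lists).
raw_unlisted = total_measurements - raw_listed_total
weighted_unlisted = weighted_measurements - weighted_listed_total
unlisted_weighted_sources = weighted_sources - len(weighted_listed)

print(f"weighted share of raw rows: {weighted_share:.1%}")
print(f"raw rows listed: {raw_listed_total} of {total_measurements}")
print(f"weighted rows listed: {weighted_listed_total} of {weighted_measurements}")
print(f"weighted sources not listed: {unlisted_weighted_sources}")
```

Under these assumptions, only a small fraction of the raw measurements contributes weight to the scores, which is consistent with the low average confidence reported under Benchmark Coverage.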

Best Use Cases for This Model

Use Case (ID): Score

Thesis red teaming (use_case.fin.thesis_red_team): 27.0%
Accounts payable invoice extraction (text) (use_case.fin.ap_invoice_extraction): 25.7%
Earnings call synthesis (use_case.fin.earnings_call_synthesis): 24.4%
Transaction anomaly narrative (use_case.fin.transaction_anomaly_narrative): 23.9%
KYC profile synthesis (use_case.fin.kyc_profile_synthesis): 23.0%
AML alert triage (use_case.fin.aml_alert_triage): 23.0%
Filings summarization (10-K/10-Q) (use_case.fin.filings_summarization): 21.3%
Quant research code generation (use_case.fin.alpha_research_codegen): 18.8%
Simulation setup assistant (use_case.eng.simulation_setup_assistant): 17.5%
Component selection assistant (use_case.eng.component_selection): 17.0%
Contract Drafting & Redlining (use_case.legal.contract_drafting): 15.4%
Literature synthesis with citations (use_case.bio.literature_synthesis): 15.3%
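To compare domains rather than individual use cases, the scores can be grouped by the second segment of each use-case ID. This is a minimal sketch: the scores are copied from the list above, and treating the second dot-separated segment (`fin`, `eng`, `legal`, `bio`) as a domain is an assumption read off the ID pattern, not a documented scheme. The unweighted mean also ignores the per-use-case confidence, which this page does not list.

```python
from collections import defaultdict
from statistics import mean

# Scores (percent) copied from the use-case list on this page.
scores = {
    "use_case.fin.thesis_red_team": 27.0,
    "use_case.fin.ap_invoice_extraction": 25.7,
    "use_case.fin.earnings_call_synthesis": 24.4,
    "use_case.fin.transaction_anomaly_narrative": 23.9,
    "use_case.fin.kyc_profile_synthesis": 23.0,
    "use_case.fin.aml_alert_triage": 23.0,
    "use_case.fin.filings_summarization": 21.3,
    "use_case.fin.alpha_research_codegen": 18.8,
    "use_case.eng.simulation_setup_assistant": 17.5,
    "use_case.eng.component_selection": 17.0,
    "use_case.legal.contract_drafting": 15.4,
    "use_case.bio.literature_synthesis": 15.3,
}


def domain_of(use_case_id: str) -> str:
    """Return the second dot-separated segment, assumed to name the domain."""
    return use_case_id.split(".")[1]


# Collect scores per assumed domain.
by_domain = defaultdict(list)
for use_case_id, score in scores.items():
    by_domain[domain_of(use_case_id)].append(score)

# Unweighted mean score per domain, rounded to two decimals.
domain_means = {d: round(mean(v), 2) for d, v in by_domain.items()}

# Domains ordered from highest to lowest mean score.
ranked = sorted(domain_means.items(), key=lambda kv: kv[1], reverse=True)

for domain, avg in ranked:
    print(f"{domain}: {avg}% over {len(by_domain[domain])} use case(s)")
```

Domains with a single scored use case rest on one data point, so their means are the least reliable part of such a comparison.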