Model Profile

Qwen3-235B-A22B-Thinking-2507

Name: Qwen3-235B-A22B-Thinking-2507
Rating: 0.9 (54 reviews)
Author: Qwen

4,096 ctx · Open weights

Use this page to decide where this model is a strong fit. The rankings below are backed by benchmarks for each use case and report explicit confidence and contributor metrics.

Identity

ID: Qwen/Qwen3-235B-A22B-Thinking-2507

Author: Qwen

Origin: huggingface_catalog

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 14.5%

Evidence points: 54

Raw rows: 114

Weighted rows: 15

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 50,867

Intelligence Profile

Dimension Breakdown

IQ (2 benchmarks): 74.0%*

EQ (0 benchmarks): Insufficient data (no EQ benchmarks found)

Accuracy (1 benchmark): Insufficient data (no accuracy benchmarks found)

Creativity (2 benchmarks): 35.4%*

Based (1 benchmark): 29.0%*

* Low confidence — limited benchmark evidence for this dimension

3/5 dimensions scored · Last updated Apr 21, 2026

Benchmark Signals

Galileo Agent Leaderboard v2

Avg TSQ

3.8%

Normalized value 76.9% · confidence 100.0%

Strongest impact in Social post generation

galileo_agent_v2.avg_tsq · Apr 1, 2026

Galileo Agent Leaderboard v2

Insurance TSQ

2.5%

Normalized value 80.0% · confidence 100.0%

Strongest impact in FNOL structuring

galileo_agent_v2.insurance_tsq · Apr 1, 2026

Galileo Agent Leaderboard v2

Insurance AC

1.9%

Normalized value 41.8% · confidence 100.0%

Strongest impact in FNOL structuring

galileo_agent_v2.insurance_ac · Apr 1, 2026

SimpleQA Verified

simpleqa_verified_score_pct

1.8%

Normalized value 67.0% · confidence 100.0%

Strongest impact in Thesis red teaming

simpleqa_verified.simpleqa_verified_score_pct · Apr 1, 2026

SciArena Leaderboard

rating_elo

1.5%

Normalized value 62.5% · confidence 100.0%

Strongest impact in Route alternatives planning

sciarena_leaderboard.rating_elo · Apr 1, 2026

Galileo Agent Leaderboard v2

Banking AC

1.2%

Normalized value 63.3% · confidence 100.0%

Strongest impact in Thesis red teaming

galileo_agent_v2.banking_ac · Apr 1, 2026

All fit rows have limited benchmark evidence: 12 of 12 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

actively scored

Use-Case Scores: 12

Total Measurements: 114

Weighted Measurements: 15

Weighted Sources: 4

Raw Source Coverage

ugi_main: 60 · galileo_agent_v2: 34 · sciarena_leaderboard: 7 · livebench_official: 6 · openrouter_models: 3 · simpleqa_verified: 2

Weighted Source Coverage

galileo_agent_v2: 10 · ugi_main: 3 · sciarena_leaderboard: 1 · simpleqa_verified: 1
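
The per-source counts above can be cross-checked against the totals reported under Benchmark Coverage. The sketch below is only an illustration: the dictionary names are hypothetical, and it assumes the reported totals are plain sums of the per-source counts, which the page does not state.

```python
# Per-source row counts copied from the coverage lists on this page.
# Variable names are illustrative, not part of any catalog API.
raw_sources = {
    "ugi_main": 60,
    "galileo_agent_v2": 34,
    "sciarena_leaderboard": 7,
    "livebench_official": 6,
    "openrouter_models": 3,
    "simpleqa_verified": 2,
}
weighted_sources = {
    "galileo_agent_v2": 10,
    "ugi_main": 3,
    "sciarena_leaderboard": 1,
    "simpleqa_verified": 1,
}

# Totals reported elsewhere on the page (Raw rows / Weighted rows).
reported_raw_rows = 114
reported_weighted_rows = 15

raw_total = sum(raw_sources.values())
weighted_total = sum(weighted_sources.values())

print(f"raw: listed {raw_total}, reported {reported_raw_rows}")
print(f"weighted: listed {weighted_total}, reported {reported_weighted_rows}")
```

Under that assumption the weighted counts add up to the reported 15 rows, while the listed raw counts add up to 112, two fewer than the reported 114. The page does not explain the difference; it may come from sources that are not itemized.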

Best Use Cases for This Model

Use Case	Use Case ID	Vertical	Score	Confidence	Evidence	Top Contributor
Claims summary	use_case.ins.claims_summary	insurance	9.5%	16.7%	5	Galileo Agent Leaderboard v2: Insurance TSQ
Campaign brief	use_case.mkt.campaign_brief	marketing_sales	9.2%	15.7%	4	Galileo Agent Leaderboard v2: Avg TSQ
Product positioning and messaging	use_case.mkt.product_positioning	marketing_sales	9.2%	15.7%	4	Galileo Agent Leaderboard v2: Avg TSQ
Social post generation	use_case.mkt.social_post_generation	marketing_sales	9.2%	15.7%	4	Galileo Agent Leaderboard v2: Avg TSQ
Litigation risk memo	use_case.ins.litigation_risk_memo	insurance	8.6%	14.6%	6	Galileo Agent Leaderboard v2: Insurance TSQ
Ad copy variants	use_case.mkt.ad_copy_variants	marketing_sales	8.6%	14.9%	4	Galileo Agent Leaderboard v2: Avg TSQ
Personalized sales outreach	use_case.mkt.sales_outreach_personalized	marketing_sales	8.6%	14.9%	4	Galileo Agent Leaderboard v2: Avg TSQ
FNOL structuring	use_case.ins.fnol_structuring	insurance	8.5%	15.0%	5	Galileo Agent Leaderboard v2: Insurance TSQ
Route alternatives planning	use_case.sc.route_alternatives_planning	supply_chain	8.0%	12.7%	3	Galileo Agent Leaderboard v2: Avg TSQ
Thesis red teaming	use_case.fin.thesis_red_team	finance	8.0%	13.9%	7	SimpleQA Verified: simpleqa_verified_score_pct
Executive briefing	use_case.business.exec_briefing	business_productivity	7.9%	12.2%	4	Galileo Agent Leaderboard v2: Avg TSQ
Search query rewriting	use_case.business.search_query_rewrite	business_productivity	7.9%	12.2%	4	Galileo Agent Leaderboard v2: Avg TSQ
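
The summary figures under Benchmark Coverage can be recomputed from the table above. This is a minimal sketch, assuming "Avg confidence" is the unweighted mean of the Confidence column and "Evidence points" is the sum of the Evidence column; the page does not confirm either definition, and the variable names are hypothetical.

```python
from statistics import mean

# (use case ID, confidence in percent, evidence count), copied from the table.
rows = [
    ("use_case.ins.claims_summary", 16.7, 5),
    ("use_case.mkt.campaign_brief", 15.7, 4),
    ("use_case.mkt.product_positioning", 15.7, 4),
    ("use_case.mkt.social_post_generation", 15.7, 4),
    ("use_case.ins.litigation_risk_memo", 14.6, 6),
    ("use_case.mkt.ad_copy_variants", 14.9, 4),
    ("use_case.mkt.sales_outreach_personalized", 14.9, 4),
    ("use_case.ins.fnol_structuring", 15.0, 5),
    ("use_case.sc.route_alternatives_planning", 12.7, 3),
    ("use_case.fin.thesis_red_team", 13.9, 7),
    ("use_case.business.exec_briefing", 12.2, 4),
    ("use_case.business.search_query_rewrite", 12.2, 4),
]

# Assumed definitions: simple mean of confidence, simple sum of evidence.
avg_confidence = mean(conf for _, conf, _ in rows)
total_evidence = sum(evidence for _, _, evidence in rows)

print(f"scored use cases: {len(rows)}")            # 12
print(f"avg confidence: {avg_confidence:.1f}%")    # 14.5%
print(f"evidence points: {total_evidence}")        # 54
```

Under these assumptions the results agree with the reported 12 scored use cases, 14.5% average confidence, and 54 evidence points.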