Model Profile
Qwen2-72B-Instruct
Use this page to decide where this model is a strong fit. The rankings below are backed by benchmarks for each use case and report explicit confidence and contributor metrics.
Identity
ID: Qwen/Qwen2-72B-Instruct
Author: Qwen
Origin: huggingface_catalog
Arch: unknown
Benchmark Coverage
Scored use cases: 2
Avg confidence: 11.8%
Evidence points: 11
Raw rows: 39
Weighted rows: 8
Catalog Metadata
Parameters: unknown
Context window: 4096
Downloads: 34,779
Intelligence Profile
Dimension Breakdown
No accuracy benchmarks found
No creativity benchmarks found
No based benchmarks found
* Low confidence — limited benchmark evidence for this dimension
2/5 dimensions scored · Last updated Apr 21, 2026
Benchmark Signals
LM Arena Leaderboard Table (Latest)
MT-bench (score)
Normalized value 97.0% · confidence 100.0%
Strongest impact in Historical document summarization
lmarena_table_latest.mt_bench_score · Apr 1, 2026
EQ-Bench Leaderboard
eq_bench_score
Normalized value 90.7% · confidence 100.0%
Strongest impact in Archaic and historical translation
eq_bench.eq_bench_score · Apr 1, 2026
Icelandic LLM Leaderboard
Average
Normalized value 26.5% · confidence 100.0%
Strongest impact in Archaic and historical translation
icelandic_llm_leaderboard.average · Mar 31, 2026
LM Arena Hard Auto v0.1
score
Normalized value 54.9% · confidence 100.0%
Strongest impact in Archaic and historical translation
lmarena_arena_hard_v01.score · Apr 1, 2026
Aider Code Editing Leaderboard
percent_correct_pct
Normalized value 55.2% · confidence 100.0%
Strongest impact in Archaic and historical translation
aider_code_editing.percent_correct_pct · Apr 1, 2026
Multilingual MMLU Benchmark
mmmlu
Normalized value 0.5% · confidence 100.0%
Strongest impact in Archaic and historical translation
multilingual_mmlu_leaderboard.mmmlu · Apr 1, 2026
Some fit rows have limited benchmark evidence.
2 of 2 scored use cases have low confidence or thin contributor coverage.
Coverage Diagnostics
Use-case scores (actively scored): 2
Total measurements: 39
Weighted measurements: 8
Weighted sources: 6
Best Use Cases for This Model
| Use Case | ID | Score |
|---|---|---|
| Historical document summarization | use_case.history.historical_doc_summarization | 3.2% |
| Archaic and historical translation | use_case.history.archaic_translation | 2.0% |