Model Profile

DeepSeek-V3

Name: DeepSeek-V3
Rating: 0.7 (20 reviews)
Author: deepseek-ai

4,096 ctx · Open weights

Use this page to decide where this model is a strong fit. The rankings below are backed by benchmarks and grouped by use case, each with an explicit confidence value and contributor metrics.

Identity

ID: deepseek-ai/DeepSeek-V3

Author: deepseek-ai

Origin: huggingface_catalog

Arch: unknown

Benchmark Coverage

Scored use cases: 4

Avg confidence: 10.4%

Evidence points: 20

Raw rows: 13

Weighted rows: 6

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 1,295,783

Intelligence Profile

Dimension Breakdown

IQ · 1 benchmark

63.2%*

EQ · 0 benchmarks

No EQ benchmarks found

Insufficient data

Accuracy · 0 benchmarks

No accuracy benchmarks found

Insufficient data

Creativity · 0 benchmarks

No creativity benchmarks found

Insufficient data

Based · 0 benchmarks

No based benchmarks found

Insufficient data

* Low confidence — limited benchmark evidence for this dimension

1/5 dimensions scored · Last updated Apr 21, 2026

Benchmark Signals

LEXam Leaderboard

average_score_pct

3.1%

Normalized value 63.2% · confidence 100.0%

Strongest impact in Contract Drafting & Redlining

lexam_leaderboard.average_score_pct · Mar 31, 2026

LEXam Leaderboard

open_question_judge_score_pct

1.4%

Normalized value 68.1% · confidence 100.0%

Strongest impact in Contract redline summary

lexam_leaderboard.open_question_judge_score_pct · Mar 31, 2026

SciArena Leaderboard

rating_elo

0.6%

Normalized value 51.8% · confidence 100.0%

Strongest impact in Contract redline summary

sciarena_leaderboard.rating_elo · Apr 1, 2026

LEXam Leaderboard

mcq_accuracy_pct

0.5%

Normalized value 59.1% · confidence 100.0%

Strongest impact in Contract redline summary

lexam_leaderboard.mcq_accuracy_pct · Mar 31, 2026

Aider Polyglot Leaderboard

percent_correct_pct

0.2%

Normalized value 60.4% · confidence 100.0%

Strongest impact in Contract redline summary

aider_polyglot.percent_correct_pct · Apr 1, 2026

Fit rows have limited benchmark evidence: 4 of 4 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

actively scored

Use-Case Scores

Total Measurements

Weighted Measurements

Weighted Sources

Raw Source Coverage

sciarena_leaderboard 7 · aider_polyglot 3 · lexam_leaderboard 3

Weighted Source Coverage

lexam_leaderboard 3 · aider_polyglot 2 · sciarena_leaderboard 1
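
The per-source counts above appear to add up to the Raw rows and Weighted rows totals reported under Benchmark Coverage (13 and 6). The short sketch below checks that arithmetic. That the page derives its totals this way is an assumption, and the variable names are hypothetical.

```python
# Per-source row counts as listed under Raw and Weighted Source Coverage.
raw_coverage = {
    "sciarena_leaderboard": 7,
    "aider_polyglot": 3,
    "lexam_leaderboard": 3,
}
weighted_coverage = {
    "lexam_leaderboard": 3,
    "aider_polyglot": 2,
    "sciarena_leaderboard": 1,
}

# Assumption: the page-level totals are plain sums over sources.
raw_total = sum(raw_coverage.values())
weighted_total = sum(weighted_coverage.values())
weighted_source_count = len(weighted_coverage)

print(f"raw rows: {raw_total}, weighted rows: {weighted_total}, "
      f"weighted sources: {weighted_source_count}")
```

Under that assumption, fewer than half of the raw rows carry weight in the scores, and most of the unweighted rows come from sciarena_leaderboard (7 raw, 1 weighted).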

Best Use Cases for This Model

Use Case	Use Case ID	Vertical	Score	Confidence	Evidence	Top Contributor
Contract Drafting & Redlining	use_case.legal.contract_drafting	legal	7.0%	11.1%	5	LEXam Leaderboard: average_score_pct
Contract redline summary	use_case.legal.contract_redline_summary	legal	6.4%	10.3%	5	LEXam Leaderboard: average_score_pct
Contract term extraction	use_case.legal.contract_term_extraction	legal	6.2%	10.0%	5	LEXam Leaderboard: average_score_pct
Clause playbook check	use_case.legal.playbook_clause_check	legal	6.2%	10.0%	5	LEXam Leaderboard: average_score_pct
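
The Avg confidence of 10.4% reported under Benchmark Coverage looks consistent with the unweighted mean of the four Confidence values in this table. The sketch below reproduces that arithmetic. Both the unweighted mean and the half-up rounding are assumptions about how the page computes the figure, and the displayed inputs are themselves already rounded.

```python
from decimal import Decimal, ROUND_HALF_UP

# Confidence column from the use-case table, in percent.
confidences = [
    Decimal("11.1"),  # Contract Drafting & Redlining
    Decimal("10.3"),  # Contract redline summary
    Decimal("10.0"),  # Contract term extraction
    Decimal("10.0"),  # Clause playbook check
]

# Assumption: a simple unweighted mean across scored use cases.
mean_confidence = sum(confidences) / len(confidences)

# Assumption: rounded half-up to one decimal place for display.
displayed = mean_confidence.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

print(f"mean: {mean_confidence}%, displayed: {displayed}%")
```

Decimal is used instead of float so that the exact midpoint 10.35 rounds predictably.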