Model Profile

DeepSeek-V3.1

Name: DeepSeek-V3.1
Rating: 0.8 (56 reviews)
Author: deepseek-ai

4,096 ctx · Open weights

Use this page to decide where this model is a strong fit. The rankings below are grouped by use case and backed by benchmarks, each with an explicit confidence value and contributor metrics.

Identity

ID: deepseek-ai/DeepSeek-V3.1

Origin: huggingface_catalog

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 11.0%

Evidence points: 56

Raw rows: 23

Weighted rows: 13

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 123,230

Intelligence Profile

Dimension Breakdown

IQ: 0 benchmarks

No IQ benchmarks found

Insufficient data

EQ: 0 benchmarks

No EQ benchmarks found

Insufficient data

Accuracy: 1 benchmark

82.9%*

Creativity: 0 benchmarks

No creativity benchmarks found

Insufficient data

Based: 0 benchmarks

No based benchmarks found

Insufficient data

* Low confidence — limited benchmark evidence for this dimension

1/5 dimensions scored · Last updated Apr 2, 2026

Benchmark Signals

Each entry below names the benchmark source behind this model profile.

Vectara HHEM Leaderboard

overall_hallucination_error_pct

3.3%

Normalized value 82.9% · confidence 100.0%

Strongest impact in Knowledge base Q&A (fast, no citations)

vectara_hhem_leaderboard.overall_hallucination_error_pct · Apr 1, 2026

Vectara HHEM Leaderboard

overall_answer_rate_pct

1.9%

Normalized value 85.3% · confidence 100.0%

Strongest impact in Agent-assist reply suggestions

vectara_hhem_leaderboard.overall_answer_rate_pct · Apr 1, 2026

Vectara HHEM Leaderboard

science_hallucination_error_pct

1.7%

Normalized value 92.2% · confidence 100.0%

Strongest impact in Literature synthesis with citations

vectara_hhem_leaderboard.science_hallucination_error_pct · Apr 1, 2026

Vectara HHEM Leaderboard

law_hallucination_error_pct

1.4%

Normalized value 91.2% · confidence 100.0%

Strongest impact in Contract Q&A (RAG grounded)

vectara_hhem_leaderboard.law_hallucination_error_pct · Apr 1, 2026

SimpleQA Verified

simpleqa_verified_score_pct

0.9%

Normalized value 26.1% · confidence 100.0%

Strongest impact in Knowledge base Q&A (fast, no citations)

simpleqa_verified.simpleqa_verified_score_pct · Apr 1, 2026

Vectara HHEM Leaderboard

finance_hallucination_error_pct

0.9%

Normalized value 63.6% · confidence 100.0%

Strongest impact in Thesis red teaming

vectara_hhem_leaderboard.finance_hallucination_error_pct · Apr 1, 2026

All fit rows have limited benchmark evidence: 12 of 12 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

Status: actively scored

Use-Case Scores: 12

Total Measurements: 23

Weighted Measurements: 13

Weighted Sources: 2

Raw Source Coverage

vectara_hhem_leaderboard 21 · simpleqa_verified 2

Weighted Source Coverage

vectara_hhem_leaderboard 12 · simpleqa_verified 1
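
The per-source counts above can be cross-checked against the "Raw rows" and "Weighted rows" figures in Benchmark Coverage. The following is a minimal sketch, not the catalog's own method: the function name `coverage_summary` is hypothetical, and the "retained" ratio assumes that weighted rows are a subset of the raw rows from the same source, which this page does not state.

```python
# Per-source row counts as listed on this page.
raw_coverage = {"vectara_hhem_leaderboard": 21, "simpleqa_verified": 2}
weighted_coverage = {"vectara_hhem_leaderboard": 12, "simpleqa_verified": 1}


def coverage_summary(raw, weighted):
    """Return (raw total, weighted total, per-source weighted/raw ratio).

    Assumption: weighted rows are drawn from the raw rows of the same source.
    """
    raw_total = sum(raw.values())
    weighted_total = sum(weighted.values())
    retained = {src: weighted.get(src, 0) / n for src, n in raw.items()}
    return raw_total, weighted_total, retained


raw_total, weighted_total, retained = coverage_summary(raw_coverage, weighted_coverage)
print(raw_total, weighted_total)  # totals to compare with "Raw rows" and "Weighted rows"
for source, ratio in retained.items():
    print(f"{source}: {ratio:.0%} of raw rows carry weight")
```

Under these counts the totals come to 23 raw and 13 weighted rows, which agrees with the Benchmark Coverage section.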

Best Use Cases for This Model

Use Case	Use Case ID	Vertical	Score	Confidence	Evidence	Top Contributor
Knowledge base Q&A (fast, no citations)	use_case.business.kb_qna_fast	business_productivity	8.2%	12.6%	5	Vectara HHEM Leaderboard: overall_hallucination_error_pct
Literature synthesis with citations	use_case.bio.literature_synthesis	biomed_science	7.9%	11.3%	5	Vectara HHEM Leaderboard: overall_hallucination_error_pct
Cross-paper contradiction analysis	use_case.bio.paper_contradictions	biomed_science	7.9%	11.3%	5	Vectara HHEM Leaderboard: overall_hallucination_error_pct
Contract Q&A (RAG grounded)	use_case.legal.contract_qna	legal	7.6%	11.0%	5	Vectara HHEM Leaderboard: overall_hallucination_error_pct
Regulatory summary	use_case.legal.regulatory_summary	legal	7.4%	10.8%	5	Vectara HHEM Leaderboard: overall_hallucination_error_pct
Knowledge base Q&A (with citations)	use_case.business.kb_qna_with_citations	business_productivity	7.3%	11.2%	5	Vectara HHEM Leaderboard: overall_hallucination_error_pct
Contract redline summary	use_case.legal.contract_redline_summary	legal	7.1%	10.3%	5	Vectara HHEM Leaderboard: overall_hallucination_error_pct
Agent-assist reply suggestions	use_case.cx.agent_assist_replies	customer_experience	7.1%	11.0%	3	Vectara HHEM Leaderboard: overall_hallucination_error_pct
Support dialogue agent	use_case.cx.support_dialogue_agent	customer_experience	7.0%	10.9%	3	Vectara HHEM Leaderboard: overall_hallucination_error_pct
Clause playbook check	use_case.legal.playbook_clause_check	legal	6.9%	10.0%	5	Vectara HHEM Leaderboard: overall_hallucination_error_pct
Contract term extraction	use_case.legal.contract_term_extraction	legal	6.9%	10.0%	5	Vectara HHEM Leaderboard: overall_hallucination_error_pct
Thesis red teaming	use_case.fin.thesis_red_team	finance	6.8%	11.3%	5	Vectara HHEM Leaderboard: overall_hallucination_error_pct
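
The table can be reconciled with the headline figures in Benchmark Coverage ("Avg confidence" and "Evidence points"). The sketch below is illustrative only: the `ROWS` tuples are copied by hand from the table, `summarize` is a hypothetical helper, and treating the average as a plain unweighted mean is an assumption about how the page derives it.

```python
from statistics import mean

# (use case ID, score %, confidence %, evidence count), copied from the table above.
ROWS = [
    ("use_case.business.kb_qna_fast", 8.2, 12.6, 5),
    ("use_case.bio.literature_synthesis", 7.9, 11.3, 5),
    ("use_case.bio.paper_contradictions", 7.9, 11.3, 5),
    ("use_case.legal.contract_qna", 7.6, 11.0, 5),
    ("use_case.legal.regulatory_summary", 7.4, 10.8, 5),
    ("use_case.business.kb_qna_with_citations", 7.3, 11.2, 5),
    ("use_case.legal.contract_redline_summary", 7.1, 10.3, 5),
    ("use_case.cx.agent_assist_replies", 7.1, 11.0, 3),
    ("use_case.cx.support_dialogue_agent", 7.0, 10.9, 3),
    ("use_case.legal.playbook_clause_check", 6.9, 10.0, 5),
    ("use_case.legal.contract_term_extraction", 6.9, 10.0, 5),
    ("use_case.fin.thesis_red_team", 6.8, 11.3, 5),
]


def summarize(rows):
    """Return (mean confidence rounded to 0.1, total evidence, top-scoring ID).

    Assumption: the page's average confidence is an unweighted mean over rows.
    """
    avg_confidence = round(mean(row[2] for row in rows), 1)
    total_evidence = sum(row[3] for row in rows)
    top_use_case = max(rows, key=lambda row: row[1])[0]
    return avg_confidence, total_evidence, top_use_case


avg_confidence, total_evidence, top_use_case = summarize(ROWS)
print(avg_confidence, total_evidence, top_use_case)
```

With these rows the mean confidence rounds to 11.0 and the evidence counts sum to 56, matching the figures reported under Benchmark Coverage.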