BasedAGI

Model Profile

deepseek-v3

External Benchmark Shadow · external_benchmark_shadow · public
4,096 ctx

Use this page to decide where this model is a strong fit. The rankings below are backed by benchmarks for each use case and are reported with explicit confidence and contributor metrics.

Identity

ID: external/deepseek-ai/deepseek-v3

Author: deepseek-ai

Origin: external_benchmark_shadow

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 20.5%

Evidence points: 124

Raw rows: 124

Weighted rows: 34

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 0

Intelligence Profile

IQ: 42%* · EQ: no data · Accuracy: 43%* · Creativity: no data · Based: 72%*

Dimension Breakdown

IQ: 3 benchmarks, 41.8%*

EQ: 0 benchmarks, insufficient data

Accuracy: 1 benchmark, 43.4%*

Creativity: 0 benchmarks, insufficient data

Based: 2 benchmarks, 72.1%*

* Low confidence — limited benchmark evidence for this dimension

3/5 dimensions scored · Last updated Apr 21, 2026

Benchmark Signals


Some fit rows have limited benchmark evidence.

11 of 12 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

actively scored

Use-Case Scores: 135

Total Measurements: 124

Weighted Measurements: 34

Weighted Sources: 12

Raw Source Coverage

galileo_agent_v2: 34, vectara_hhem_leaderboard: 21, mmlu_pro_leaderboard: 15, artifactsbenchmark_leaderboard: 11, baxbench_leaderboard: 9, bigcodebench_official: 8

Weighted Source Coverage

vectara_hhem_leaderboard: 12, galileo_agent_v2: 10, bigcodebench_official: 3, baxbench_leaderboard: 1, bird_critic: 1, icelandic_llm_leaderboard: 1
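
The coverage figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: it uses the totals and per-source counts shown on this page, and it assumes the two source lists are partial (the page appears to show only six sources each), which would explain why the listed counts do not add up to the totals. The variable and function names are hypothetical, not part of any BasedAGI API.

```python
# Totals copied from this page.
RAW_ROWS = 124
WEIGHTED_ROWS = 34

# Per-source counts as listed on this page (assumed to be partial lists).
raw_by_source = {
    "galileo_agent_v2": 34,
    "vectara_hhem_leaderboard": 21,
    "mmlu_pro_leaderboard": 15,
    "artifactsbenchmark_leaderboard": 11,
    "baxbench_leaderboard": 9,
    "bigcodebench_official": 8,
}
weighted_by_source = {
    "vectara_hhem_leaderboard": 12,
    "galileo_agent_v2": 10,
    "bigcodebench_official": 3,
    "baxbench_leaderboard": 1,
    "bird_critic": 1,
    "icelandic_llm_leaderboard": 1,
}


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


raw_listed = sum(raw_by_source.values())
weighted_listed = sum(weighted_by_source.values())

# Share of raw rows that carry weight in the scores.
weighted_share = share(WEIGHTED_ROWS, RAW_ROWS)
# Share of each total accounted for by the sources listed on the page.
raw_listed_share = share(raw_listed, RAW_ROWS)
weighted_listed_share = share(weighted_listed, WEIGHTED_ROWS)

print(f"weighted rows / raw rows: {weighted_share}%")
print(f"listed raw rows: {raw_listed} of {RAW_ROWS} ({raw_listed_share}%)")
print(f"listed weighted rows: {weighted_listed} of {WEIGHTED_ROWS} ({weighted_listed_share}%)")
```

Under these figures, roughly a quarter of the raw rows contribute weight to the scores, and two sources (vectara_hhem_leaderboard and galileo_agent_v2) supply most of the weighted rows that are listed.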

Best Use Cases for This Model

Use Case (ID): Score

Simulation setup assistant (use_case.eng.simulation_setup_assistant): 14.6%

Component selection assistant (use_case.eng.component_selection): 14.4%

Verilog/VHDL generation (use_case.eda.verilog_generation): 13.8%

Integration test generation (use_case.dev.integration_tests): 13.7%

Litigation risk memo (use_case.ins.litigation_risk_memo): 13.4%

Thesis red teaming (use_case.fin.thesis_red_team): 13.2%

Text-to-SQL analyst assistant (use_case.data.text_to_sql): 12.9%

Contract Q&A (RAG grounded) (use_case.legal.contract_qna): 12.6%

Knowledge base Q&A (fast, no citations) (use_case.business.kb_qna_fast): 12.4%

Regulatory summary (use_case.legal.regulatory_summary): 12.3%

Runbook step assistant (use_case.sre.runbook_steps): 12.2%

Knowledge base Q&A (with citations) (use_case.business.kb_qna_with_citations): 12.0%