BasedAGI

Model Profile

openai/gpt-4.1

External Benchmark Shadow · external_benchmark_shadow · public
4,096 ctx

Use this page to decide where this model is a strong fit. The rankings below are grouped by use case and backed by benchmarks, each with explicit confidence and contributor metrics.

Identity

ID: external/openai/gpt-4-1

Author: openai

Origin: external_benchmark_shadow

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 23.5%

Evidence points: 178

Raw rows: 87

Weighted rows: 25

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 0

Price / 1M tokens: $3.50 (blended 3:1)
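The blended figure above can be reproduced with a weighted average. The sketch below assumes that "3:1" means three input tokens for every output token, and it assumes per-direction prices of $2.00 (input) and $8.00 (output) per 1M tokens; neither the ratio's direction nor those two prices are stated on this page, and the function name `blended_price` is hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output mix."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Assumed prices per 1M tokens (not taken from this page).
ASSUMED_INPUT_PRICE = 2.00
ASSUMED_OUTPUT_PRICE = 8.00

price = blended_price(ASSUMED_INPUT_PRICE, ASSUMED_OUTPUT_PRICE)
print(f"${price:.2f}")  # (3 * 2.00 + 1 * 8.00) / 4 = 3.50
```

Under those assumptions the result matches the $3.50 listed here. A workload with a different input:output mix would see a different effective price.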

Intelligence Profile

IQ 57%* · EQ 92%* · Accuracy 66%* · Creativity (not scored) · Based (not scored)

Dimension Breakdown

IQ (4 benchmarks): 57.2%*

EQ (1 benchmark): 91.8%*

Accuracy (2 benchmarks): 65.8%*

Creativity (0 benchmarks): Insufficient data. No creativity benchmarks found.

Based (0 benchmarks): Insufficient data. No based benchmarks found.

* Low confidence — limited benchmark evidence for this dimension

3/5 dimensions scored · Last updated Apr 21, 2026

Benchmark Signals

Some fit rows have limited benchmark evidence.

9 of 12 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

actively scored

Use-Case Scores: 101

Total Measurements: 87

Weighted Measurements: 25

Weighted Sources: 15

Raw Source Coverage

duckdb_nsql_leaderboard 12 · mws_vision_bench 12 · languagebench 10 · baxbench_leaderboard 9 · sciarena_leaderboard 7 · browsecomp_plus_leaderboard 4

Weighted Source Coverage

languagebench 3 · languagebench_translation_official 3 · lexam_leaderboard 3 · vader_leaderboard 3 · aider_polyglot 2 · duckdb_nsql_leaderboard 2

Best Use Cases for This Model

Use Case (identifier): Score

Archaic and historical translation (use_case.history.archaic_translation): 26.2%

Legal translation (use_case.legal.legal_translation): 24.7%

Brand voice localization (use_case.mkt.brand_voice_localization): 20.9%

Historical document summarization (use_case.history.historical_doc_summarization): 20.6%

Verilog/VHDL generation (use_case.eda.verilog_generation): 19.6%

Metric definition workshop (use_case.data.metric_definition_workshop): 17.8%

Integration test generation (use_case.dev.integration_tests): 16.6%

Grammar and writing coach (use_case.lang.grammar_coach): 16.6%

Data quality assistant (use_case.data.data_quality_assistant): 16.3%

Translation and localization (use_case.business.translation_localization): 16.0%

Contract term extraction (use_case.legal.contract_term_extraction): 15.9%

Clause playbook check (use_case.legal.playbook_clause_check): 15.9%