BasedAGI

Model Profile

gpt-4o

External Benchmark Shadow · external_benchmark_shadow · public
4,096 ctx

Use this page to decide where this model is a strong fit. The rankings below are grouped by use case and backed by benchmarks, with explicit confidence and contributor metrics.

Identity

ID: external/openai/gpt-4o

Author: openai

Origin: external_benchmark_shadow

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 36.8%

Evidence points: 176

Raw rows: 221

Weighted rows: 39

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 0

Price / 1M tokens: $0.26 (blended 3:1)
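The "blended 3:1" label is not defined on this page. A common reading, assumed here, is a weighted average of input and output prices at three input tokens per output token. The sketch below shows that arithmetic; the function name `blended_price` and the example prices are hypothetical and are not taken from this profile.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for an input:output token mix.

    Assumption: "blended 3:1" means 3 parts input tokens to 1 part output tokens.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Hypothetical per-1M-token prices, chosen only to illustrate the formula.
example = blended_price(input_price=1.00, output_price=5.00)
print(f"${example:.2f} per 1M tokens")  # (3 * 1.00 + 1 * 5.00) / 4 = 2.00
```

Under this reading, the output price contributes a quarter of the blended figure, so a model with expensive output tokens can still show a low blended price.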

Intelligence Profile

IQ: 61%* · EQ: 87%* · Accuracy: 70%* · Creativity: 74%* · Based: 54%*

Dimension Breakdown

IQ (5 benchmarks): 60.6%*

EQ (4 benchmarks): 86.6%*

Accuracy (3 benchmarks): 69.6%*

Creativity (1 benchmark): 74.3%*

Based (2 benchmarks): 53.5%*

* Low confidence: limited benchmark evidence for this dimension

5/5 dimensions scored · Last updated Apr 21, 2026

Benchmark Signals


Coverage Diagnostics

actively scored

Use-Case Scores: 129

Total Measurements: 221

Weighted Measurements: 39

Weighted Sources: 21

Raw Source Coverage

mega_bench: 40
testeval_leaderboard: 18
mmlu_pro_leaderboard: 15
duckdb_nsql_leaderboard: 12
jsonschemabench_leaderboard: 12
medhelm_leaderboard: 12

Weighted Source Coverage

crmarena_leaderboard: 4
medhelm_leaderboard: 4
mega_bench: 4
sonar_java_quality: 4
lexam_leaderboard: 3
buildarena_readme: 2
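To put the coverage figures in proportion, the following sketch computes two ratios from numbers shown on this page. It assumes the number after each raw source is a row count and that only the largest sources are listed; the page states neither, and the variable names are illustrative.

```python
# Figures copied from this profile.
raw_rows = 221
weighted_rows = 39

# Assumption: each value is the number of raw rows contributed by that source.
raw_source_counts = {
    "mega_bench": 40,
    "testeval_leaderboard": 18,
    "mmlu_pro_leaderboard": 15,
    "duckdb_nsql_leaderboard": 12,
    "jsonschemabench_leaderboard": 12,
    "medhelm_leaderboard": 12,
}

# Share of raw measurements that end up carrying weight in the scores.
weighted_share = weighted_rows / raw_rows

# Share of raw measurements accounted for by the sources listed above.
listed_raw = sum(raw_source_counts.values())
listed_share = listed_raw / raw_rows

print(f"weighted share of raw rows: {weighted_share:.1%}")
print(f"listed sources cover {listed_raw} of {raw_rows} raw rows ({listed_share:.1%})")
```

A small weighted share is consistent with the low average confidence reported under Benchmark Coverage, although the page does not say how confidence is derived.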

Best Use Cases for This Model

Use Case (identifier): Score

Metric definition workshop (use_case.data.metric_definition_workshop): 30.3%
Data quality assistant (use_case.data.data_quality_assistant): 24.2%
Personalized sales outreach (use_case.mkt.sales_outreach_personalized): 23.8%
Ad copy variants (use_case.mkt.ad_copy_variants): 23.8%
SQL debugging (use_case.data.sql_debugging): 23.6%
Insight mining from text corpora (use_case.data.insight_mining): 22.8%
Executive brief from metrics (use_case.data.exec_brief_from_metrics): 21.8%
Social post generation (use_case.mkt.social_post_generation): 20.8%
Campaign brief (use_case.mkt.campaign_brief): 20.8%
Product positioning and messaging (use_case.mkt.product_positioning): 20.8%
Text-to-SQL analyst assistant (use_case.data.text_to_sql): 20.3%
Screenplay scene writing (use_case.creative.screenplay_scene): 19.7%