Model Profile

GPT-4.1-nano-2025-04-14

Rating: 1.1 (141 reviews)
Author: openai

External Benchmark Shadow · external_benchmark_shadow · public

4,096 ctx

Use this page to decide where this model is a strong fit. Rankings below are benchmark-backed by use case, with explicit confidence and contributor metrics.

Identity

ID: external/openai/gpt-4-1-nano-2025-04-14

Author: openai

Origin: external_benchmark_shadow

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 21.1%

Evidence points: 141

Raw rows: 375

Weighted rows: 26

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 0

Intelligence Profile

Dimension Breakdown

IQ: 39.2% (12 benchmarks)

EQ: insufficient data (0 benchmarks)

Accuracy: insufficient data (0 benchmarks)

Creativity: insufficient data (0 benchmarks)

Based: insufficient data (0 benchmarks)

1/5 dimensions scored · Last updated Apr 21, 2026

Benchmark Signals

Galileo Agent Leaderboard v2

Avg AC

3.1%

Normalized value 47.8% · confidence 100.0%

Strongest impact in Terraform generation

galileo_agent_v2.avg_ac · Apr 1, 2026

OpenVLM OCRBench Official

ocrbench_score_pct

2.5%

Normalized value 74.6% · confidence 100.0%

Strongest impact in Socratic tutor

openvlm_ocrbench_official.ocrbench_score_pct · Apr 1, 2026

OpenVLM MTVQA Official

mtvqa_score_pct

1.9%

Normalized value 74.0% · confidence 100.0%

Strongest impact in Socratic tutor

openvlm_mtvqa_official.mtvqa_score_pct · Apr 1, 2026

Vals Tax Eval v2

overall_accuracy_pct

1.3%

Normalized value 53.6% · confidence 100.0%

Strongest impact in Accounts payable invoice extraction (text)

vals_tax_eval_v2.overall_accuracy_pct · Mar 31, 2026

Galileo Agent Leaderboard v2

Banking AC

1.3%

Normalized value 59.2% · confidence 100.0%

Strongest impact in Accounts payable invoice extraction (text)

galileo_agent_v2.banking_ac · Apr 1, 2026

Vals Mortgage Tax

overall_accuracy_pct

1.1%

Normalized value 64.1% · confidence 100.0%

Strongest impact in Accounts payable invoice extraction (text)

vals_mortgage_tax.overall_accuracy_pct · Mar 31, 2026

Some fit rows have limited benchmark evidence.

10 of 12 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

actively scored

Use-Case Scores

122

Total Measurements

375

Weighted Measurements

Weighted Sources

Raw Source Coverage

vals_mmlu_pro 60, vals_mgsm 48, docvqa_leaderboard 34, galileo_agent_v2 34, corpfin_taxeval_public 28, vals_medqa 28

Weighted Source Coverage

galileo_agent_v2 10, vals_corp_fin_v2 3, bigcodebench_official 1, icelandic_llm_leaderboard 1, openvlm_chartqa_human_official 1, openvlm_mtvqa_official 1

Best Use Cases for This Model

Use Case	ID	Vertical	Score	Confidence	Evidence	Top Contributor
Accounts payable invoice extraction (text)	use_case.fin.ap_invoice_extraction	finance	11.0%	27.1%	14	Vals Tax Eval v2: overall_accuracy_pct
Thesis red teaming	use_case.fin.thesis_red_team	finance	9.9%	26.1%	14	Vals Tax Eval v2: overall_accuracy_pct
Terraform generation	use_case.sre.iac_terraform	devops_sre	9.1%	21.5%	9	Galileo Agent Leaderboard v2: Avg AC
Kubernetes manifest generation	use_case.sre.iac_k8s	devops_sre	9.1%	21.5%	9	Galileo Agent Leaderboard v2: Avg AC
Config debugging	use_case.sre.config_debugging	devops_sre	9.1%	21.5%	9	Galileo Agent Leaderboard v2: Avg AC
Earnings call synthesis	use_case.fin.earnings_call_synthesis	finance	8.9%	23.6%	14	Vals Tax Eval v2: overall_accuracy_pct
Simulation setup assistant	use_case.eng.simulation_setup_assistant	engineering	8.8%	19.6%	10	Galileo Agent Leaderboard v2: Avg AC
Quant research code generation	use_case.fin.alpha_research_codegen	finance	8.8%	20.8%	15	Vals Tax Eval v2: overall_accuracy_pct
Transaction anomaly narrative	use_case.fin.transaction_anomaly_narrative	finance	8.7%	23.1%	14	Vals Tax Eval v2: overall_accuracy_pct
Socratic tutor	use_case.edu.socratic_tutor	education	8.5%	15.2%	11	OpenVLM OCRBench Official: ocrbench_score_pct
Lesson plan generator	use_case.edu.lesson_plan_generator	education	8.5%	15.2%	11	OpenVLM OCRBench Official: ocrbench_score_pct
Job description drafting	use_case.hr.job_description_drafting	hr_recruiting	8.4%	17.8%	11	OpenVLM OCRBench Official: ocrbench_score_pct
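
The aggregates listed under Benchmark Coverage appear to be derivable from the table above. The following is a minimal Python sketch of that cross-check, not the site's actual scoring code. It assumes that "Avg confidence" is an unweighted mean of the per-row Confidence values and that "Evidence points" is a plain sum of the Evidence column; the page states neither. The 25% cutoff for "low confidence" is hypothetical, chosen only because it is consistent with the "10 of 12" note; the real rule may also involve contributor coverage.

```python
# Rows transcribed from the "Best Use Cases for This Model" table:
# (use-case ID, confidence in percent, evidence count).
ROWS = [
    ("use_case.fin.ap_invoice_extraction", 27.1, 14),
    ("use_case.fin.thesis_red_team", 26.1, 14),
    ("use_case.sre.iac_terraform", 21.5, 9),
    ("use_case.sre.iac_k8s", 21.5, 9),
    ("use_case.sre.config_debugging", 21.5, 9),
    ("use_case.fin.earnings_call_synthesis", 23.6, 14),
    ("use_case.eng.simulation_setup_assistant", 19.6, 10),
    ("use_case.fin.alpha_research_codegen", 20.8, 15),
    ("use_case.fin.transaction_anomaly_narrative", 23.1, 14),
    ("use_case.edu.socratic_tutor", 15.2, 11),
    ("use_case.edu.lesson_plan_generator", 15.2, 11),
    ("use_case.hr.job_description_drafting", 17.8, 11),
]

# Hypothetical cutoff: the page does not define "low confidence".
LOW_CONFIDENCE_CUTOFF = 25.0


def summarize(rows, cutoff=LOW_CONFIDENCE_CUTOFF):
    """Recompute page-level aggregates from the per-use-case rows."""
    confidences = [conf for _, conf, _ in rows]
    return {
        "scored_use_cases": len(rows),
        # Assumption: unweighted mean, rounded to one decimal place.
        "avg_confidence": round(sum(confidences) / len(confidences), 1),
        # Assumption: evidence points are summed across rows.
        "evidence_points": sum(evidence for _, _, evidence in rows),
        "low_confidence": sum(1 for conf in confidences if conf < cutoff),
    }


if __name__ == "__main__":
    for key, value in summarize(ROWS).items():
        print(f"{key}: {value}")
```

Under these assumptions the recomputed values agree with the page: 12 scored use cases, 21.1% average confidence, 141 evidence points, and 10 rows below the hypothetical cutoff.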