BasedAGI

Find the right model for any task

100+ use cases ranked by benchmark evidence, not opinions. Filter by domain, then drill into any use case to see which model ranks first and why.

Featured Use Cases

15 curated
Finance

Earnings call synthesis

Summarize earnings calls into key points, tone, and risks.

🥇 gemini-3.1-pro-preview
44.1%
89% quality
Creative

NPC dialogue

Low-latency in-character dialogue suitable for games.

🥇 gemini-3-pro-preview
44.2%
85% quality
Adult

Adult ERP roleplay (explicit)

Explicit adult roleplay with boundary adherence and persona memory.

🥇 Grok-4-0709
49.5%
85% quality
DevOps

Log triage

Interpret logs and propose safe diagnostic steps.

No stable winner yet
84% quality
Productivity

Knowledge base Q&A (with citations)

Answer questions grounded in an internal KB, with evidence.

No stable winner yet
84% quality
Productivity

Document summarization

Summarize long business documents into scannable outputs.

No stable winner yet
83% quality
Legal

Contract term extraction

Extract key terms into structured fields with clause references.

No stable winner yet
83% quality
CX

Support bot (RAG grounded)

Support chatbot grounded in docs with optional citations and escalation.

No stable winner yet
83% quality
Developer

Code generation

Generate correct, secure code from requirements.

No stable winner yet
82% quality
Marketing

Ad copy variants

Generate diverse headline/CTA variants under strict constraints.

No stable winner yet
82% quality
Creative

Long-form story co-author

Generate and refine long-form fiction with continuity.

No stable winner yet
82% quality
Developer

Debugging assistant

Localize bugs and propose fixes with explanations.

No stable winner yet
81% quality
Risk & Eval

Prompt injection resistance (eval)

Measure resistance to prompt injection in RAG and tool settings.

No stable winner yet
81% quality
Data

Text-to-SQL analyst assistant

Convert questions into SQL and explain the query.

No stable winner yet
81% quality
Healthcare

Clinical note drafting

Summarize encounters into structured notes for clinician review.

No stable winner yet
81% quality

All Use Cases

100 indexed · 19 high-confidence
Use Case | Best Score
Thesis red teaming

Stress-test an investment thesis with counterarguments and risk.

48.9%
Earnings call synthesis

Summarize earnings calls into key points, tone, and risks.

44.1%
Transaction anomaly narrative

Summarize anomalies into hypotheses, evidence, and follow-up actions.

43.3%
Casual chat companion

Engaging conversation with consistent tone and context.

50.1%
Life coaching and goal planning

Goal setting, habit planning, and accountability check-ins.

50.1%
Tarot-style reading

Symbolic, personalized readings with consistent persona.

50.1%
Mindfulness and meditation scripts

Generate calming scripts and exercises tailored to a user's context.

48.5%
Empathetic support chat

Supportive conversation with strong boundaries and safe escalation.

48.7%
Accounts payable invoice extraction (text)

Extract structured fields from invoices/receipts for AP workflows.

38.8%
AML alert triage

Triage AML alerts into severity, rationale, and next actions.

41.5%
KYC profile synthesis

Turn identity docs and notes into a structured KYC profile.

41.5%
Filings summarization (10-K/10-Q)

Summarize filings with conservative factuality and risk highlights.

38.5%
Adult erotica (long-form, explicit)

Long-form explicit erotica with controllable style and strict boundaries.

43.6%
Component selection assistant

Recommend components under constraints with evidence and tradeoffs.

35.5%
Quant research code generation

Generate backtest or analysis code from trading hypotheses.

34.2%
Interactive fiction / DM

Run interactive fiction with state tracking and user agency.

44.2%
NPC dialogue

Low-latency in-character dialogue suitable for games.

44.2%
SFW roleplay and simulation

Roleplay/simulations for learning or entertainment with state tracking.

45.9%
Adult ERP roleplay (explicit)

Explicit adult roleplay with boundary adherence and persona memory.

49.5%
Cross-paper contradiction analysis

Identify contradictions and uncertainty across papers with citations.

Provisional
Literature synthesis with citations

Synthesize papers and guidelines with citations and uncertainty.

Provisional
Knowledge base Q&A (fast, no citations)

Answer KB questions grounded in retrieved text without citations.

Provisional
Runbook step assistant

Suggest safe runbook steps and escalation points grounded in docs.

Provisional
Contract Drafting & Redlining

Drafting, reviewing, and suggesting edits to legal contracts and agreements.

Provisional
Litigation risk memo

Summarize a claim into litigation risk drivers and mitigation steps.

Provisional
Simulation setup assistant

Turn design requirements into simulation setup checklists and boundary notes.

Provisional
Log triage

Interpret logs and propose safe diagnostic steps.

Provisional
Contract Q&A (RAG grounded)

Answer contract questions grounded in the actual contract text.

Provisional
Knowledge base Q&A (with citations)

Answer questions grounded in an internal KB, with evidence.

Provisional
Regulatory summary

Summarize and compare regulatory text with conservative interpretation.

Provisional
HR policy Q&A

Answer HR policy questions grounded in authoritative text.

Provisional
Disinformation and manipulation resistance (eval)

Measure refusal and safe handling of deceptive content generation requests.

Provisional
Document summarization

Summarize long business documents into scannable outputs.

Provisional
Political risk brief

Summarize key developments into risks, scenarios, and actions.

Provisional
Agent-assist reply suggestions

Draft replies for human agents with tone and policy constraints.

Provisional
Decision memo

Recommend a decision with options, constraints, and risks.

Provisional
Executive briefing

Turn raw notes into a short executive brief with risks and actions.

Provisional
Search query rewriting

Rewrite queries into higher-recall search queries and filters.

Provisional
Contract redline summary

Summarize material changes between contract versions with clause refs.

Provisional
Support dialogue agent

Multi-turn support conversations with escalation and policy awareness.

Provisional
Clause playbook check

Check extracted terms against a playbook and flag deviations.

Provisional
Contract term extraction

Extract key terms into structured fields with clause references.

Provisional
Support bot (RAG grounded)

Support chatbot grounded in docs with optional citations and escalation.

Provisional
SQL debugging

Diagnose and fix SQL queries for correctness and performance.

Provisional
Policy wording comparison

Compare policy wording against a standard and flag material differences.

Provisional
Operator support chat

Real-time operator assistant with grounded troubleshooting and escalation.

Provisional
Maintenance RCA memo

Turn logs and notes into a maintenance root cause analysis.

Provisional
Manuals Q&A (RAG grounded)

Answer operator questions grounded in technical manuals and runbooks.

Provisional
Social listening brief

Summarize social chatter into themes, risks, and recommendations.

Provisional
Codebase onboarding brief

Summarize a repository's architecture, modules, and conventions.

Provisional
Patient education bot (RAG grounded)

Answer patient FAQ using trusted sources with cautious wording.

Provisional
Disruption monitoring brief

Summarize disruptions into risk, options, and recommendations.

Provisional
Supplier risk monitoring

Track supplier risk signals from multi-source text and summarize actions.

Provisional
Narrative tracking

Track narratives across multi-lingual sources and flag contradictions.

Provisional
Campaign brief

Draft a campaign brief with positioning, audience, and channels.

Provisional
Product positioning and messaging

Develop positioning, value props, and message pillars with tradeoffs.

Provisional
Social post generation

Generate short channel-specific social posts and variations.

Provisional
Landing page copy

Draft landing pages with clear positioning and structure.

Provisional
Autonomous Coding Agent

End-to-end autonomous software engineering: reading issues, writing code, running tests, submitting PRs.

Provisional
Language conversation partner

Conversational practice with gentle corrections and explanations.

Provisional
Medical coding support (suggestions)

Extract coding-relevant facts and suggest codes for human review.

Provisional
Poetry and lyrics

Generate poems and lyrics with style control and variation.

Provisional
Screenplay scene writing

Write screenplay scenes with formatting, pacing, and strong dialogue.

Provisional
Code generation

Generate correct, secure code from requirements.

Provisional
CAD scripting helper

Generate and debug CAD automation scripts and parametric geometry code.

Provisional
PR crisis response draft

Draft a conservative public statement and internal talking points.

Provisional
Customer feedback theme mining

Extract themes and trends from reviews, tickets, and surveys.

Provisional
Title document search assistant (RAG grounded)

Navigate and answer questions across a corpus of property documents.

Provisional
Config debugging

Diagnose and patch YAML/JSON/TOML configs with minimal diffs.

Provisional
Kubernetes manifest generation

Generate K8s manifests with safe defaults and probes.

Provisional
Terraform generation

Generate Terraform IaC with correct resources and safe defaults.

Provisional
Metric definition workshop

Turn ambiguous KPI definitions into precise, measurable specs.

Provisional
Archaic and historical translation

Translate older or domain-specific language into modern equivalents.

Provisional
Refactoring assistant

Refactor code safely while preserving behavior and improving clarity.

Provisional
Ad copy variants

Generate diverse headline/CTA variants under strict constraints.

Provisional
Personalized sales outreach

Draft outbound emails/DMs personalized to a prospect persona.

Provisional
Dashboard narratives

Generate weekly KPI narratives and investigation suggestions.

Provisional
Grammar and writing coach

Correct grammar and explain fixes at the learner's level.

Provisional
Security incident triage

Triage security incidents from alerts/logs into impact and next steps.

Provisional
Support FAQ bot

Answer common support questions with safe troubleshooting steps.

Provisional
IDE code completion

Fast local-context code completion and small snippet generation.

Provisional
Vendor contract summary (procurement)

Summarize vendor contracts into key terms, risks, and deviations.

Provisional
Long-form story co-author

Generate and refine long-form fiction with continuity.

Provisional
Verilog/VHDL generation

Generate RTL code and testbenches from functional specs.

Provisional
Fraud signal summary

Summarize potential fraud indicators with conservative evidence framing.

Provisional
Malware analysis report (defensive)

Explain suspicious code and produce a defensive analysis report.

Provisional
PR review agent

Review diffs for correctness, security, and maintainability.

Provisional
Crisis escalation protocol (eval)

Measure safe crisis escalation behavior under the selected policy.

Provisional
Jailbreak resistance (eval)

Measure robustness to adversarial prompts that attempt to bypass policy.

Provisional
Overrefusal (eval)

Measure how often benign requests are incorrectly refused.

Provisional
Refusal profile (eval)

Measure refusal/overrefusal rates across predefined categories.

Provisional
Scam and social engineering resistance (eval)

Measure refusal and safe handling of deception/scam requests.

Provisional
Agentic bug fixing

Agentic loop that reproduces, fixes, and validates with tests.

Provisional
Debugging assistant

Localize bugs and propose fixes with explanations.

Provisional
Cross-lingual summary

Summarize a document in one language into another language.

Provisional
Prompt injection resistance (eval)

Measure resistance to prompt injection in RAG and tool settings.

Provisional
Spam filtering and classification

Detect spam and low-quality messages for routing and moderation.

Provisional
Toxicity moderation routing

Classify abusive content for moderation and escalation.

Provisional
Legal translation

Translate legal text with terminology consistency and format safety.

Provisional
Agentic incident response

Agentic tool-using workflow for incident triage and remediation planning.

Provisional