Mindfulness and meditation scripts
Generate calming scripts and exercises tailored to a user's context.
Best for this use case
gemini-3-pro-preview
Strong on BFCL Memory Official: Memory Acc and BFCL Multi-turn Official: Multi Turn Acc
48.5%
Best benchmark score
58.5%
Confidence
All ranked models - top 3
Ranked Models: 30
Evidence Quality: 88%
Evidence Points: 24
Top Signal: BFCL Memory Official: Memory Acc
All Ranked Models
| Rank | Model | Strong on | Score |
|---|---|---|---|
| 🥇 | gemini-3-pro-preview | BFCL Memory Official: Memory Acc and BFCL Multi-turn Official: Multi Turn Acc | 48.5% |
| 🥈 | Grok-4-0709 | BFCL Memory Official: Memory Acc and BFCL Relevance Detection Official: Relevance Detection | 48.1% |
| 🥉 | grok-4-1-fast-reasoning | BFCL Memory Official: Memory Acc and BFCL Multi-turn Official: Multi Turn Acc | 44.7% |
| #4 | GLM-4.6 | BFCL Memory Official: Memory Acc and BFCL Multi-turn Official: Multi Turn Acc | 41.9% |
| #5 | o3-20250416 | BFCL Memory Official: Memory Acc and BFCL Relevance Detection Official: Relevance Detection | 40.5% |
| #6 | gpt-4.1-20250414 | BFCL Relevance Detection Official: Relevance Detection and BFCL Memory Official: Memory Acc | 35.4% |
| #7 | Kimi-K2-Instruct | BFCL Multi-turn Official: Multi Turn Acc and BFCL Memory Official: Memory Acc | 35.1% |
| #8 | o4-mini | BFCL Memory Official: Memory Acc and BFCL Relevance Detection Official: Relevance Detection | 33.5% |
| #9 | grok-4-1-fast-non-reasoning | BFCL Relevance Detection Official: Relevance Detection and BFCL Multi-turn Official: Multi Turn Acc | 32.8% |
| #10 | gpt-5.2-2025-12-11 | BFCL Relevance Detection Official: Relevance Detection and BFCL Multi-turn Official: Multi Turn Acc | 31.8% |
| #12 | gemini-2.5-flash | BFCL Memory Official: Memory Acc and BFCL Relevance Detection Official: Relevance Detection | 30.1% |
| #18 | claude-opus-4-5-20251101 | BFCL Relevance Detection Official: Relevance Detection and BFCL Relevance Detection Official: Irrelevance Detection | 25.9% |
| #24 | Arch-Agent-32B | BFCL Multi-turn Official: Multi Turn Acc and BFCL Relevance Detection Official: Relevance Detection | 23.0% |
| #28 | Llama 3.3 70B Instruct | BFCL Relevance Detection Official: Relevance Detection and BFCL Multi-turn Official: Multi Turn Acc | 21.6% |
| #44 | gpt-5-2025-08-07 | UGI Leaderboard: Writing ✍️ and UGI Leaderboard: Entertainment | 18.0% |
| #46 | Llama-4-Scout-17B-16E-Instruct | BFCL Relevance Detection Official: Relevance Detection and BFCL Memory Official: Memory Acc | 17.9% |
| #50 | gemini-2.5-pro | UGI Leaderboard: Writing ✍️ and UGI Leaderboard: Entertainment | 17.4% |
| #52 | claude-sonnet-4 | UGI Leaderboard: Writing ✍️ and Galileo Agent Leaderboard v2: Avg AC | 17.2% |
| #54 | gemini-2.5-flash-lite | BFCL Relevance Detection Official: Relevance Detection and BFCL Relevance Detection Official: Irrelevance Detection | 17.2% |
| #55 | gemini-3.1-pro-preview | UGI Leaderboard: Writing ✍️ and UGI Leaderboard: Entertainment | 17.0% |
| #61 | Arch-Agent-3B | BFCL Relevance Detection Official: Relevance Detection and BFCL Multi-turn Official: Multi Turn Acc | 16.5% |
| #62 | Arch-Agent-1.5B | BFCL Relevance Detection Official: Relevance Detection and BFCL Multi-turn Official: Multi Turn Acc | 16.1% |
| #66 | gpt-5.4-2026-03-05 | UGI Leaderboard: Writing ✍️ and UGI Leaderboard: Entertainment | 14.6% |
| #68 | claude-sonnet-4.6 | UGI Leaderboard: Writing ✍️ and UGI Leaderboard: Entertainment | 14.3% |
| #71 | gemini-3-flash-preview | UGI Leaderboard: Writing ✍️ and UGI Leaderboard: Entertainment | 14.2% |
| #75 | kimi-k2.5-thinking | UGI Leaderboard: Entertainment and UGI Leaderboard: Writing ✍️ | 13.6% |
| #83 | gpt-5.1-2025-11-13 | UGI Leaderboard: Writing ✍️ and UGI Leaderboard: Entertainment | 12.7% |
| #85 | qwen-2.5-72b-instruct | EQ-Bench Leaderboard: judgemark_score and Galileo Agent Leaderboard v2: Avg AC | 12.4% |
| #91 | grok-4-fast-reasoning | UGI Leaderboard: Writing ✍️ and UGI Leaderboard: Entertainment | 12.0% |
| #95 | Kimi K2 Thinking | UGI Leaderboard: Writing ✍️ and UGI Leaderboard: Entertainment | 11.4% |
Ranking diagnostics & missing models
Source lift
Ranked: 45
Sources: 8
Quality: Good
UGI Leaderboard
Vals Legal Bench
Vals LiveCodeBench
Vals MedQA
Missing frontier models
No obvious gaps right now.
Taxonomy & task details
Core tasks
Required modes
Domains
Related in Companion
Casual chat companion
Engaging conversation with consistent tone and context.
Life coaching and goal planning
Goal setting, habit planning, and accountability check-ins.
Tarot-style reading
Symbolic, personalized readings with consistent persona.
Empathetic support chat
Supportive conversation with strong boundaries and safe escalation.