Language Model Leaderboard

Compare 50 leading LLMs across performance, cost, and capabilities

| Rank | Model | Model ID | Provider | Score 1 | Score 2 | Context | Price | Capabilities |
|---|---|---|---|---|---|---|---|---|
| 1 | GPT-5.2 | gpt-5.2 | OpenAI | 92.8% | 95.1% | 200K | $20.00 / $80.00 | Advanced Reasoning, Vision, Code Generation, +2 more |
| 2 | OpenAI o1-preview | o1-preview | OpenAI | 92.3% | - | 128K | $15.00 / $60.00 | Advanced Reasoning, Chain of Thought, Science, +1 more |
| 3 | Gemini 3 Pro | gemini-3-pro | Google | 91.5% | 93.8% | 1000K | $15.00 / $60.00 | Deep Think, Vision, Audio, +2 more |
| 4 | GPT-5.1 | gpt-5.1 | OpenAI | 91.2% | 94.3% | 200K | $18.00 / $72.00 | Adaptive Reasoning, Vision, Code Generation, +1 more |
| 5 | GPT-5 | gpt-5 | OpenAI | 90.4% | 93.5% | 200K | $15.00 / $60.00 | Advanced Reasoning, Vision, Code Generation, +1 more |
| 6 | Claude 3.5 Sonnet | claude-3-5-sonnet-20241022 | Anthropic | 90.4% | 92.0% | 200K | $3.00 / $15.00 | Long Context, Vision, Code Generation, +2 more |
| 7 | Gemini 2.5 Pro | gemini-2.5-pro | Google | 89.8% | 91.5% | 1000K | $10.00 / $40.00 | Deep Think, Vision, Audio, +2 more |
| 8 | Grok 4.1 | grok-4.1 | xAI | 88.9% | 90.7% | 131K | $5.00 / $15.00 | Real-time Data, Vision, Less Filtered, +1 more |
| 9 | Claude Opus 4.5 | claude-opus-4-5 | Anthropic | 88.7% | 92.0% | 200K | $15.00 / $75.00 | Long Context, Vision, Code Generation, +3 more |
| 10 | Llama 3.1 405B | llama-3.1-405b-instruct | Meta | 88.6% | 89.0% | 128K | $2.70 / $2.70 | Open Source, Long Context, Multilingual, +1 more |
| 11 | Claude Sonnet 4.5 | claude-sonnet-4-5 | Anthropic | 88.3% | 93.7% | 1000K | $3.00 / $15.00 | Long Context, Vision, Code Generation, +2 more |
| 12 | Qwen 3 72B | qwen-3-72b-instruct | Alibaba | 87.8% | 89.2% | 131K | $0.90 / $0.90 | Open Source, Multilingual, Code, +2 more |
| 13 | Grok 2 | grok-2-1212 | xAI | 87.5% | 88.4% | 131K | $2.00 / $10.00 | Real-time Data, Vision, Less Filtered |
| 14 | GPT-4o | gpt-4o-2024-11-20 | OpenAI | 87.2% | 90.2% | 128K | $2.50 / $10.00 | Vision, Audio, Fast, +1 more |
| 15 | Claude 3 Opus | claude-3-opus-20240229 | Anthropic | 86.8% | 84.9% | 200K | $15.00 / $75.00 | Long Context, Vision, Analysis, +1 more |
| 16 | GPT-4 Turbo | gpt-4-turbo-2024-04-09 | OpenAI | 86.4% | 87.2% | 128K | $10.00 / $30.00 | Vision, JSON Mode, Function Calling, +1 more |
| 17 | Llama 3.1 70B | llama-3.1-70b-instruct | Meta | 86.0% | 80.5% | 128K | $0.88 / $0.88 | Open Source, Long Context, Efficient, +1 more |
| 18 | Llama 3.3 70B | llama-3.3-70b-instruct | Meta | 86.0% | 88.4% | 128K | $0.88 / $0.88 | Open Source, Long Context, Multilingual, +1 more |
| 19 | Gemini 1.5 Pro | gemini-1.5-pro | Google | 85.9% | 84.1% | 2000K | $1.25 / $5.00 | Extreme Long Context, Vision, Audio, +1 more |
| 20 | DeepSeek V3.2 | deepseek-v3.2 | DeepSeek | 85.7% | 95.8% | 128K | $0.60 / $2.40 | Reasoning, Code, Math, +2 more |
| 21 | Gemini 3 Flash | gemini-3-flash | Google | 85.4% | 87.2% | 1000K | $0.15 / $0.60 | Ultra Fast, Long Context, Multimodal, +1 more |
| 22 | Qwen 2.5 72B | qwen-2.5-72b-instruct | Alibaba | 85.3% | 86.0% | 131K | $0.90 / $0.90 | Open Source, Multilingual, Code, +1 more |
| 23 | OpenAI o1-mini | o1-mini | OpenAI | 85.2% | 94.6% | 128K | $3.00 / $12.00 | Reasoning, Code, Math, +1 more |
| 24 | Qwen2.5-Coder-32B | qwen-2.5-coder-32b-instruct | Alibaba | 85.0% | 92.0% | 131K | $0.90 / $0.90 | Code Generation, Open Source, Code Completion, +1 more |
| 25 | Phi-4 | phi-4 | Microsoft | 84.8% | 82.6% | 16K | $0.10 / $0.10 | Small Model, Math, Reasoning, +1 more |
| 26 | Grok 4 Fast | grok-4-fast | xAI | 84.2% | 86.4% | 2000K | $2.00 / $8.00 | Extreme Long Context, Real-time Data, Fast, +1 more |
| 27 | Qwen2 72B | qwen-2-72b-instruct | Alibaba | 84.2% | 86.0% | 131K | $0.90 / $0.90 | Open Source, Multilingual, Code, +2 more |
| 28 | Mistral Large 2 | mistral-large-2407 | Mistral | 84.0% | 92.0% | 128K | $3.00 / $9.00 | Function Calling, JSON Mode, Multilingual, +1 more |
| 29 | GPT-4o mini | gpt-4o-mini-2024-07-18 | OpenAI | 82.0% | 87.2% | 128K | $0.15 / $0.60 | Multimodal, Fast, Affordable, +1 more |
| 30 | Nemotron 4 340B | nemotron-4-340b-instruct | NVIDIA | 81.0% | 73.0% | 4K | $0.00 / $0.00 | Open Source, RLHF Optimized, Free |
| 31 | Jamba 1.5 Large | jamba-1.5-large | AI21 Labs | 80.3% | 68.2% | 256K | $2.00 / $8.00 | Extremely Long Context, Hybrid Architecture, Multilingual |
| 32 | DeepSeek R1 | deepseek-r1 | DeepSeek | 79.8% | 96.3% | 64K | $0.55 / $2.19 | Reasoning, Code, Math, +1 more |
| 33 | DeepSeek Coder V2 | deepseek-coder-v2-236b | DeepSeek | 79.2% | 90.2% | 128K | $0.28 / $0.42 | Code Generation, MoE Architecture, 128K Context, +1 more |
| 34 | Reka Core | reka-core-20240501 | Reka AI | 78.8% | 74.8% | 128K | $3.00 / $15.00 | Multimodal, Vision, Audio, +1 more |
| 35 | Amazon Nova Pro | amazon-nova-pro-v1 | Amazon | 78.7% | 84.0% | 300K | $0.80 / $3.20 | Long Context, Vision, AWS Native, +1 more |
| 36 | Claude 4.5 Haiku | claude-haiku-4-5 | Anthropic | 78.5% | 82.3% | 200K | $0.80 / $4.00 | Fast, Long Context, Vision, +1 more |
| 37 | Inflection 2.5 (Pi) | inflection-2.5 | Inflection | 78.0% | - | 33K | $0.00 / $0.00 | Conversational, Empathetic, Free |
| 38 | Mixtral 8x22B | mixtral-8x22b-instruct | Mistral | 77.7% | 75.0% | 64K | $2.00 / $6.00 | Open Source, MoE Architecture, Multilingual, +1 more |
| 39 | Yi Large | yi-large | 01.AI | 76.3% | 77.9% | 33K | $3.00 / $3.00 | Bilingual, Code, Long Context |
| 40 | Mistral Medium | mistral-medium-2312 | Mistral | 75.3% | 76.0% | 32K | $2.70 / $8.10 | Function Calling, JSON Mode, Multilingual |
| 41 | Claude 4 Haiku | claude-haiku-4-20250514 | Anthropic | 75.2% | 75.9% | 200K | $0.80 / $4.00 | Fast, Long Context, Vision, +1 more |
| 42 | Command R+ | command-r-plus | Cohere | 75.0% | 70.0% | 128K | $3.00 / $15.00 | RAG Optimized, Tool Use, Multilingual, +1 more |
| 43 | Palmyra X 004 | palmyra-x-004 | Writer | 75.0% | - | 128K | $2.50 / $10.00 | Enterprise, Graph RAG, Knowledge Graphs |
| 44 | DBRX Instruct | dbrx-instruct | Databricks | 73.7% | 70.8% | 33K | $0.75 / $2.25 | Open Source, MoE Architecture, Enterprise |
| 45 | Llama 3.1 8B | llama-3.1-8b-instruct | Meta | 73.0% | 72.6% | 128K | $0.05 / $0.08 | Open Source, Efficient, Long Context, +1 more |
| 46 | Sonar Large Online | sonar-large-32k-online | Perplexity | 72.0% | - | 33K | $1.00 / $1.00 | Real-time Search, Citations, Web Access |
| 47 | Gemini 2.0 Flash | gemini-flash-2.0 | Google | 71.9% | 74.4% | 1049K | $0.10 / $0.40 | Ultra Fast, Long Context, Multimodal, +1 more |
| 48 | Gemini 2.0 Flash Thinking | gemini-2.0-flash-thinking-exp | Google | 71.9% | 74.4% | 1000K | $0.10 / $0.40 | Reasoning, Thinking Mode, Long Context, +2 more |
| 49 | GPT-3.5 Turbo | gpt-3.5-turbo-0125 | OpenAI | 70.0% | 76.8% | 16K | $0.50 / $1.50 | Fast, Affordable, Function Calling |
| 50 | Codestral | codestral-22b | Mistral | 70.0% | 81.1% | 256K | $1.00 / $3.00 | Code Generation, 80+ Languages, Largest Context for Coding, +1 more |
Showing 50 of 50 models
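
To compare models on cost, the two-part price of each entry (for example `$3.00 / $15.00`) can be turned into an estimate for a given workload. The sketch below is illustrative only. It assumes the first figure is the input price and the second the output price, both in USD per 1,000,000 tokens; the leaderboard does not state the unit, so treat that as an assumption. The helper names `parse_price` and `estimate_cost` are hypothetical.

```python
import re

# Matches a price pair such as "$3.00 / $15.00" anywhere in a string.
PRICE_RE = re.compile(r"\$(\d+(?:\.\d+)?)\s*/\s*\$(\d+(?:\.\d+)?)")


def parse_price(cell: str) -> tuple[float, float]:
    """Return the two dollar figures found in a leaderboard price string."""
    match = PRICE_RE.search(cell)
    if match is None:
        raise ValueError(f"no price pair found in {cell!r}")
    return float(match.group(1)), float(match.group(2))


def estimate_cost(
    cell: str,
    input_tokens: int,
    output_tokens: int,
    tokens_per_unit: int = 1_000_000,  # ASSUMPTION: prices are per 1M tokens
) -> float:
    """Estimate the USD cost of a workload.

    ASSUMPTION: the first figure is the input price and the second the
    output price; the source page does not label them.
    """
    input_price, output_price = parse_price(cell)
    total = input_tokens * input_price + output_tokens * output_price
    return total / tokens_per_unit


if __name__ == "__main__":
    # Price string copied from the Claude 3.5 Sonnet entry.
    cost = estimate_cost("$3.00 / $15.00", input_tokens=2_000_000, output_tokens=1_000_000)
    print(f"Estimated cost: ${cost:.2f}")
```

Because the pattern is located with a search rather than a full match, the same helper also accepts a longer string that merely contains the price pair.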