DruxAI

Live Leaderboard

AI Model Rankings

Ranked by Intelligence Index — a composite quality score across reasoning, coding, math, and instruction following. Updated daily.

Last updated: 2026-05-24 · 100 models

# | Model | Intelligence
1 | GPT-5.5 (xhigh) | 60.2
2 | GPT-5.5 (high) | 58.9
3 | Claude Opus 4.7 (Adaptive Reasoning, Max Effort) | 57.3
4 | Gemini 3.1 Pro Preview | 57.2
5 | GPT-5.4 (xhigh) | 56.8
6 | GPT-5.5 (medium) | 56.7
7 | Qwen3.7 Max | 56.6
8 | Gemini 3.5 Flash (high) | 55.3
9 | Kimi K2.6 | 53.9
10 | MiMo-V2.5-Pro | 53.8
11 | GPT-5.3 Codex (xhigh) | 53.6
12 | Grok 4.3 (high) | 53.2
13 | Claude Opus 4.6 (Adaptive Reasoning, Max Effort) | 52.9
14 | Muse Spark | 52.2
15 | Claude Opus 4.7 (Non-reasoning, High Effort) | 51.8
16 | Qwen3.6 Max Preview | 51.8
17 | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) | 51.7
18 | DeepSeek V4 Pro (Reasoning, Max Effort) | 51.5
19 | GLM-5.1 (Reasoning) | 51.4
20 | GPT-5.2 (xhigh) | 51.3
21 | GPT-5.5 (low) | 50.8
22 | Qwen3.6 Plus | 50
23 | DeepSeek V4 Pro (Reasoning, High Effort) | 49.8
24 | GLM-5 (Reasoning) | 49.8
25 | Claude Opus 4.5 (Reasoning) | 49.7
26 | MiniMax-M2.7 | 49.6
27 | Grok 4.20 0309 v2 (Reasoning) | 49.3
28 | MiMo-V2-Pro | 49.2
29 | MiMo-V2.5 | 49
30 | GPT-5.2 Codex (xhigh) | 49
31 | GPT-5.4 mini (xhigh) | 48.9
32 | Grok 4.3 (medium) | 48.8
33 | Grok 4.20 0309 (Reasoning) | 48.5
34 | Gemini 3 Pro Preview (high) | 48.4
35 | GPT-5.4 (low) | 47.9
36 | GPT-5.1 (high) | 47.7
37 | GLM-5-Turbo | 46.8
38 | Kimi K2.5 (Reasoning) | 46.8
39 | GPT-5.2 (medium) | 46.6
40 | DeepSeek V4 Flash (Reasoning, Max Effort) | 46.5
41 | Claude Opus 4.6 (Non-reasoning, High Effort) | 46.5
42 | Gemini 3 Flash Preview (Reasoning) | 46.4
43 | DeepSeek V4 Flash (Reasoning, High Effort) | 46
44 | Qwen3.6 27B (Reasoning) | 45.8
45 | Qwen3.5 397B A17B (Reasoning) | 45
46 | MiMo-V2-Omni-0327 | 44.9
47 | GPT-5 (high) | 44.6
48 | GPT-5 Codex (high) | 44.6
49 | Claude Sonnet 4.6 (Non-reasoning, High Effort) | 44.4
50 | GPT-5.4 nano (xhigh) | 44
51 | Grok 4.3 (low) | 43.9
52 | KAT Coder Pro V2 | 43.8
53 | GLM-5.1 (Non-reasoning) | 43.8
54 | Qwen3.6 35B A3B (Reasoning) | 43.5
55 | MiMo-V2-Omni | 43.4
56 | Gemini 3.5 Flash (minimal) | 43.3
57 | GPT-5.1 Codex (high) | 43.1
58 | Claude Opus 4.5 (Non-reasoning) | 43.1
59 | Claude 4.5 Sonnet (Reasoning) | 43
60 | Kimi K2.6 (Non-reasoning) | 42.9
61 | GLM 5V Turbo (Reasoning) | 42.9
62 | Claude Sonnet 4.6 (Non-reasoning, Low Effort) | 42.6
63 | GLM-4.7 (Reasoning) | 42.1
64 | Qwen3.5 27B (Reasoning) | 42.1
65 | GPT-5 (medium) | 42
66 | Claude 4.1 Opus (Reasoning) | 42
67 | Hy3-preview (Reasoning) | 41.9
68 | MiniMax-M2.5 | 41.9
69 | DeepSeek V3.2 (Reasoning) | 41.7
70 | Qwen3.5 122B A10B (Reasoning) | 41.6
71 | MiMo-V2-Flash (Feb 2026) | 41.5
72 | Grok 4 | 41.5
73 | Gemini 3 Pro Preview (low) | 41.3
74 | GPT-5 mini (high) | 41.2
75 | GPT-5.5 (Non-reasoning) | 40.9
76 | Kimi K2 Thinking | 40.9
77 | o3-pro | 40.7
78 | GLM-5 (Non-reasoning) | 40.6
79 | Qwen3.5 397B A17B (Non-reasoning) | 40.1
80 | Qwen3 Max Thinking | 39.8
81 | MiniMax-M2.1 | 39.4
82 | DeepSeek V4 Pro (Non-reasoning) | 39.3
83 | Gemma 4 31B (Reasoning) | 39.2
84 | Mistral Medium 3.5 | 39.2
85 | GPT-5 (low) | 39.2
86 | MiMo-V2-Flash (Reasoning) | 39.2
87 | Claude 4 Opus (Reasoning) | 39
88 | GPT-5 mini (medium) | 38.9
89 | Claude 4 Sonnet (Reasoning) | 38.7
90 | Grok 4.1 Fast (Reasoning) | 38.6
91 | Qwen3.5 Omni Plus | 38.6
92 | GPT-5.1 Codex mini (high) | 38.6
93 | Step 3.5 Flash 2603 | 38.5
94 | Ring-2.6-1T | 38.5
95 | o3 | 38.4
96 | GPT-5.4 nano (medium) | 38.1
97 | Step 3.5 Flash | 37.8
98 | GPT-5.4 mini (medium) | 37.7
99 | Kimi K2.5 (Non-reasoning) | 37.3
100 | Command A+ | 37.2
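
The ordering in the table can be reproduced with a simple sort on the score column. The sketch below is illustrative only: `Entry` and `rank` are hypothetical names, and the tie-breaking rule (tied scores keep their input order and still receive consecutive ranks) is an assumption. The page does not say how ties such as the two 51.8 rows are actually resolved; the underlying unrounded scores may differ.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One leaderboard row (hypothetical structure, not the site's schema)."""
    model: str
    score: float


def rank(entries: list[Entry]) -> list[tuple[int, str, float]]:
    """Sort by score, highest first, and number the rows from 1.

    Python's sort is stable, so entries with equal scores keep their
    input order. That tie-breaking rule is an assumption.
    """
    ordered = sorted(entries, key=lambda e: e.score, reverse=True)
    return [(i, e.model, e.score) for i, e in enumerate(ordered, start=1)]


# Three rows taken from the table above, deliberately out of order.
sample = [
    Entry("Claude Opus 4.7 (Non-reasoning, High Effort)", 51.8),
    Entry("Muse Spark", 52.2),
    Entry("Qwen3.6 Max Preview", 51.8),
]

for position, model, score in rank(sample):
    print(position, model, score)
```

With this rule the two models tied at 51.8 appear in the order they were supplied, below Muse Spark.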

About this ranking

Intelligence Index is a composite quality score aggregating performance across reasoning, coding, math, and instruction-following benchmarks into a single comparable number. Pricing and speed are measured by running models at scale under controlled conditions.
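
To make the idea of a composite score concrete, here is a minimal sketch that averages four category scores into one number. It is not Artificial Analysis's method: the equal weighting, the category keys, the rounding to one decimal, and the function name `composite_score` are all assumptions for illustration, and the input numbers are made up.

```python
from statistics import fmean

# Category names follow the description above; the keys themselves are hypothetical.
CATEGORIES = ("reasoning", "coding", "math", "instruction_following")


def composite_score(scores: dict[str, float]) -> float:
    """Combine per-category scores into one number.

    Assumes an unweighted mean rounded to one decimal. The real
    Intelligence Index weighting is not stated on this page.
    """
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"missing category scores: {missing}")
    return round(fmean(scores[c] for c in CATEGORIES), 1)


# Made-up category scores for a hypothetical model.
example = {
    "reasoning": 60.0,
    "coding": 50.0,
    "math": 70.0,
    "instruction_following": 40.0,
}
print(composite_score(example))  # 55.0
```

Collapsing several benchmarks into one number makes models easy to compare at a glance, at the cost of hiding which category drives a given score.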

Data source: Artificial Analysis — independent AI benchmarking. Intelligence Index, benchmark scores, pricing, and speed data are © Artificial Analysis. DruxAI fetches this data daily via the Artificial Analysis API and does not modify or influence the scores.

Model and creator identifiers are used in accordance with the Artificial Analysis API guidelines.