Accuracy scores on the test set (3,040 examples) of
MATH-Vision.
🚨 To submit your results to the leaderboard, please send them to this
email, with the results in this
format.
(The format template is on GitHub. If the download fails, please check your network.)
# | Model | Source | Date | ALL | Alg | AnaG | Ari | CombG | Comb | Cnt | DescG | GrphT | Log | Angle | Area | Len | SolG | Stat | Topo | TransG |
0 | Human | Link | 2024-04-05 | 68.82 | 55.1 | 78.6 | 99.6 | 98.4 | 43.5 | 98.5 | 91.3 | 62.2 | 61.3 | 33.5 | 47.2 | 73.5 | 87.3 | 93.1 | 99.8 | 69.0 |
1 | Gemini 2.5 Pro 🥇 | Link | 2025-03-23 | 73.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2 | Seed1.5-VL 🥈 | Link | 2025-05-12 | 68.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
3 | OpenAI o1 🥉 | Link | 2025-04-10 | 60.30 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
4 | Step R1-V-Mini | Link | 2025-04-05 | 56.6 | 58.0 | 64.3 | 62.9 | 43.2 | 53.6 | 28.4 | 33.7 | 34.4 | 56.3 | 66.5 | 65.8 | 69.3 | 53.3 | 58.6 | 30.4 | 46.4 |
5 | SenseNova V6 Reasoner | Link | 2025-04-10 | 55.39 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
6 | Kimi k1.6 Preview | Link | 2025-03-08 | 53.29 | 63.19 | 54.76 | 66.43 | 37.34 | 51.79 | 35.82 | 22.12 | 34.44 | 59.66 | 57.23 | 57.80 | 67.04 | 47.95 | 55.17 | 17.39 | 41.67 |
7 | Skywork-R1V2-38B | Link | 2025-04-28 | 49.7 | 52.6 | 47.4 | 73.7 | 42.1 | 52.6 | 36.8 | 15.8 | 57.9 | 73.7 | 63.2 | 73.7 | 57.9 | 47.4 | 47.4 | 21.1 | 31.6 |
8 | Doubao-1.5-pro | Link | 2025-02-28 | 48.62 | 55.07 | 52.38 | 63.57 | 34.74 | 36.90 | 43.28 | 25.00 | 27.78 | 37.82 | 62.43 | 55.40 | 59.69 | 43.85 | 55.17 | 26.09 | 37.50 |
9 | GPT-4.5 | Link | 2025-04-10 | 47.30 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
10 | VL-Rethinker-72B | Link | 2025-03-25 | 44.93 | 49.0 | 48.8 | 59.3 | 35.4 | 33.9 | 22.4 | 24.0 | 32.2 | 42.9 | 56.1 | 50.0 | 52.8 | 41.0 | 65.5 | 30.4 | 34.5 |
11 | INFRL-Qwen2.5-VL-72B-Preview | Link | 2025-03-25 | 42.73 | 49.3 | 42.9 | 59.3 | 31.8 | 32.7 | 32.8 | 22.1 | 27.8 | 41.2 | 54.3 | 47.6 | 48.1 | 38.9 | 56.9 | 26.1 | 33.3 |
12 | Gemini-2 Flash | Link | 2025-02-05 | 41.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
13 | Kimi k1.5 | Link | 2025-01-22 | 38.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
14 | Qwen2.5-VL-72B | Link | 2025-01-26 | 38.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
15 | Claude3.5-Sonnet | Link | 2024-06-21 | 37.99 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
16 | Kimi-VL-A3B-Thinking | Link | 2025-04-11 | 36.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
17 | QvQ-72B-Preview | Link | 2024-12-25 | 35.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
18 | GPT-4o | Link | 2024-05-19 | 30.39 | 42.0 | 39.3 | 49.3 | 28.9 | 25.6 | 22.4 | 24.0 | 23.3 | 29.4 | 17.3 | 29.8 | 30.1 | 29.1 | 44.8 | 34.8 | 17.9 |
19 | GPT-4 Turbo | Link | 2024-05-19 | 30.26 | 37.7 | 33.3 | 46.4 | 25.0 | 28.6 | 25.3 | 15.4 | 27.8 | 31.9 | 30.6 | 29.0 | 31.9 | 28.7 | 37.9 | 17.4 | 23.2 |
20 | Claude3-Opus | Link | 2024-05-04 | 27.13 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
21 | MathCoder-VL-8B | Link | 2025-02-16 | 26.1 | 18.6 | 32.1 | 26.4 | 25.0 | 10.7 | 13.4 | 20.2 | 14.4 | 21.0 | 48.6 | 32.2 | 32.1 | 23.0 | 29.3 | 8.7 | 23.2 |
22 | Qwen2-VL-72B | Link | 2024-08-29 | 25.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
23 | Qwen2.5-VL-7B | Link | 2025-01-26 | 25.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
24 | TBAC-VLR1-3B-preview | Link | 2025-04-21 | 25.0 | 22.0 | 29.8 | 32.1 | 19.5 | 18.5 | 16.4 | 22.1 | 11.1 | 25.2 | 39.3 | 27.6 | 28.5 | 22.9 | 34.5 | 17.4 | 22.0 |
25 | CoT GPT4V | Link | 2024-02-21 | 23.98 | 26.7 | 26.2 | 38.6 | 22.1 | 24.4 | 19.4 | 27.9 | 23.3 | 25.2 | 17.3 | 21.4 | 23.4 | 23.8 | 25.9 | 4.4 | 25.6 |
26 | GPT4V | Link | 2024-02-21 | 22.76 | 27.3 | 32.1 | 35.7 | 21.1 | 16.7 | 13.4 | 22.1 | 14.4 | 16.8 | 22.0 | 22.2 | 20.9 | 23.8 | 24.1 | 21.7 | 25.6 |
27 | MathCoder-VL-2B | Link | 2025-02-16 | 21.7 | 15.7 | 17.9 | 17.1 | 19.2 | 11.3 | 14.9 | 26.9 | 14.4 | 16.8 | 38.2 | 25.4 | 26.9 | 15.6 | 36.2 | 8.7 | 25.0 |
28 | Qwen2.5-VL-3B | Link | 2025-01-26 | 21.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
29 | Gemini-1.5 Pro | Link | 2024-05-17 | 19.24 | 20.3 | 35.7 | 34.3 | 19.8 | 15.5 | 20.9 | 26.0 | 26.7 | 22.7 | 14.5 | 14.4 | 16.5 | 18.9 | 10.3 | 26.1 | 17.3 |
30 | Ovis1.6-Gemma2-9B | Link | 2024-09-19 | 18.78 | 13.3 | 15.5 | 22.1 | 17.9 | 11.3 | 22.4 | 23.1 | 20.0 | 20.2 | 20.8 | 18.0 | 24.7 | 15.6 | 20.7 | 17.4 | 20.8 |
31 | Gemini Pro | Link | 2024-02-21 | 17.66 | 15.1 | 10.7 | 20.7 | 20.1 | 11.9 | 7.5 | 20.2 | 21.1 | 16.8 | 19.1 | 19.0 | 20.0 | 14.3 | 13.8 | 17.4 | 20.8 |
32 | InternVL-Chat-V1-2-Plus | Link | 2024-02-22 | 16.97 | 11.3 | 25.0 | 15.7 | 16.9 | 10.1 | 11.9 | 16.4 | 15.6 | 19.3 | 22.5 | 16.4 | 22.5 | 14.3 | 17.2 | 4.4 | 20.8 |
33 | Qwen2-VL-7B | Link | 2024-08-29 | 16.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
34 | Math-LLaVA-13B | Link | 2024-06-26 | 15.69 | 9.0 | 20.2 | 15.7 | 18.2 | 10.1 | 10.5 | 16.4 | 14.4 | 16.0 | 20.2 | 18.4 | 17.6 | 9.4 | 24.1 | 21.7 | 17.9 |
35 | Qwen-VL-Max | Link | 2024-02-21 | 15.59 | 10.7 | 19.1 | 20.0 | 16.9 | 12.5 | 17.9 | 16.4 | 12.2 | 21.0 | 13.3 | 14.2 | 19.8 | 11.5 | 20.7 | 13.0 | 17.3 |
36 | InternLM-XComposer2-VL | Link | 2024-02-21 | 14.54 | 9.3 | 15.5 | 12.1 | 15.3 | 11.3 | 10.5 | 14.4 | 22.2 | 19.3 | 19.7 | 15.6 | 15.0 | 11.9 | 15.5 | 26.1 | 15.5 |
37 | GPT 4-CoT (caption) | Link | 2024-02-21 | 13.10 | 16.5 | 20.2 | 34.3 | 10.4 | 17.9 | 19.4 | 7.7 | 11.1 | 10.1 | 9.8 | 9.6 | 9.1 | 13.5 | 13.8 | 8.7 | 12.5 |
38 | Qwen2-VL-2B | Link | 2024-08-29 | 12.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
39 | ShareGPT4V-13B | Link | 2024-02-21 | 11.88 | 7.5 | 15.5 | 16.4 | 10.7 | 8.9 | 9.0 | 11.5 | 8.9 | 7.6 | 11.6 | 13.0 | 17.4 | 10.3 | 8.6 | 8.7 | 12.5 |
40 | LLaVA-v1.5-13B | Link | 2024-02-21 | 11.12 | 7.0 | 14.3 | 14.3 | 9.1 | 6.6 | 6.0 | 13.5 | 5.6 | 13.5 | 10.4 | 12.6 | 14.7 | 11.5 | 13.8 | 13.0 | 10.7 |
41 | Qwen-VL-Plus | Link | 2024-02-21 | 10.72 | 11.3 | 17.9 | 14.3 | 12.7 | 4.8 | 10.5 | 15.4 | 8.9 | 14.3 | 11.6 | 6.4 | 10.0 | 14.3 | 6.9 | 8.7 | 11.31 |
42 | ShareGPT4V-7B | Link | 2024-02-21 | 10.53 | 5.5 | 3.6 | 12.9 | 10.1 | 4.8 | 7.5 | 11.5 | 14.4 | 10.9 | 16.2 | 11.8 | 12.3 | 9.8 | 15.5 | 17.4 | 11.3 |
43 | SPHINX (V2) | Link | 2024-02-21 | 9.70 | 6.7 | 7.1 | 12.9 | 7.5 | 7.7 | 6.0 | 9.6 | 16.7 | 10.1 | 11.0 | 11.8 | 12.5 | 8.2 | 8.6 | 8.7 | 6.0 |
44 | LLaVA-v1.5-7B | Link | 2024-02-21 | 8.52 | 7.0 | 7.1 | 10.7 | 7.1 | 4.8 | 10.5 | 7.7 | 10.0 | 9.2 | 15.6 | 10.2 | 9.8 | 5.3 | 8.6 | 4.4 | 4.8 |
45 | Random Chance | Link | 2024-02-21 | 7.17 | 1.5 | 11.9 | 7.1 | 9.7 | 4.8 | 6.0 | 22.1 | 1.1 | 7.6 | 0.6 | 9.4 | 6.7 | 8.2 | 8.6 | 13.0 | 7.1 |
Subjects: Alg: algebra, AnaG: analytic geometry, Ari: arithmetic, CombG: combinatorial geometry,
Comb: combinatorics, Cnt: counting, DescG: descriptive geometry, GrphT: graph theory, Log: logic,
Angle: metric geometry - angle, Area: metric geometry - area, Len: metric geometry - length,
SolG: solid geometry, Stat: statistics, Topo: topology, TransG: transformation geometry.
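
For readers who want to work with the table programmatically, the following is a minimal sketch of how one pipe-delimited leaderboard row could be parsed into a record keyed by the full subject names from the legend above. The names `SUBJECTS`, `to_score`, and `parse_row` are hypothetical helpers introduced for illustration; they are not part of any official MATH-Vision tooling, and the sketch assumes the column order shown in the table header and that `-` marks a score that was not reported.

```python
from typing import Optional

# Subject abbreviations in table-column order, mapped to the full names
# given in the legend. (Hypothetical helper, not official tooling.)
SUBJECTS = {
    "Alg": "algebra",
    "AnaG": "analytic geometry",
    "Ari": "arithmetic",
    "CombG": "combinatorial geometry",
    "Comb": "combinatorics",
    "Cnt": "counting",
    "DescG": "descriptive geometry",
    "GrphT": "graph theory",
    "Log": "logic",
    "Angle": "metric geometry - angle",
    "Area": "metric geometry - area",
    "Len": "metric geometry - length",
    "SolG": "solid geometry",
    "Stat": "statistics",
    "Topo": "topology",
    "TransG": "transformation geometry",
}


def to_score(cell: str) -> Optional[float]:
    """Convert a table cell to a float; '-' is assumed to mean 'not reported'."""
    return None if cell == "-" else float(cell)


def parse_row(line: str) -> dict:
    """Parse one data row: '# | Model | Source | Date | ALL | <16 subject scores> |'."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    expected = 5 + len(SUBJECTS)
    if len(cells) != expected:
        raise ValueError(f"expected {expected} cells, got {len(cells)}")
    rank, model, source, date, overall = cells[:5]
    scores = [to_score(c) for c in cells[5:]]
    return {
        "rank": int(rank),
        "model": model,
        "source": source,
        "date": date,
        "ALL": to_score(overall),
        # dicts preserve insertion order, so values line up with the columns
        "subjects": dict(zip(SUBJECTS.values(), scores)),
    }


if __name__ == "__main__":
    row = ("9 | GPT-4.5 | Link | 2025-04-10 | 47.30 | - | - | - | - | - | - | - | - "
           "| - | - | - | - | - | - | - | - |")
    record = parse_row(row)
    print(record["model"], record["ALL"], record["subjects"]["algebra"])
```

Mapping `-` to `None` rather than `0.0` keeps unreported per-subject scores from being mistaken for real results when rows are sorted or averaged.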