Asclepius Leaderboard
If you would like your evaluation results to appear on the leaderboard below once your evaluation is complete, please send an email to asclepius@gmail.com.
The Asclepius benchmark evaluates medical multimodal large language models (Med-MLLMs) along two axes: the Specialty Analysis, which covers 15 medical specialties, and the Capacity Analysis, which covers 8 clinical tasks.
⚕Specialty Analysis
#  Method  Overall  Card  Derm  Endo  Gastro  GenSurg  Hem  Immun  Neurol  ObsGyn  Ophth  Orth  Oto  Path  Pulm  Urol
1  OpenAI  0.522  0.609  0.455  0.526  0.434  0.370  0.633  0.485  0.550  0.504  0.405  0.214  0.363  0.489  0.399
2  Google  0.327  0.314  0.364  0.412  0.279  0.260  0.500  0.332  0.390  0.469  0.274  0.107  0.306  0.436  0.288
3  Stanford AIMI  0.278  0.308  0.182  0.289  0.380  0.242  0.167  0.192  0.240  0.431  0.189  0.107  0.279  0.424  0.294
4  Shanghai AI Laboratory  0.322  0.195  0.091  0.325  0.450  0.238  0.133  0.279  0.220  0.296  0.221  0.214  0.244  0.307  0.288
5  Stanford University  0.288  0.237  0.000  0.268  0.302  0.275  0.133  0.214  0.190  0.372  0.168  0.179  0.207  0.405  0.221
6  MBZUAI  0.210  0.178  0.045  0.191  0.163  0.117  0.167  0.135  0.110  0.205  0.142  0.036  0.086  0.151  0.129
⚕Capacity Analysis
#  Method                  Overall  Anato  Attr   SpaQua  DisIde  Stag   Prog   Treat  Rep
1  OpenAI                  0.462    0.323  0.385  0.552   0.649   0.48   0.504  0.524  N.A.
2  Google                  0.354    0.285  0.292  0.342   0.654   0.342  0.496  0.323  0.082
3  Stanford AIMI           0.309    0.238  0.253  0.321   0.524   0.252  0.451  0.315  0.157
4  Shanghai AI Laboratory  0.278    0.344  0.298  0.212   0.13    0.396  0.295  0.29   0.091
5  Stanford University     0.279    0.27   0.256  0.217   0.587   0.272  0.398  0.145  0.133
6  MBZUAI                  0.148    0.163  0.107  0.152   0.082   0.104  0.223  0.145  0.078
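
For readers who want to work with the Capacity Analysis numbers programmatically, the following is a minimal Python sketch that finds the top-scoring method for each column. It rests on two assumptions that the page does not state: that the nine values in each row follow the column order Overall, Anato, Attr, SpaQua, DisIde, Stag, Prog, Treat, Rep, and that a higher score is better. The names `COLUMNS`, `SCORES`, and `best_per_column` are hypothetical and chosen only for illustration; "N.A." is represented as `None` and skipped.

```python
# Capacity Analysis scores transcribed from the leaderboard above.
# Assumption: values follow the column order listed in COLUMNS.
COLUMNS = ["Overall", "Anato", "Attr", "SpaQua", "DisIde", "Stag", "Prog", "Treat", "Rep"]

SCORES = {
    "OpenAI": [0.462, 0.323, 0.385, 0.552, 0.649, 0.48, 0.504, 0.524, None],  # None = N.A.
    "Google": [0.354, 0.285, 0.292, 0.342, 0.654, 0.342, 0.496, 0.323, 0.082],
    "Stanford AIMI": [0.309, 0.238, 0.253, 0.321, 0.524, 0.252, 0.451, 0.315, 0.157],
    "Shanghai AI Laboratory": [0.278, 0.344, 0.298, 0.212, 0.13, 0.396, 0.295, 0.29, 0.091],
    "Stanford University": [0.279, 0.27, 0.256, 0.217, 0.587, 0.272, 0.398, 0.145, 0.133],
    "MBZUAI": [0.148, 0.163, 0.107, 0.152, 0.082, 0.104, 0.223, 0.145, 0.078],
}


def best_per_column(scores, columns):
    """Return {column: (method, score)} for the highest score in each column.

    Entries that are None (reported as N.A.) are ignored.
    Assumption: higher scores are better.
    """
    best = {}
    for i, col in enumerate(columns):
        candidates = [
            (values[i], name) for name, values in scores.items() if values[i] is not None
        ]
        top_score, top_name = max(candidates)
        best[col] = (top_name, top_score)
    return best


if __name__ == "__main__":
    for col, (name, score) in best_per_column(SCORES, COLUMNS).items():
        print(f"{col}: {name} ({score})")
```

Under these assumptions, the per-column leader is not always the method listed first: for example, the highest DisIde value in the table belongs to Google and the highest Rep value to Stanford AIMI.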