LLM Benchmarks

llm.l.gaz.codes

Compare Models

Select models to compare benchmark results side by side

Compare models at different quantization levels with RAM/VRAM/speed tradeoffs

Benchmarks (8/8)

Benchmark Comparison

Benchmark       Category    Source           qwen2.5-coder-32b-instruct-mlx
ARC-Challenge   reasoning   HF Leaderboard   74.5%
GSM8K           reasoning   HF Leaderboard   84.0%
HellaSwag       language    HF Leaderboard   86.3%
HumanEval       coding      Aider            69.9%
MBPP            coding      Aider            66.9%
MMLU            knowledge   HF Leaderboard   82.0%
TruthfulQA      safety      HF Leaderboard   68.1%
WinoGrande      language    HF Leaderboard   76.7%
Average                                      76.0%
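The Average figure appears to be the plain mean of the eight benchmark scores listed for qwen2.5-coder-32b-instruct-mlx. The page does not state how it is computed, so equal weighting is an assumption here. A minimal sketch that reproduces the arithmetic (the names `scores` and `unweighted_mean` are illustrative, not part of the site):

```python
# Scores copied from the comparison results for qwen2.5-coder-32b-instruct-mlx.
scores = {
    "ARC-Challenge": 74.5,
    "GSM8K": 84.0,
    "HellaSwag": 86.3,
    "HumanEval": 69.9,
    "MBPP": 66.9,
    "MMLU": 82.0,
    "TruthfulQA": 68.1,
    "WinoGrande": 76.7,
}


def unweighted_mean(values):
    """Arithmetic mean with every benchmark weighted equally (an assumption)."""
    values = list(values)
    return sum(values) / len(values)


average = unweighted_mean(scores.values())
print(f"Average over {len(scores)} benchmarks: {average:.2f}%")
```

The scores sum to 608.4, so the exact mean is 76.05. The displayed 76.0% matches that value reduced to one decimal place, though the page does not say whether it rounds or truncates.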
Model Specs
Parameters: -
VRAM: -
Est. Speed: -