LLM Benchmarks

llm.l.gaz.codes

gpt-oss-120b

openai/gpt-oss-120b

Publisher: openai
Architecture: gpt-oss
Quantization: MXFP4
Format: gguf
Max Context: 131k tokens
Type: llm
Machine: gaz-studio
Capabilities: tool_use

knowledge

Benchmark                Score
MMLU (local)             78.5%

reasoning

Benchmark                Score
GSM8K (local)            85.2%
ARC-Challenge (local)    75.4%

coding

Benchmark                Score
HumanEval (local)        72.0%

language

Benchmark                Score
HellaSwag (local)        82.3%

safety

Benchmark                Score
TruthfulQA (local)       65.8%