| Model | Lane | Think | Short (tok/s) | JSON (tok/s) | Sustained (tok/s) | First answer (ms) |
|---|---|---|---:|---:|---:|---:|
| Phi 4 Mini | core | default | 169.4 | 146.3 | 162.2 | 95.1 |
| Gemma 3 4B | core | default | 148.8 | 134.4 | 125.9 | 137.7 |
| Qwen2.5 Coder 7B | core | default | 114.2 | 107.3 | 112.2 | 72.1 |
| Qwen2.5 Instruct 7B | core | default | 112.1 | 107.2 | 110.8 | 80.0 |
| DeepSeek R1 Distill Llama 8B | core | default | 104.7 | 103.9 | 103.1 | 93.4 |
| Llama 3.1 8B | core | default | 104.9 | 99.9 | 102.3 | 70.9 |
| Qwen3.5 9B | core | default | 98.7 | 97.3 | 84.6 | 2073.4 |
| Ministral 3 8B | core | default | 97.7 | 94.0 | 93.8 | 78.6 |
| Ministral 3 14B | core | default | 64.2 | 61.5 | 61.8 | 84.7 |
| Qwen2.5 Coder 14B | core | default | 57.8 | 55.3 | 45.6 | 93.3 |
| Qwen2.5 Instruct 14B | core | default | 56.9 | 54.9 | 56.6 | 100.5 |
| Gemma 3 12B | core | default | 52.0 | 45.7 | 55.9 | 171.8 |
| Qwen3.5 27B Opus Distilled v2 | frontier_27b | off | 32.4 | 32.2 | 31.7 | 4623.0 |
| Qwen3.5 27B Opus Distilled v2 | frontier_27b | on | 31.9 | 31.9 | 31.6 | 8021.5 |
| Qwen3.5 27B Base | frontier_27b | on | 32.3 | 30.9 | 31.2 | 7938.0 |
| Qwen3.5 27B Base | frontier_27b | off | 31.6 | 29.3 | 30.8 | 255.1 |
| Qwen2.5 Coder 32B | core | default | 27.3 | 26.2 | 25.6 | 134.1 |
| Gemma 3 27B | core | default | 28.2 | 25.8 | 27.5 | 206.2 |