Ornith-1.0-35B GGUF Performance and Speculative-Decode Updates
A follow-up update to the Ornith-1.0-35B GGUF model introduces native MTP speculative-decode grafting, delivering a 1.3-1.35x single-stream decode speedup while keeping the token distribution identical to that of the target model.
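To illustrate what "identical token distribution" can mean for speculative decoding, here is a minimal sketch of the standard accept/reject rule used in speculative sampling, computed exactly for a single token. It is not taken from the Ornith release: the function name `speculative_output_distribution` and the toy probabilities are hypothetical, and the update does not say whether its MTP grafting uses this exact rule.

```python
def speculative_output_distribution(target_probs, draft_probs):
    """Exact distribution of one token emitted by the accept/reject rule.

    A token x is proposed with probability draft_probs[x] and accepted with
    probability min(1, target/draft). On rejection, a replacement token is
    drawn from the normalized residual max(target - draft, 0).
    """
    # Probability that x is proposed AND accepted: q(x) * min(1, p(x)/q(x)),
    # which simplifies to min(p(x), q(x)).
    accepted = [min(p, q) for p, q in zip(target_probs, draft_probs)]
    reject_mass = 1.0 - sum(accepted)

    # Residual distribution used when the proposal is rejected.
    residual = [max(p - q, 0.0) for p, q in zip(target_probs, draft_probs)]
    residual_total = sum(residual)
    if residual_total == 0.0:
        # Draft and target agree exactly, so nothing is ever rejected.
        return accepted

    return [a + reject_mass * r / residual_total
            for a, r in zip(accepted, residual)]


# Hypothetical toy distributions over a three-token vocabulary.
target = [0.6, 0.3, 0.1]
draft = [0.3, 0.5, 0.2]

output = speculative_output_distribution(target, draft)
acceptance_rate = sum(min(p, q) for p, q in zip(target, draft))
```

Under these assumptions the emitted distribution matches the target exactly no matter how poor the draft is. The draft quality only affects the acceptance rate, which is what determines how much decode speed is gained.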
- Update, Jun 28, 2026, 06:35 PM, 69%
Ornith-1.0-35B GGUF update adds native MTP speculative-decode grafting, boosting single-stream decode speed by 1.3-1.35x with identical token distribution.