Self-Distillation Trade-Off: Accuracy vs. Output Diversity
A new arXiv paper reveals that on-policy self-distillation with sampled demonstrations improves pass@1 accuracy but reduces output diversity and flattens pass@k curves due to compounding biases in the feedback mechanism.
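For readers unfamiliar with the metrics, the following is a minimal sketch of how pass@1 can rise while the pass@k curve flattens. It uses the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), for n samples per problem of which c are correct. The summary above does not say how the paper computes pass@k, so the estimator choice is an assumption, and the helper names and per-problem counts below are hypothetical, not results from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples drawn, c: samples that were correct, k: budget being scored.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: any k-subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(correct_counts: list[int], n: int, k: int) -> float:
    """Average pass@k over a set of problems."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


# Hypothetical numbers for illustration only: 4 problems, 10 samples each.
N_SAMPLES = 10
diverse_model = [3, 3, 3, 3]      # solves every problem occasionally
sharpened_model = [10, 10, 0, 0]  # always solves two problems, never the rest

if __name__ == "__main__":
    for k in (1, 5, 10):
        d = mean_pass_at_k(diverse_model, N_SAMPLES, k)
        s = mean_pass_at_k(sharpened_model, N_SAMPLES, k)
        print(f"k={k:2d}  diverse={d:.3f}  sharpened={s:.3f}")
```

In this toy setup the sharpened model has the higher pass@1 (0.5 versus 0.3), but its curve stays at 0.5 for every k, while the diverse model climbs to 1.0 at k=10. That is the shape of the trade-off the summary describes: better single-sample accuracy at the cost of coverage from repeated sampling.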
- Benchmark · Jun 24, 2026, 05:59 PM · 84%
Research shows on-policy self-distillation improves pass@1 accuracy but reduces output diversity and flattens pass@k curves