AI Research · 88% · 1 min read · Jun 24, 2026, 5:27 PM

Natural Ungrokking: Asymmetric Control of Which Rules Survive Pretraining


30-second summary

Researchers describe 'natural ungrokking', in which a small language model forgets a learned rule (e.g., pronoun-gender resolution) mid-pretraining with no visible change in the loss curve. Whether a rule survives is tied to corpus statistics.

Key takeaways

›Natural ungrokking describes the mid-pretraining reversal of learned rules in small language models, with no trace in the loss curve.
›A model initially learned a pronoun-gender rule (94% accuracy on held-out probes) but later lost it (near-zero accuracy), even though evidence for the rule remained in the training data.
›Rule survival is predictable from corpus statistics, particularly the frequency of rule application in the training stream.
›This phenomenon challenges the assumption that a rule, once learned, is retained for the rest of pretraining.
›The study suggests corpus biases play a critical role in determining which rules models retain.

Full story

A new study describes a phenomenon in language model pretraining called 'natural ungrokking': a small model learns and applies a rule (e.g., pronoun-gender resolution), then discards it, with no detectable change in the loss curve. In the example given, a model trained on a corpus with gendered pronouns reaches 94% accuracy on held-out probes by step 925 but scores near zero by step 3,500, even though evidence for the rule remains in the training data. Whether a learned rule survives is predictable from corpus statistics, specifically how often the training stream shows the rule being applied. The finding challenges the assumption that a rule, once learned, is retained through the rest of pretraining, and it highlights the role of implicit corpus biases in shaping model behavior.
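The pattern described above (probe accuracy rising and then collapsing while the loss stays flat or keeps falling) can be checked for mechanically. The following is a minimal sketch, not the study's method: the `Checkpoint` record, the `find_silent_reversals` function, and the thresholds `learned`, `forgotten`, and `loss_tolerance` are all hypothetical names and values chosen for illustration. The two step numbers and the 94% figure echo the summary; every other number in the sample data is made up.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    step: int
    loss: float            # training loss logged at this step
    probe_accuracy: float  # accuracy on held-out probes, in [0, 1]


def find_silent_reversals(history, learned=0.9, forgotten=0.1, loss_tolerance=0.05):
    """Return (learned_step, forgotten_step) pairs where a rule was lost
    without the loss rising by more than `loss_tolerance`.

    All three thresholds are assumptions for this sketch.
    """
    reversals = []
    learned_at = None  # most recent checkpoint where the rule counted as learned
    for ckpt in sorted(history, key=lambda c: c.step):
        if ckpt.probe_accuracy >= learned:
            learned_at = ckpt
        elif ckpt.probe_accuracy <= forgotten and learned_at is not None:
            # The rule was learned earlier and is now gone; check the loss.
            if ckpt.loss - learned_at.loss <= loss_tolerance:
                reversals.append((learned_at.step, ckpt.step))
            learned_at = None  # wait until the rule is learned again
    return reversals


# Illustrative values only: loss values and the 0.40, 0.50, 0.02 accuracies
# are placeholders, not measurements from the study.
history = [
    Checkpoint(step=500, loss=3.10, probe_accuracy=0.40),
    Checkpoint(step=925, loss=2.80, probe_accuracy=0.94),
    Checkpoint(step=2000, loss=2.60, probe_accuracy=0.50),
    Checkpoint(step=3500, loss=2.50, probe_accuracy=0.02),
]
print(find_silent_reversals(history))  # [(925, 3500)]
```

A check like this only works if probes are evaluated at intermediate checkpoints; the loss alone would report steady progress across the whole run.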

Source: Natural Ungrokking: Asymmetric Control of Which Rules Survive Pretraining. Read the full piece at the source.

Why this matters

Developers

Highlights the need for better monitoring of model behavior during pretraining to detect rule forgetting, even when loss curves remain stable.

Businesses

Raises concerns about the reliability of language models in production, since a rule learned early in pretraining may be gone from the final model without any sign in the loss curve.

Investors

Identifies a potential risk in AI model training pipelines, which could impact investments in companies relying on pretrained models.

Students

Provides insight into the complexities of language model training and the factors influencing model behavior beyond loss minimization.

Everyone

Shows that a model's behavior can shift during development, with learned rules appearing and then disappearing, in ways that standard training metrics do not reveal.

Glossary

natural ungrokking: The mid-pretraining reversal of learned rules in language models without changes in the loss curve.
pretraining: The initial phase of training a language model on a large corpus of text to learn general patterns and rules.
loss curve: A plot of the model's training loss (the quantity being minimized) against training steps; used to gauge learning progress.
held-out probes: Test cases not seen during training, used to evaluate the model's generalization ability.
corpus statistics: Quantitative measures of the training data, such as frequency of specific patterns or rules.
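To make the 'corpus statistics' entry concrete, here is one simple way a rule-application frequency could be estimated over a text stream. This summary does not say how the study measures it, so everything below is an assumption: the toy `EXPECTED_PRONOUN` lexicon, the `applies_rule` heuristic (a listed noun followed later in the sentence by its expected pronoun), and the `rule_frequency` function are hypothetical and only illustrate the idea of counting how often a stream shows a rule in use.

```python
import re
from typing import Iterable

# Toy lexicon, purely illustrative: a noun and the pronoun the rule predicts.
EXPECTED_PRONOUN = {"king": "he", "queen": "she", "father": "he", "mother": "she"}


def applies_rule(sentence: str) -> bool:
    """True if a lexicon noun is later followed by its expected pronoun."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    for i, tok in enumerate(tokens):
        expected = EXPECTED_PRONOUN.get(tok)
        if expected is not None and expected in tokens[i + 1:]:
            return True
    return False


def rule_frequency(stream: Iterable[str]) -> float:
    """Fraction of sentences in the stream that show the rule being applied."""
    total = hits = 0
    for sentence in stream:
        total += 1
        hits += applies_rule(sentence)
    return hits / total if total else 0.0


sample = [
    "The queen said she would return.",
    "The king left before he was seen.",
    "The weather was cold all week.",
    "The mother waved.",
]
print(rule_frequency(sample))  # 0.5
```

A real measurement would need a far more careful definition of what counts as an application of the rule; the point here is only that the statistic is computed from the training data, independently of the model.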

AI bias estimate: Neutral presentation of empirical findings; no overt opinion or sensationalism detected. (Automated estimate, not a definitive judgement.)

Sources · 1

Natural Ungrokking: Asymmetric Control of Which Rules Survive Pretraining ↗

Summary and analysis generated by AI (mistral). Always verify against the original sources.
