AI Research 75% 1 min readJul 1, 2026, 8:10 AM

NVIDIA Releases Nemotron-Labs-TwoTower: an Open-Weight Diffusion Language Model Built on a Frozen Autoregressive Nemotron-3-Nano-30B-A3B Backbone

30-second summary

NVIDIA released Nemotron-Labs-TwoTower, an open-weight diffusion language model leveraging a frozen autoregressive backbone to boost text generation throughput.

NVIDIA Releases Nemotron-Labs-TwoTower: an Open-Weight Diffusion Language Model Built on a Frozen Autoregressive Nemotron-3-Nano-30B-A3B Backbone

Key takeaways

Nemotron-Labs-TwoTower is a diffusion language model built on a frozen autoregressive backbone (Nemotron-3-Nano-30B-A3B).
The model is released as open weights under NVIDIA's Nemotron Open Model License.
Diffusion-based generation enables parallel token processing, addressing throughput bottlenecks in autoregressive models.
Targeted at developers and enterprises needing high-speed text generation for large-scale applications.

Full story

NVIDIA has introduced Nemotron-Labs-TwoTower, a diffusion-based language model designed to address the throughput limitations of traditional autoregressive (AR) models. Unlike AR models that generate tokens sequentially, Nemotron-Labs-TwoTower uses a diffusion approach, enabling parallel token generation and significantly improving generation speed.

The model is built on a frozen autoregressive backbone, Nemotron-3-Nano-30B-A3B, and released under the NVIDIA Nemotron Open Model License as open weights. This hybrid approach aims to combine the strengths of both diffusion and autoregressive models, offering a practical solution for high-throughput text generation tasks.

The release targets a critical bottleneck in AR models, where token-by-token decoding restricts performance in large-scale applications. By leveraging diffusion techniques, NVIDIA positions Nemotron-Labs-TwoTower as a scalable alternative for developers and enterprises requiring efficient text generation.

Source: NVIDIA Releases Nemotron-Labs-TwoTower: an Open-Weight Diffusion Language Model Built on a Frozen Autoregressive Nemotron-3-Nano-30B-A3B Backbone. Read the full piece at the source.

Why this matters

Developers

Provides an open-weight, high-throughput alternative to autoregressive models for text generation.

Businesses

Enables faster and more scalable text generation, reducing latency in AI-driven applications.

Investors

Students

Everyone

Advances the efficiency of AI text generation with a novel hybrid model approach.

Glossary

Diffusion language model: A model that generates text by iteratively refining a sequence of tokens, enabling parallel processing and faster throughput compared to autoregressive models.
Autoregressive (AR) model: A model that generates tokens one at a time in sequence, where each token depends on the previous ones, leading to slower generation for long sequences.

Sources · 1

NVIDIA Releases Nemotron-Labs-TwoTower: an Open-Weight Diffusion Language Model Built on a Frozen Autoregressive Nemotron-3-Nano-30B-A3B Backbone ↗

The Untaught Lessons of RAG Retrieval: Cosine Is Not the Foundation

1 min read2h ago

A behind-the-scenes look at Midjourney’s medical scanner leaves many questions unanswered

1 min read2h ago

GPT and Claude failed Bridgewater's finance tests because the right answers were never public

1 min read2h ago

TickrWire

Romanian-American University Integrates Artificial Intelligence and Critical Thinking Across All Degree Programs - Romania Insider

1 min read3h ago