Meituan Releases LongCat-2.0: A 1.6T-Parameter Open MoE Model with Native 1M Context and LongCat Sparse Attention
Meituan released LongCat-2.0, a 1.6 trillion-parameter Mixture-of-Experts model featuring a native 1 million token context window and LongCat Sparse Attention. The model activates about 48 billion parameters per token and is trained and served on domestic AI ASIC superpods.

- LongCat-2.0 is a 1.6 trillion-parameter MoE model with a native 1 million token context window, enabled by LongCat Sparse Attention.
- The model activates only 48 billion parameters per token, balancing performance and computational efficiency.
- Training and serving occur on domestic AI ASIC superpods, reflecting China's push for self-sufficient AI infrastructure.
- Vendor-reported benchmarks are provided, but third-party validation of performance claims is pending.
Chinese tech giant Meituan has launched LongCat-2.0, a Mixture-of-Experts (MoE) model with 1.6 trillion parameters. The model activates approximately 48 billion parameters per token, about 3% of the total, so per-token compute scales with the active subset rather than the full parameter count. It also offers a native 1 million token context window, enabled by LongCat Sparse Attention, which allows it to process extremely long sequences without relying on external memory mechanisms.
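The two reported parameter counts show how small the active share is. The following is a minimal sketch of that arithmetic, using only the figures from the announcement; the function name is illustrative and both numbers are approximate.

```python
# Figures as reported in the announcement; both are approximate.
TOTAL_PARAMS = 1.6e12   # 1.6 trillion total parameters
ACTIVE_PARAMS = 48e9    # about 48 billion parameters activated per token


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters used to process a single token."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share per token: {fraction:.1%}")
```

Roughly 3 in every 100 parameters take part in processing any one token, which is where an MoE model's efficiency comes from.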
Training and inference are conducted entirely on domestic AI ASIC superpods, marking a shift toward self-reliant AI infrastructure in China. The release includes detailed architecture breakdowns, vendor-reported benchmarks, and API access pathways. However, some performance claims have not yet been verified by third-party evaluations.
The architecture uses sparse attention to handle long contexts efficiently, which matters for applications such as deep document analysis, extended conversations, and large-scale data processing. Meituan positions LongCat-2.0 as an open model, though the announcement does not fully clarify how open it is or under what license it is released.
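To illustrate why sparse attention helps with long contexts, here is a minimal sketch of one generic sparse pattern, causal sliding-window attention, in plain Python. This is an assumption chosen for illustration and is not LongCat Sparse Attention, whose details are not described here; the function names and the `window` parameter are hypothetical.

```python
import math


def sliding_window_attention(queries, keys, values, window):
    """Causal attention where query i only sees the last `window` positions.

    queries, keys, values: equal-length lists; each item is a list of floats.
    """
    dim = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        start = max(0, i - window + 1)  # oldest position this query may see
        visible = range(start, i + 1)
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            for j in visible
        ]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out = [0.0] * len(values[0])
        for w, j in zip(weights, visible):
            for d, v in enumerate(values[j]):
                out[d] += (w / total) * v
        outputs.append(out)
    return outputs


def attended_pairs(seq_len, window=None):
    """Number of query-key scores computed; window=None means full causal."""
    if window is None or window >= seq_len:
        return seq_len * (seq_len + 1) // 2
    # The first `window` queries see 1..window keys; the rest see `window` each.
    return window * (window + 1) // 2 + (seq_len - window) * window


full = attended_pairs(1_000_000)
sparse = attended_pairs(1_000_000, window=4096)  # hypothetical window size
print(f"Sparse pattern computes {sparse / full:.2%} of the full score count")
```

With this pattern the number of scores grows linearly with sequence length instead of quadratically. At 1 million tokens and the hypothetical 4,096-token window, that is under 1 percent of the scores full causal attention would compute.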
Source: Meituan Releases LongCat-2.0: A 1.6T-Parameter Open MoE Model with Native 1M Context and LongCat Sparse Attention.
Developers gain access to a high-parameter MoE model with extreme context length, useful for long-sequence tasks like document analysis or extended dialogues.
Companies in China may benefit from reduced reliance on foreign AI hardware and infrastructure.
Students studying large-scale AI models or MoE architectures have a new case study to analyze.
Advances in long-context modeling could enable more sophisticated AI applications.
- Mixture-of-Experts (MoE): A machine learning architecture where multiple specialized sub-models (experts) are conditionally activated for each input, improving efficiency and scalability.
- LongCat Sparse Attention: A sparse attention mechanism designed to efficiently process extremely long sequences by focusing only on relevant segments of the input.
- ASIC superpods: Custom-built hardware clusters optimized for AI workloads, often used for training and inference in large-scale models.
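
The conditional activation in the MoE definition above can be sketched with generic top-k routing. This is a toy illustration under stated assumptions, not LongCat-2.0's actual router: the experts are simple scalar functions standing in for expert networks, the router logits are fixed by hand rather than computed from the input, and all names are hypothetical.

```python
import math


def softmax(xs):
    """Convert raw router scores into probabilities."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their gate weights."""
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


def moe_layer(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    gates = route_top_k(router_logits, k)
    return sum(weight * experts[i](x) for i, weight in gates.items())


# Toy experts: hypothetical stand-ins for expert feed-forward networks.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
router_logits = [0.1, 2.0, -1.0, 2.0]  # in a real model, produced by a learned router

y = moe_layer(3.0, experts, router_logits, k=2)
print(y)
```

Only two of the four experts run for this input; the other two are never called, which is the same idea that lets a model with a very large total parameter count use a small active subset per token.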
