Ring is a reasoning-oriented Mixture-of-Experts (MoE) large language model (LLM) developed by inclusionAI and derived from the Ling model family. Its design emphasizes reasoning ability, efficiency, and modular expert activation: the “flash” variant (Ring-flash-2.0) reduces inference cost by activating only a small subset of experts per token, and its training pipeline applies reinforcement learning to strengthen reasoning. If you are located in mainland China, we also provide the model on ModelScope.cn to speed up the download process.
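A minimal loading sketch with Hugging Face transformers. The repository id `inclusionAI/Ring-flash-2.0` and the need for `trust_remote_code` are assumptions based on common practice for MoE releases, not confirmed details of this release:

```python
# Minimal sketch: load the model and run one chat turn with transformers.
# Repo id and trust_remote_code are assumptions, not confirmed for this release.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "inclusionAI/Ring-flash-2.0"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # pick bf16/fp16 automatically where supported
    device_map="auto",    # shard across available GPUs
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```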
Features
- Mixture-of-Experts (MoE) architecture (only a subset of experts is activated per token)
- Reasoning-optimized model with reinforcement learning enhancements
- “Thinking” model variant (flash) with sparse expert activation (e.g. roughly 1/32 of experts active per token; see the routing sketch after this list)
- High inference throughput (e.g. > 200 tokens/sec under optimized settings)
- Multi-stage training: supervised fine-tuning (SFT), reinforcement learning with verifiable rewards (RLVR), and reinforcement learning from human feedback (RLHF); a toy verifiable-reward check follows the list
- Efficient architecture and memory design for large-scale reasoning
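The sparse activation described above works by routing each token to a few experts chosen by a learned gate. Below is a generic top-k routing sketch in PyTorch that illustrates the 1/32 pattern with 32 experts and k=1; it is a simplified illustration, not Ring's actual gating implementation, and all dimensions are placeholders:

```python
# Generic top-k expert routing for a sparse MoE layer; a sketch, not
# Ring's actual gating code. Dimensions, num_experts, and k are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, num_experts=32, k=1):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, num_experts, bias=False)  # router
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                           # x: (tokens, d_model)
        scores = self.gate(x)                       # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # each token picks k experts
        weights = F.softmax(weights, dim=-1)        # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

x = torch.randn(8, 512)
print(TopKMoE()(x).shape)  # torch.Size([8, 512]); only 1 of 32 experts ran per token
```

Because only k experts run per token, the compute per forward pass scales with k rather than with the total expert count, which is what makes the large expert pool affordable at inference time.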
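For the RLVR stage, the defining idea is that rewards come from programmatic checks rather than from a learned reward model. The toy function below, with an assumed answer-extraction regex and dataset format, shows what such a verifiable reward can look like; it is not taken from Ring's training code:

```python
# Toy "verifiable reward" for RLVR: score a completion by checking its final
# numeric answer against a gold answer. Regex and format are assumptions.
import re

def verifiable_reward(completion: str, gold_answer: str) -> float:
    # Extract the trailing number from the model's output, if any.
    match = re.search(r"(-?\d+(?:\.\d+)?)\s*$", completion.strip())
    return 1.0 if match and match.group(1) == gold_answer else 0.0

print(verifiable_reward("2 + 2 equals 4", "4"))   # 1.0 (correct answer)
print(verifiable_reward("the answer is 5", "4"))  # 0.0 (wrong answer)
```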