MiniMax M3: Sparse Attention, 1M Context, and the Agent Model Nobody Should Evaluate Lazily

MiniMax M3 arrives with the kind of launch copy that makes engineers both curious and allergic. Frontier coding. One million tokens. Native multimodality. Open weights coming soon. Cheaper than the usual suspects. Somewhere, a product manager is already updating a roadmap slide with fireworks. The interesting part is not the fireworks. It’s the engineering bet …