Tagged "mixture-of-experts"
- Qwen3.5-35B-A3B Emerges as Game-Changer for Agentic Coding Tasks
- Alibaba's Qwen3.5-397B Achieves #3 Position in Open Weights Model Rankings
- Qwen3-Next 80B MoE Achieves 39 Tokens/Second on RTX 5070/5060 Ti Dual-GPU Setup
- Qwen3.5-397B-A17B Now Available for Local Inference with Aggressive Quantisation
- MiniMax M2.5: 230B Parameter MoE Model Coming to HuggingFace
- Ming-flash-omni-2.0: 100B MoE Omni-Modal Model Released
- GLM-5 Released: 744B Parameter MoE Model Targeting Complex Tasks