Qwen/Qwen3-Coder-30B-A3B-Instruct Text Generation • 31B • Updated 12 days ago • 336k • 535
Qwen/Qwen3-Coder-480B-A35B-Instruct Text Generation • 480B • Updated 12 days ago • 153k • 1.15k
Scaling Reasoning can Improve Factuality in Large Language Models Paper • 2505.11140 • Published May 16 • 7
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models Paper • 2505.10554 • Published May 15 • 120
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14 • 69