GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning • Paper • 2507.01006 • Published Jul 1 • 207
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization • Paper • 2507.14683 • Published 18 days ago • 122
deepcogito/cogito-v2-preview-llama-109B-MoE • Image-Text-to-Text • 109B • Updated 6 days ago • 1.05k • 24
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful • Paper • 2507.07101 • Published 28 days ago • 3
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs • Paper • 2507.07996 • Published 27 days ago • 31 • 14