MathReal: We Keep It Real! A Real Scene Benchmark for Evaluating Math Reasoning in Multimodal Large Language Models Paper • 2508.06009 • Published Aug 8, 2025 • 16
Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning Paper • 2506.09736 • Published Jun 11, 2025 • 9
Agentic Robot: A Brain-Inspired Framework for Vision-Language-Action Models in Embodied Agents Paper • 2505.23450 • Published May 29, 2025 • 9
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO Paper • 2505.22453 • Published May 28, 2025 • 45
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start Paper • 2505.22334 • Published May 28, 2025 • 36
NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes Paper • 2504.11544 • Published Apr 15, 2025 • 42
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 129
MLP-KAN: Unifying Deep Representation and Function Learning Paper • 2410.03027 • Published Oct 3, 2024 • 31