AudioStory: Generating Long-Form Narrative Audio with Large Language Models Paper • 2508.20088 • Published 8 days ago • 20
ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts Paper • 2507.20939 • Published Jul 28 • 56 • 2
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning Paper • 2506.16141 • Published Jun 19 • 27
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning Paper • 2506.16141 • Published Jun 19 • 27
AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation Paper • 2506.03126 • Published Jun 3 • 22
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? Paper • 2505.21374 • Published May 27 • 27
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? Paper • 2505.21374 • Published May 27 • 27