- M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision
  Paper • 2509.01360 • Published • 11
- MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos
  Paper • 2507.05675 • Published • 26
- Medical SAM 2: Segment medical images as video via Segment Anything Model 2
  Paper • 2408.00874 • Published • 53
Collections including paper arxiv:2408.00874
- ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model
  Paper • 2408.16767 • Published • 33
- DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation
  Paper • 2411.16657 • Published • 20
- Autoregressive Video Generation without Vector Quantization
  Paper • 2412.14169 • Published • 14
- Progressive Multimodal Reasoning via Active Retrieval
  Paper • 2412.14835 • Published • 74

- RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
  Paper • 2404.07839 • Published • 48
- Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
  Paper • 2404.03715 • Published • 63
- MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
  Paper • 2404.05674 • Published • 15
- Agentless: Demystifying LLM-based Software Engineering Agents
  Paper • 2407.01489 • Published • 64

- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 130
- Evolutionary Optimization of Model Merging Recipes
  Paper • 2403.13187 • Published • 59
- MobileVLM V2: Faster and Stronger Baseline for Vision Language Model
  Paper • 2402.03766 • Published • 15
- LLM Agent Operating System
  Paper • 2403.16971 • Published • 71