Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning Paper • 2506.10521 • Published Jun 12 • 74
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development Paper • 2506.05010 • Published Jun 5 • 75
PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers Paper • 2506.05573 • Published Jun 5 • 76
Alchemist: Turning Public Text-to-Image Data into Generative Gold Paper • 2505.19297 • Published May 25 • 83
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play Paper • 2505.02707 • Published May 5 • 86
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification Paper • 2505.16938 • Published May 22 • 121
Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning Paper • 2507.21049 • Published 9 days ago • 38
Structured 3D Latents for Scalable and Versatile 3D Generation Paper • 2412.01506 • Published Dec 2, 2024 • 80
Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning Paper • 2410.06373 • Published Oct 8, 2024 • 36