CineScale: Free Lunch in High-Resolution Cinematic Visual Generation Paper • 2508.15774 • Published 18 days ago • 20
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published 14 days ago • 182
Lumen: Consistent Video Relighting and Harmonious Background Replacement with Video Generative Models Paper • 2508.12945 • Published 21 days ago • 12
Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model Paper • 2508.13009 • Published 21 days ago • 24
Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off Paper • 2508.04825 • Published Aug 6 • 58
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published 25 days ago • 141
Matrix-3D: Omnidirectional Explorable 3D World Generation Paper • 2508.08086 • Published 28 days ago • 70
Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control Paper • 2508.08134 • Published 28 days ago • 9
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published about 1 month ago • 175
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated Jul 1 • 76
view article Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain • Jan 30 • 128
JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent Paper • 2506.17612 • Published Jun 21 • 63
view article Article (LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware By derekl35 and 4 others • Jun 19 • 86
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation Paper • 2506.09350 • Published Jun 11 • 48