-
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
Paper • 2507.13344 • Published • 56 -
π^3: Scalable Permutation-Equivariant Visual Geometry Learning
Paper • 2507.13347 • Published • 64 -
MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second
Paper • 2507.10065 • Published • 24 -
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering
Paper • 2507.08776 • Published • 54
Collections
Collections including paper arxiv:2507.02813
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 276 • 96 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 35 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 98 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 89
-
Controllable Text Generation for Large Language Models: A Survey
Paper • 2408.12599 • Published • 66 -
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations
Paper • 2408.12590 • Published • 37 -
Real-Time Video Generation with Pyramid Attention Broadcast
Paper • 2408.12588 • Published • 17 -
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Paper • 2408.11039 • Published • 64
-
TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
Paper • 2401.09416 • Published • 11 -
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
Paper • 2401.10171 • Published • 14 -
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
Paper • 2311.09217 • Published • 22 -
GALA: Generating Animatable Layered Assets from a Single Scan
Paper • 2401.12979 • Published • 9
-
Diffusion Classifiers Understand Compositionality, but Conditions Apply
Paper • 2505.17955 • Published • 22 -
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
Paper • 2506.14429 • Published • 45 -
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion
Paper • 2507.02813 • Published • 60
-
4
Flux Fill Outpainting
👈Extend images using AI to change size and alignment
-
FlexPainter: Flexible and Multi-View Consistent Texture Generation
Paper • 2506.02620 • Published • 14 -
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion
Paper • 2507.02813 • Published • 60 -
UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding
Paper • 2506.23219 • Published • 7
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 6 -
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 23 -
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 13 -
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 70