Makhtum Ahmed

anything098 · ahmedmakhtum011

AI & ML interests

uncensored

Recent Activity

liked a dataset about 19 hours ago

WilliamHuang91/MAPO_Math_OOD_Dataset

upvoted a paper 3 days ago

Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification

upvoted a paper 3 days ago

MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer

Organizations

None yet

upvoted 2 papers 3 days ago

Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification

Paper • 2509.15591 • Published 7 days ago • 45

MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer

Paper • 2509.16197 • Published 7 days ago • 48

upvoted 5 papers 6 days ago

SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension

Paper • 2404.16790 • Published Apr 25, 2024 • 10

RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation

Paper • 2509.15212 • Published 8 days ago • 20

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published 8 days ago • 100

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Paper • 2509.15221 • Published 8 days ago • 101

WorldForge: Unlocking Emergent 3D/4D Generation in Video Diffusion Model via Training-Free Guidance

Paper • 2509.15130 • Published 8 days ago • 30

upvoted a paper 7 days ago

LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence

Paper • 2509.12203 • Published 11 days ago • 18

upvoted 4 papers 12 days ago

VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model

Paper • 2509.09372 • Published 15 days ago • 214

RewardDance: Reward Scaling in Visual Generation

Paper • 2509.08826 • Published 16 days ago • 67

Visual Representation Alignment for Multimodal Large Language Models

Paper • 2509.07979 • Published 17 days ago • 80

F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

Paper • 2509.06951 • Published 18 days ago • 31