Yang

Kaichengalex

https://kaicheng-yang0828.github.io/

Kaicheng-Yang0828

AI & ML interests

None yet

Recent Activity

liked a model 2 days ago

Qwen/Qwen-Image

liked a dataset 3 days ago

zzliang/GRIT

liked a dataset 7 days ago

zhixiangwei/VLM-150M

upvoted 3 papers 9 days ago

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Paper • 2507.21033 • Published 10 days ago • 20

ForCenNet: Foreground-Centric Network for Document Image Rectification

Paper • 2507.19804 • Published 12 days ago • 11

Region-based Cluster Discrimination for Visual Representation Learning

Paper • 2507.20025 • Published 12 days ago • 17

upvoted a collection about 2 months ago

DatologyAI CLIP Models

SoTA Image-Text Classification and Retrieval models using only data curation -- for full details please see our blog: https://blog.datologyai.com/ • 2 items • Updated Jun 10 • 5

upvoted 2 papers 2 months ago

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

Paper • 2506.05176 • Published Jun 5 • 68

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 176

upvoted 3 papers 3 months ago

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 272

Seed1.5-VL Technical Report

Paper • 2505.07062 • Published May 11 • 149

FG-CLIP: Fine-Grained Visual and Textual Alignment

Paper • 2505.05071 • Published May 8 • 18

upvoted 3 collections 3 months ago

OpenVision

27 items • Updated May 8 • 29

🍃 MINT-1T

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24, 2024 • 62

Qwen3

84 items • Updated about 21 hours ago • 1.03k

upvoted 2 papers 3 months ago

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Paper • 2504.17192 • Published Apr 24 • 114

Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs

Paper • 2504.17432 • Published Apr 24 • 39

upvoted a paper 4 months ago

Decoupled Global-Local Alignment for Improving Compositional Understanding

Paper • 2504.16801 • Published Apr 23 • 15

upvoted 2 collections 4 months ago

MLCD-VL

2 items • Updated May 16 • 1

UniME

UniME is a series of multimodal large language models trained for learning universal multimodal embedding. • 4 items • Updated May 16 • 4

upvoted 2 papers 4 months ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14 • 280

Kimi-VL Technical Report

Paper • 2504.07491 • Published Apr 10 • 134

upvoted a paper 6 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 146