- Qwen/Qwen3-Coder-30B-A3B-Instruct
  Text Generation • 31B • Updated • 95.1k • 377
- Qwen/Qwen3-30B-A3B-Thinking-2507
  Text Generation • 31B • Updated • 32.9k • 176
- Qwen/Qwen3-Coder-480B-A35B-Instruct
  Text Generation • 480B • Updated • 30.4k • 1.02k
- Qwen/Qwen3-235B-A22B-Instruct-2507
  Text Generation • 235B • Updated • 32.3k • 588
Chinese LLMs on Hugging Face
community
- Seed-Coder: Let the Code Model Curate Data for Itself
  Paper • 2506.03524 • Published • 6
- Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning
  Paper • 2504.13914 • Published • 4
- FlowTok: Flowing Seamlessly Across Text and Image Tokens
  Paper • 2503.10772 • Published • 19
- UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?
  Paper • 2503.09949 • Published • 5
text-to-video & image-to-video models released by the Chinese community
- MoBA: Mixture of Block Attention for Long-Context LLMs
  Paper • 2502.13189 • Published • 17
- Kimi-Audio Technical Report
  Paper • 2504.18425 • Published • 19
- Kimi-VL Technical Report
  Paper • 2504.07491 • Published • 133
- Kimi k1.5: Scaling Reinforcement Learning with LLMs
  Paper • 2501.12599 • Published • 123
- DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition
  Paper • 2504.21801 • Published • 2
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 415
- Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
  Paper • 2505.09343 • Published • 68
- DeepSeek-V3 Technical Report
  Paper • 2412.19437 • Published • 68
- deepseek-ai/DeepSeek-R1-0528
  Text Generation • 685B • Updated • 461k • 2.35k
- deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
  Text Generation • 8B • Updated • 264k • 908
- ByteDance-Seed/BAGEL-7B-MoT
  Any-to-Any • 15B • Updated • 1.09k • 1.1k
- ByteDance-Seed/Seed-Coder-8B-Reasoning
  Text Generation • 8B • Updated • 1.29k • 137
- fishaudio/fish-speech-1.5
  Text-to-Speech • Updated • 2.26k • 605
- ClearerVoice-Studio (Speech Enhancement, Separation and Extraction)
  262 • 📈 Better AI-powered platform to purify your speech signal
- fishaudio/fish-speech-1.4
  Text-to-Speech • Updated • 229 • 451
- fishaudio/fish-speech-1.2
  Text-to-Speech • Updated • 208 • 207
- deepseek-ai/DeepSeek-V2.5-1210
  Text Generation • 236B • Updated • 14.9k • 254
- infly/OpenCoder-8B-Instruct
  Text Generation • 8B • Updated • 1.55k • 195
- Qwen/Qwen2.5-Coder-32B-Instruct
  Text Generation • 33B • Updated • 76.4k • 1.91k
- deepseek-ai/DeepSeek-Coder-V2-Base
  Text Generation • 236B • Updated • 1.29k • 75