arxiv: 2309.16609

Models: 285

Active filters: 2309.16609

Xorbits/Qwen-7B-Chat-GGUF

Text Generation • 8B • Updated Dec 18, 2023 • 69 • 8

Xorbits/Qwen-14B-Chat-GGUF

Text Generation • 14B • Updated Dec 19, 2023 • 23 • 1

rinna/nekomata-7b

Text Generation • 8B • Updated Mar 23 • 1.24k • 7

rinna/nekomata-14b

Text Generation • 14B • Updated Mar 23 • 1.24k • 19

rinna/nekomata-7b-instruction

Text Generation • 8B • Updated Mar 23 • 1.59k • 10

rinna/nekomata-14b-instruction

Text Generation • 14B • Updated Mar 23 • 1.49k • 23

stabilityai/stablelm-2-1_6b

Text Generation • 2B • Updated Jul 10, 2024 • 1.92k • 191

Qwen/Qwen1.5-0.5B

Text Generation • 0.6B • Updated Apr 5, 2024 • 101k • 165

Qwen/Qwen1.5-4B

Text Generation • 4B • Updated Apr 5, 2024 • 16.6k • 35

Qwen/Qwen1.5-14B

Text Generation • 14B • Updated Apr 5, 2024 • 47.6k • 41

afrideva/stablelm-2-1_6b-GGUF

Text Generation • 2B • Updated Jan 22, 2024 • 124 • 3

Qwen/Qwen1.5-72B

Text Generation • 72B • Updated Apr 5, 2024 • 9.61k • 60

Qwen/Qwen1.5-7B-Chat

Text Generation • 8B • Updated Apr 30, 2024 • 34.8k • 176

Qwen/Qwen1.5-14B-Chat

Text Generation • 14B • Updated Apr 30, 2024 • 24.1k • 112

Qwen/Qwen1.5-72B-Chat

Text Generation • 72B • Updated Oct 8, 2024 • 10.6k • 218

Qwen/Qwen1.5-0.5B-Chat

Text Generation • 0.6B • Updated Apr 30, 2024 • 802k • 82

Qwen/Qwen1.5-72B-Chat-AWQ

Text Generation • 12B • Updated Apr 30, 2024 • 1.49k • 24

Qwen/Qwen1.5-14B-Chat-AWQ

Text Generation • 3B • Updated Apr 30, 2024 • 936 • 23

Qwen/Qwen1.5-7B-Chat-AWQ

Text Generation • 2B • Updated Apr 30, 2024 • 1.2k • 13

Qwen/Qwen1.5-4B-Chat-AWQ

Text Generation • 1B • Updated Apr 30, 2024 • 1.51k • 3

Qwen/Qwen1.5-1.8B-Chat-AWQ

Text Generation • 0.8B • Updated Apr 30, 2024 • 62 • 4

Qwen/Qwen1.5-0.5B-Chat-AWQ

Text Generation • 0.4B • Updated Apr 30, 2024 • 132 • 7

Qwen/Qwen1.5-72B-Chat-GGUF

Text Generation • 72B • Updated Apr 9, 2024 • 92 • 64

Qwen/Qwen1.5-7B-Chat-GGUF

Text Generation • 8B • Updated Apr 9, 2024 • 4.83k • 68

Qwen/Qwen1.5-14B-Chat-GGUF

Text Generation • 14B • Updated Apr 9, 2024 • 1.19k • 66

Qwen/Qwen1.5-0.5B-Chat-GGUF

Text Generation • 0.6B • Updated Apr 9, 2024 • 4.34k • 31

Qwen/Qwen1.5-1.8B-Chat-GGUF

Text Generation • 2B • Updated Apr 9, 2024 • 2.5k • 18

Qwen/Qwen1.5-4B-Chat-GGUF

Text Generation • 4B • Updated Apr 9, 2024 • 1.05k • 13

X-D-Lab/MindChat-Qwen2-4B

Text Generation • 4B • Updated Feb 4, 2024 • 7 • 5

X-D-Lab/MindChat-Qwen2-0_5B

Text Generation • 0.6B • Updated Feb 4, 2024 • 68 • 2