# qwen2.5-7b-instruct-q4-k-m-gguf

Qwen2.5-7B-Instruct quantized to Q4_K_M (4-bit, medium quality) in GGUF format.
## Quick Start

- Download the model:

  ```bash
  wget https://huggingface.co/your-username/qwen2.5-7b-instruct-q4-k-m-gguf/resolve/main/qwen2.5-7b-instruct-q4-k-m-gguf.gguf
  ```
- Run inference:

  ```bash
  # With llama.cpp
  ./main -m qwen2.5-7b-instruct-q4-k-m-gguf.gguf -n 512
  ```

  ```python
  # With Python (requires the llama-cpp-python package)
  from llama_cpp import Llama

  llm = Llama(model_path="./qwen2.5-7b-instruct-q4-k-m-gguf.gguf")
  print(llm("Hello!", max_tokens=100)["choices"][0]["text"])
  ```
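Qwen2.5-Instruct models expect the ChatML conversation format. llama.cpp's chat and server modes apply the template embedded in the GGUF automatically, but when calling the raw completion API (as in the Python snippet above) you may need to build the prompt yourself. A minimal sketch of that format (the helper function is hypothetical, not part of any library):

```python
# Minimal sketch of the ChatML prompt layout used by Qwen2.5-Instruct.
# build_chatml_prompt is a hypothetical helper, shown for illustration only.
def build_chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # the model completes from here
    )

prompt = build_chatml_prompt("You are a helpful assistant.", "Hello!")
print(prompt)
```

Passing a prompt built this way to `llm(prompt, ...)` generally produces better instruction-following than a bare string, since the model was fine-tuned on this template.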
## Model Information

- Base Model: Qwen2.5-7B-Instruct
- Quantization: Q4_K_M
- File Size: 4.4 GB
- Format: GGUF
## Performance

Q4_K_M offers a good trade-off between output quality and inference speed, and is a commonly recommended default among 4-bit GGUF quantizations.
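As a rough sketch of memory requirements (assuming Qwen2.5-7B's published architecture: 28 layers, 4 KV heads under grouped-query attention, head dimension 128, and an fp16 KV cache), total RAM is approximately the weight file plus the KV cache for the chosen context length:

```python
# Back-of-the-envelope RAM estimate: quantized weights + fp16 KV cache.
# The architecture numbers below are assumptions taken from Qwen2.5-7B's config.
n_layers, n_kv_heads, head_dim = 28, 4, 128
n_ctx = 4096                       # context window
bytes_per_elem = 2                 # fp16 KV cache
# 2x for keys and values:
kv_cache_gb = 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem / 1024**3
model_gb = 4.4                     # Q4_K_M file size listed above
print(f"KV cache: {kv_cache_gb:.2f} GB; total: ~{model_gb + kv_cache_gb:.1f} GB")
```

This is an estimate only; actual usage also includes compute buffers and runtime overhead, and grows with longer context windows.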