# Arxiver-Llama-GGUF

GGUF-format builds of Arxiver-Llama for inference with llama.cpp.
## Available Versions
- `Arxiver-Llama-8.0B-F32.gguf` (30 GB) - 32-bit float, highest precision
- `Arxiver-Llama-8.0B-F16.gguf` (15 GB) - 16-bit float, good balance of precision and size
- `Arxiver-Llama-8.0B-BF16.gguf` (15 GB) - bfloat16, alternative to F16
- `Arxiver-Llama-8.0B-Q8_0.gguf` (4.9 GB) - 8-bit quantized, memory efficient
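The card only states that these files target llama.cpp, so the following is a minimal sketch using the llama-cpp-python bindings (an assumption, not something this card prescribes); the `n_ctx` and `n_gpu_layers` values are illustrative defaults, not recommendations:

```python
# Sketch: download the Q8_0 build from this repo and run a chat completion.
# Requires: pip install llama-cpp-python huggingface_hub
from llama_cpp import Llama

# from_pretrained fetches the named GGUF file from the Hub, then loads it.
llm = Llama.from_pretrained(
    repo_id="real-jiakai/Arxiver-Llama-GGUF",
    filename="Arxiver-Llama-8.0B-Q8_0.gguf",  # smallest build listed above
    n_ctx=4096,       # assumed context window; tune for your workload
    n_gpu_layers=-1,  # offload all layers if a GPU-enabled build is installed
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize this arXiv abstract: ..."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```

Q8_0 uses roughly a third of the F16 footprint at a small precision cost; swap `filename` for one of the other builds above if memory allows.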
## License
This model is licensed under the MIT License.
## Citation
If you use this model in your work, please cite it as:
```bibtex
@misc{Arxiver-Llama-GGUF,
  author    = {real-jiakai},
  title     = {Arxiver-Llama-GGUF},
  year      = {2024},
  url       = {https://huggingface.co/real-jiakai/Arxiver-Llama-GGUF},
  publisher = {Hugging Face}
}
```
## Model tree for real-jiakai/Arxiver-Llama-GGUF

- Base model: meta-llama/Meta-Llama-3-8B-Instruct
- Fine-tuned: shenzhi-wang/Llama3-8B-Chinese-Chat