Arxiver-Llama-GGUF

GGUF-format conversions of Arxiver-Llama for local inference with llama.cpp; see the usage sketch after the version list below.

Available Versions

  • Arxiver-Llama-8.0B-F32.gguf (30GB) - 32-bit float, highest precision
  • Arxiver-Llama-8.0B-F16.gguf (15GB) - 16-bit float, good balance of size and precision
  • Arxiver-Llama-8.0B-BF16.gguf (15GB) - bfloat16 (brain float), an F16 alternative with F32's dynamic range
  • Arxiver-Llama-8.0B-Q8_0.gguf (4.9GB) - 8-bit quantized, the most memory-efficient of the four
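
Usage

A minimal sketch of loading the Q8_0 file with llama-cpp-python (the Python bindings for llama.cpp). The repo id and filename come from the list above; n_ctx and the prompt are illustrative assumptions, so adjust them to your hardware and task.

# A minimal sketch, assuming llama-cpp-python and huggingface_hub are installed
# (pip install llama-cpp-python huggingface_hub).
from llama_cpp import Llama

# from_pretrained downloads the GGUF file from the Hugging Face Hub on first
# call and caches it locally; extra keyword arguments are passed through to
# the Llama constructor.
llm = Llama.from_pretrained(
    repo_id="real-jiakai/Arxiver-Llama-GGUF",
    filename="Arxiver-Llama-8.0B-Q8_0.gguf",
    n_ctx=4096,  # assumed context window; lower it if memory is tight
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the transformer architecture in two sentences."}],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])

The same .gguf files should also work with the llama.cpp command-line tools by passing the downloaded file path via the -m flag.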

License

This model is licensed under the MIT License.

Citation

If you use this model in your work, please cite it as:

@misc{Arxiver-Llama-GGUF,
  author = {real-jiakai},
  title = {Arxiver-Llama-GGUF},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/real-jiakai/Arxiver-Llama-GGUF}
}