Arxiver-Llama-GGUF

GGUF-format conversions of Arxiver-Llama for local inference with llama.cpp; see the usage sketch after the version list below.

Available Versions

  • Arxiver-Llama-8.0B-F32.gguf (30GB) - 32-bit float, highest precision
  • Arxiver-Llama-8.0B-F16.gguf (15GB) - 16-bit float, good balance of size and precision
  • Arxiver-Llama-8.0B-BF16.gguf (15GB) - bfloat16 (brain float), an F16 alternative with F32's dynamic range
  • Arxiver-Llama-8.0B-Q8_0.gguf (4.9GB) - 8-bit quantized, the most memory-efficient of the four
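
Usage

A minimal sketch of loading the Q8_0 file with llama-cpp-python (the Python bindings for llama.cpp). The repo id and filename come from the list above; n_ctx and the prompt are illustrative assumptions, so adjust them to your hardware and task.

# A minimal sketch, assuming llama-cpp-python and huggingface_hub are installed
# (pip install llama-cpp-python huggingface_hub).
from llama_cpp import Llama

# from_pretrained downloads the GGUF file from the Hugging Face Hub on first
# call and caches it locally; extra keyword arguments are passed through to
# the Llama constructor.
llm = Llama.from_pretrained(
    repo_id="real-jiakai/Arxiver-Llama-GGUF",
    filename="Arxiver-Llama-8.0B-Q8_0.gguf",
    n_ctx=4096,  # assumed context window; lower it if memory is tight
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the transformer architecture in two sentences."}],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])

The same .gguf files should also work with the llama.cpp command-line tools by passing the downloaded file path via the -m flag.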

License

This model is licensed under the MIT License.

Citation

If you use this model in your work, please cite it as:

@misc{Arxiver-Llama-GGUF,
  author = {real-jiakai},
  title = {Arxiver-Llama-GGUF},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/real-jiakai/Arxiver-Llama-GGUF}
}