Has anyone ever gotten this to work?
Do I just have bad luck? I've tried a bunch of repos (most recently THUDM/SWE-Dev-9B) and have always had it error out at some point.
Well, I reported exactly when the error happens, and also noted that it used to work, here:
https://huggingface.co/spaces/ggml-org/gguf-my-repo/discussions/158
But people keep opening new discussions or leaving new comments instead of upvoting it, so this place has become a mess. You, for example, don't even say what your error is, so I have to guess it's the same one I already reported.
If it hasn't been fixed by now, I guess the project is abandoned.
For those who need features like local Windows support, lower-bit IQ quants, and a download-before-upload workflow, I've created an enhanced fork of this script.
You can find it here: https://huggingface.co/spaces/Fentible/gguf-repo-suite
Clone the repo to your own HF Space or locally using the Quick Start guides.
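If it helps, here is a minimal sketch of duplicating the Space to your own account with huggingface_hub (the to_id target is a placeholder; you need to be logged in or pass a token):

```python
# Minimal sketch: duplicate the Space programmatically with huggingface_hub.
# Assumes you're authenticated (huggingface-cli login) or pass token=...
from huggingface_hub import duplicate_space

repo_url = duplicate_space(
    "Fentible/gguf-repo-suite",
    to_id="your-username/gguf-repo-suite",  # placeholder target repo id
    private=True,                           # keep your copy private if desired
)
print(repo_url)
```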
I could not get it to work on free HF Spaces, but it might be possible with a paid Space. I tested on Windows 10 and made some quants for mlabonne's abliterated Gemma 3.
The bug: ggml-rpc.dll is very finicky, and fixing it may require compiling your own build of llama-imatrix.
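If you do end up rebuilding it, the usual llama.cpp CMake flow looks roughly like the sketch below, wrapped in Python so it's a single runnable snippet (the flags are typical defaults, not a definitive recipe; check llama.cpp's build docs for your platform):

```python
# Rough sketch: clone llama.cpp and build only the llama-imatrix tool.
# Assumes git and CMake are on PATH; flags are typical, not definitive.
import subprocess

def run(*cmd: str) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

run("git", "clone", "https://github.com/ggml-org/llama.cpp")
run("cmake", "-S", "llama.cpp", "-B", "llama.cpp/build")
run("cmake", "--build", "llama.cpp/build", "--config", "Release",
    "--target", "llama-imatrix")
# The binary ends up under llama.cpp/build/bin/ (exact path varies by generator).
```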
The offline (download-before-upload) workflow is needed for 27B+ models.
Worked fine for me; I now have a Q8_0 copy of Pixtral 12B Lumimaid.
From https://huggingface.co/mrcuddle/Lumimaid-v0.2-12B-Pixtral to https://huggingface.co/Koitenshin/Lumimaid-v0.2-12B-Pixtral-Q8_0-GGUF
I ran every quant option available in this space, each in just a couple of minutes; they're now available at https://huggingface.co/Koitenshin/Lumimaid_VISION-v0.2-12B-Pixtral-GGUF. No mucking about with setting up my own environment, compiling llama.cpp, etc.
Another attempt, another failure...
Error converting to fp16: INFO:hf-to-gguf:Loading model: granite-vision-3.3-2b-embedding
WARNING:hf-to-gguf:Failed to load model config from downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding: The repository downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding contains custom code which must be executed to correctly load the model. You can inspect the repository content at /home/user/app/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding .
You can inspect the repository content at https://hf.co/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Model architecture: GraniteForCausalLM
WARNING:hf-to-gguf:Failed to load model config from downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding: The repository downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding contains custom code which must be executed to correctly load the model. You can inspect the repository content at /home/user/app/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding .
You can inspect the repository content at https://hf.co/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00003.safetensors'
Traceback (most recent call last):
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 8595, in <module>
    main()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 8589, in main
    model_instance.write()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 410, in write
    self.prepare_tensors()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 2126, in prepare_tensors
    super().prepare_tensors()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 277, in prepare_tensors
    for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 2036, in modify_tensors
    n_head = self.hparams["num_attention_heads"]
             ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'num_attention_heads'
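For anyone hitting the same custom-code warning: the converter falls back to reading config.json, but if you want to load the repo yourself, the fix is the trust_remote_code flag the log mentions. A minimal sketch with transformers (the repo id is an assumption; substitute the model you're converting), and only do this for repos whose code you trust:

```python
# Minimal sketch: load a model whose repo ships custom code, as the
# warning suggests. trust_remote_code=True executes Python from the repo,
# so only use it for sources you trust.
from transformers import AutoConfig, AutoModel

repo_id = "ibm-granite/granite-vision-3.3-2b-embedding"  # assumed repo id

cfg = AutoConfig.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
print(type(cfg).__name__, type(model).__name__)
```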
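As for the KeyError itself: the converter picked the GraniteForCausalLM path, but the top level of this repo's config.json apparently doesn't carry num_attention_heads; in multimodal and embedding repos those hyperparameters often sit in a nested sub-config. A quick way to check (the nested key names below are guesses, not this repo's documented layout):

```python
# Sketch: find where num_attention_heads actually lives in config.json.
# The nested key names below are guesses common to multimodal configs.
import json

with open("config.json") as f:
    cfg = json.load(f)

print("top-level keys:", sorted(cfg))
for sub in ("text_config", "llm_config", "language_config"):
    if isinstance(cfg.get(sub), dict):
        print(f"{sub}.num_attention_heads =", cfg[sub].get("num_attention_heads"))
```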