Has anyone ever gotten this to work?

#178
by cob05

Do I just have bad luck? I've tried a bunch of repos (most recently THUDM/SWE-Dev-9B) and have always had it error out at some point.

Well, I reported exactly when the error happens, and also noted that it worked in the past:
https://huggingface.co/spaces/ggml-org/gguf-my-repo/discussions/158

But people keep opening new discussions or posting new comments instead of upvoting it, so this place has become a mess. You, for example, don't even say what your error is, so I have to guess it's the same one I already reported.

I guess the project has been abandoned, since it hasn't been fixed by now.

For those who need features like local Windows support, lower-bit IQ quants, and a download-before-upload workflow, I've created an enhanced fork of this script.

You can find it here: https://huggingface.co/spaces/Fentible/gguf-repo-suite

Clone the repo to your own HF Space or locally using the Quick Start guides.
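If you'd rather script it than click through the UI, here is a minimal sketch of duplicating the Space to your own account with huggingface_hub. It assumes you're already logged in (huggingface-cli login); the destination name is a placeholder, not part of the original instructions.

```python
# Minimal sketch: duplicate the Space to your own account.
# Assumes huggingface_hub is installed and you are logged in;
# "your-username" is a placeholder destination.
from huggingface_hub import duplicate_space

repo = duplicate_space(
    "Fentible/gguf-repo-suite",             # source Space from this thread
    to_id="your-username/gguf-repo-suite",  # placeholder destination
    private=True,
)
print(repo)  # URL of the newly created Space
```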

I could not get it to work on free HF Spaces, but it might be possible with a paid Space. I tested on Windows 10 and made some quants of mlabonne's abliterated Gemma 3.

The bug: ggml-rpc.dll is very finicky, and you may need to compile your own build of llama-imatrix to fix it.
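If you hit that, a quick smoke test of your own llama-imatrix build can tell you whether the DLL is actually the problem. A hedged sketch follows; the binary path, model, and calibration file names are all assumptions for illustration.

```python
# Sketch: smoke-test a locally compiled llama-imatrix build on Windows.
# All paths and file names below are assumptions, not from the original post.
import subprocess

result = subprocess.run(
    [
        r"llama.cpp\build\bin\llama-imatrix.exe",  # your own build
        "-m", "model-f16.gguf",   # fp16 GGUF of the model
        "-f", "calibration.txt",  # calibration text for the imatrix
        "-o", "imatrix.dat",      # output importance matrix
    ],
    capture_output=True,
    text=True,
)
print(result.returncode)
print(result.stderr[-2000:])  # any ggml-rpc.dll complaint lands here
```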

Offline mode is needed for 27B+ models.

Worked fine for me; I now have a Q8_0 copy of Pixtral 12B Lumimaid.

From https://huggingface.co/mrcuddle/Lumimaid-v0.2-12B-Pixtral to https://huggingface.co/Koitenshin/Lumimaid-v0.2-12B-Pixtral-Q8_0-GGUF

Did every quant option available using this space, in just a couple of minutes; they're now at https://huggingface.co/Koitenshin/Lumimaid_VISION-v0.2-12B-Pixtral-GGUF. No mucking about with setting up my own environment, compiling llama.cpp, etc.
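For anyone curious what the Space is automating, here is a rough sketch of the equivalent local pipeline: download the checkpoint, convert it to an fp16 GGUF, then quantize. The repo name comes from this thread; the output file names and a locally built llama.cpp are assumptions.

```python
# Sketch of what the Space does for you: download, convert, quantize.
# Assumes llama.cpp is cloned and built locally; output paths are illustrative.
import subprocess
from huggingface_hub import snapshot_download

# Download the source checkpoint from the Hub.
model_dir = snapshot_download(
    "mrcuddle/Lumimaid-v0.2-12B-Pixtral",
    local_dir="Lumimaid-v0.2-12B-Pixtral",
)

# Convert the HF checkpoint to an fp16 GGUF.
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", model_dir,
     "--outfile", "lumimaid-f16.gguf", "--outtype", "f16"],
    check=True,
)

# Quantize the fp16 GGUF down to Q8_0 (repeat for other quant types).
subprocess.run(
    ["llama.cpp/build/bin/llama-quantize",
     "lumimaid-f16.gguf", "lumimaid-Q8_0.gguf", "Q8_0"],
    check=True,
)
```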

Another attempt, another failure...

Error converting to fp16: INFO:hf-to-gguf:Loading model: granite-vision-3.3-2b-embedding
WARNING:hf-to-gguf:Failed to load model config from downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding: The repository downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding contains custom code which must be executed to correctly load the model. You can inspect the repository content at /home/user/app/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding .
 You can inspect the repository content at https://hf.co/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Model architecture: GraniteForCausalLM
WARNING:hf-to-gguf:Failed to load model config from downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding: The repository downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding contains custom code which must be executed to correctly load the model. You can inspect the repository content at /home/user/app/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding .
 You can inspect the repository content at https://hf.co/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00003.safetensors'
Traceback (most recent call last):
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 8595, in <module>
    main()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 8589, in main
    model_instance.write()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 410, in write
    self.prepare_tensors()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 2126, in prepare_tensors
    super().prepare_tensors()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 277, in prepare_tensors
    for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 2036, in modify_tensors
    n_head = self.hparams["num_attention_heads"]
             ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'num_attention_heads'
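That KeyError means the converter fell back to the raw config.json (see the warnings above), and the top-level config for this vision/embedding wrapper doesn't carry num_attention_heads. A small diagnostic sketch follows; the nested text_config key is an assumption about how such multimodal configs are typically laid out, so check the printed keys to confirm.

```python
# Sketch: check where num_attention_heads actually lives in the config.
# The "text_config" key is an assumption about this wrapper's layout.
import json

with open("granite-vision-3.3-2b-embedding/config.json") as f:
    cfg = json.load(f)

print(sorted(cfg.keys()))  # top-level keys the converter sees

hparams = cfg.get("text_config", cfg)  # fall back to the top level
print(hparams.get("num_attention_heads"))  # None here reproduces the KeyError
```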
