Nemotron H models!
please update your llama.cpp
https://github.com/ggml-org/llama.cpp/pull/15507
then please process as many models as possible from
https://huggingface.co/collections/nvidia/nvidia-nemotron-689f6d6e6ead8e77dd641615
and
https://huggingface.co/collections/nvidia/nemotron-h-67fd3d7ca332cdf1eb5a24bb
have fun! :)
@mradermacher I updated our llama.cpp fork, so please queue all the above-mentioned models once you've updated. The timing for this was super unlucky: we had already upgraded llama.cpp just hours before this got merged.
all non-FP8 repos queued (at -1900)
i see, one of them is the broken NVIDIA-Nemotron-Nano-9B-v2, the one that got me the most arrogant reply from nvidia I've ever received, and I vowed not to quantize it. Let me do some checks.
yup, they all have the same bug. I will not lift a finger for these assholes, so, no.
Basically, I reported a syntax error in their config.json, and an nvidia guy investigated it thoroughly, admitted that it's not a valid json file, and then said he doesn't care to fix it. Checking in the fix would have taken them a fraction of the time it took to quote the RFC at me. And since their refusal means I'd have to fix every single one of these configs myself, I am also too lazy.
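For reference, the check itself is trivial; here's a minimal sketch of the kind of validation involved (I'm not reproducing the actual Nemotron config or its exact syntax error here, and the trailing-comma repair below is just an assumed example of a common json bug, not the confirmed cause):

```python
#!/usr/bin/env python3
# Minimal sketch: report whether a config.json is valid strict JSON.
# The exact error in the Nemotron configs is not shown here; the
# trailing-comma repair is an assumption about one common cause.
import json
import re
import sys

def check(path: str) -> None:
    with open(path, encoding="utf-8") as f:
        text = f.read()
    try:
        json.loads(text)
        print(f"{path}: valid JSON")
        return
    except json.JSONDecodeError as e:
        print(f"{path}: invalid JSON at line {e.lineno}, column {e.colno}: {e.msg}")
    # Hypothetical repair: strip trailing commas before } or ] and retry.
    # (Naive: a comma inside a string literal would also be touched.)
    repaired = re.sub(r",\s*([}\]])", r"\1", text)
    try:
        json.loads(repaired)
        print(f"{path}: parses after stripping trailing commas")
    except json.JSONDecodeError:
        print(f"{path}: still invalid, needs a manual fix")

if __name__ == "__main__":
    for p in sys.argv[1:] or ["config.json"]:
        check(p)
```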
Sorry.