Disable thinking mode?

#3
by daaain - opened

Is there a special token to disable thinking? I'm using the MLX version if that matters

I'm sorry, I'm useless to you since I don't use MLX and can't run this yet... but I wanted to say thank you for making me spit my coffee out laughing at what looked like a request for a "Disabled thinking mode."

Yes, please check our chat template.

daaain changed discussion title from Disabled thinking mode? to Disable thinking mode?

Thanks, so if I understand correctly: either write /nothink in the prompt, or pass enable_thinking through the template if the inference library supports it?

https://huggingface.co/zai-org/GLM-4.5-Air/blob/main/chat_template.jinja#L47

@AbyssianOne haha, the irony of being too autistic to notice 😅 or maybe just the temporary disability of being too tired...

Yes, vLLM and SGLang support the enable_thinking param, check our GitHub.
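For vLLM's OpenAI-compatible server, template kwargs like enable_thinking are typically forwarded via a chat_template_kwargs field in the request body. A hedged sketch, assuming a server on localhost:8000 (check the vLLM/SGLang docs for the exact field your version expects):

```python
import json

# Request body for POST /v1/chat/completions on a vLLM-style server.
# chat_template_kwargs is passed through to the Jinja chat template,
# where the GLM-4.5 template reads enable_thinking.
payload = {
    "model": "zai-org/GLM-4.5-Air",
    "messages": [{"role": "user", "content": "Hello"}],
    "chat_template_kwargs": {"enable_thinking": False},
}
body = json.dumps(payload)

# e.g.:
# requests.post("http://localhost:8000/v1/chat/completions",
#               data=body, headers={"Content-Type": "application/json"})
```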

Thanks a lot! I'm GPU poor, so only llama.cpp and mlx-lm (via LM Studio currently) for me 😅

But also have to say this model is an absolute sweet spot for people with more powerful Macs, I'm getting 20 tokens / sec on my M2 Max laptop with the 4bit quant, so really grateful for your work!

When I was testing GLM-4.5 with the "" tag, I discovered by accident that it disables thinking mode. We know that if this tag is fed to DeepSeek's or Qwen's thinking models, they will often output strange stuff.
