Disable thinking mode?
Is there a special token to disable thinking? I'm using the MLX version if that matters
I'm sorry, I'm useless to you since I don't use MLX and can't run this yet... but I wanted to say thank you for making me spit my coffee out laughing at what looked like a request for a "Disabled thinking mode."
Yes, please check our chat template.
Thanks, so if I understand correctly: either write /nothink, or use enable_thinking in the template if the inference library supports it?
https://huggingface.co/zai-org/GLM-4.5-Air/blob/main/chat_template.jinja#L47
@AbyssianOne haha, the irony of being too autistic to notice 😅 or maybe just the temporary disability of being too tired...
Yes, vLLM and sglang support the enable_thinking param; check our GitHub.
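For anyone else landing here, a minimal sketch of the two options discussed above: appending /nothink to the user turn, or passing enable_thinking=False through chat_template_kwargs on servers that support it (vLLM, sglang). The helper name and payload shape here are my own illustration, not an official API; check the GLM docs/GitHub for your server version.

```python
# Sketch of an OpenAI-style chat request for GLM-4.5 with thinking
# disabled. build_request is a hypothetical helper for illustration.
def build_request(prompt: str, thinking: bool = True) -> dict:
    payload = {
        "model": "zai-org/GLM-4.5-Air",
        "messages": [{"role": "user", "content": prompt}],
    }
    if not thinking:
        # Option 1 (template token): append /nothink to the user message.
        payload["messages"][-1]["content"] += " /nothink"
        # Option 2 (library param): vLLM/sglang forward this dict into the
        # Jinja chat template, where enable_thinking is checked.
        payload["chat_template_kwargs"] = {"enable_thinking": False}
    return payload

req = build_request("Explain quicksort.", thinking=False)
print(req["chat_template_kwargs"])  # {'enable_thinking': False}
```

In practice you would send this payload via an OpenAI-compatible client (e.g. `extra_body={"chat_template_kwargs": {"enable_thinking": False}}`); one of the two mechanisms is usually enough on its own.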
Thanks a lot! I'm GPU poor, so only llama.cpp and mlx-lm (via LM Studio currently) for me 😅
But also have to say this model is an absolute sweet spot for people with more powerful Macs, I'm getting 20 tokens / sec on my M2 Max laptop with the 4bit quant, so really grateful for your work!
When I was testing GLM-4.5 with the "" tag, I accidentally discovered that it also has the effect of turning off thinking mode. We know that if this tag is fed to DeepSeek's or Qwen's thinking models, they will often output strange things.