deepseek-ai/DeepSeek-R1-Distill-Llama-8B


[Possible bug] Tokenizer removes thinking part

#31

by haritzpuerto - opened 28 days ago


Hi, I just noticed that when you apply the chat template to a conversation with a thinking part and tokenize it, the thinking part is removed and only the final answer is kept. I don't think this is the expected behavior (why remove the thinking part?).

Minimal reproducible example
https://colab.research.google.com/drive/1VAU_XIxaAdooXQ_DpoOL0cx-MGNV1ZgN?usp=sharing

I believe the problem comes from this line in the chat template.

Is this a bug? If not, how can I keep the reasoning traces (i.e., the thinking part)? Thanks!
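For anyone who can't open the Colab, the effect can be imitated in plain Python. This is only a sketch under an assumption: that the template keeps whatever follows the last `</think>` in an assistant message. `render_assistant` is a made-up helper for illustration, not part of the tokenizer or the template.

```python
# Hypothetical stand-in for how the chat template seems to treat assistant turns.
# Assumption: only the text after the last "</think>" is kept.
THINK_END = "</think>"


def render_assistant(content: str) -> str:
    """Return the assistant text as it would end up in the rendered prompt."""
    if THINK_END in content:
        # Drop everything up to and including the last closing think tag.
        content = content.split(THINK_END)[-1]
    return content


reply = "<think>\n2 + 2 is 4.\n</think>\n\nThe answer is 4."
stripped = render_assistant(reply)

print(repr(reply))     # original message, reasoning included
print(repr(stripped))  # what survives under the assumed template logic
```

If the real template behaves this way, the reasoning trace never reaches the token IDs, which matches what the notebook shows.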

28 days ago

@mgubri thinks this behavior might be there for efficiency in multi-turn conversations. However, in my case I am not building a multi-turn conversation. I want to create fine-tuning data that follows the original chat template, so I need to keep the reasoning traces. Hence, I think it would be useful to have a parameter that keeps them.
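Until such a parameter exists, one possible workaround is to patch a copy of the template string and use that copy when rendering. The sketch below is hedged heavily: the exact wording of the stripping statement is my assumption, `keep_reasoning_template` is a hypothetical helper, and the toy template is only there to exercise it.

```python
import re

# Assumed form of the stripping statement; the real template may be worded differently.
STRIP_STMT = re.compile(
    r"\{%-?\s*set\s+content\s*=\s*content\.split\('</think>'\)\[-1\]\s*-?%\}"
)


def keep_reasoning_template(template: str) -> str:
    """Return a copy of a Jinja chat template without the assumed stripping statement."""
    patched, count = STRIP_STMT.subn("", template)
    if count == 0:
        # Fail loudly instead of silently returning an unpatched template.
        raise ValueError("stripping statement not found; inspect the template by hand")
    return patched


# Toy template fragment, not the model's actual template.
toy = (
    "{% if '</think>' in content %}"
    "{% set content = content.split('</think>')[-1] %}"
    "{% endif %}{{ content }}"
)
patched = keep_reasoning_template(toy)
print(patched)
```

The patched string could then be passed via the `chat_template` argument of `tokenizer.apply_chat_template`, which overrides the stored template for that call only, so multi-turn inference with the default template is unaffected.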
