Can't HF transformers be used for inference?

#11
by haili-tian

In your model card, the recommended inference engines are vLLM and mistral-inference, and HF transformers is not included.

Does that mean HF transformers cannot be used to run inference with this model, or at least cannot fully support its features?
E.g., no implementation of interleaved sliding-window attention can be found in the latest transformers release.
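For now I can run it through vLLM as the card recommends. A minimal sketch, assuming a Mistral-format checkpoint (the model id below is a placeholder, and the mistral-format flags follow the usual instructions on Mistral model cards, so check the card for the exact invocation):

```python
from vllm import LLM, SamplingParams

# Placeholder model id -- substitute the actual repository from the model card.
MODEL_ID = "mistralai/<model-id>"

# Mistral cards typically recommend loading with the mistral-native
# tokenizer/config/weight formats.
llm = LLM(
    model=MODEL_ID,
    tokenizer_mode="mistral",
    config_format="mistral",
    load_format="mistral",
)

messages = [
    {"role": "user", "content": "Summarize sliding-window attention in one sentence."}
]
outputs = llm.chat(messages, SamplingParams(temperature=0.15, max_tokens=128))
print(outputs[0].outputs[0].text)
```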

For anyone looking this up: interleaved attention is being implemented for Mistral in transformers here: https://github.com/huggingface/transformers/pull/39799
It will probably land in v4.55.
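Once that PR is in a release, loading through the standard transformers API should work. A hedged sketch of what that would look like (the model id is a placeholder; on versions without the PR, sliding-window layers may not use the interleaved pattern, so long-context outputs could be wrong):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model id -- substitute the actual repository.
MODEL_ID = "mistralai/<model-id>"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights; adjust to the card
    device_map="auto",
)

messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    out = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens.
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```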
