Can't HF transformers be used for inference?

#11
by haili-tian

In your model card, the recommended inference engines are vLLM and mistral-inference, and HF transformers is not included.

Does that mean HF transformers cannot be used to run inference with this model, or at least cannot fully support its features?
E.g., no implementation of interleaved sliding-window attention can be found in the latest transformers release.
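For now I can run it through vLLM as the card recommends. A minimal sketch, assuming a Mistral-format checkpoint (the model id below is a placeholder, and the mistral-format flags follow the usual instructions on Mistral model cards, so check the card for the exact invocation):

```python
from vllm import LLM, SamplingParams

# Placeholder model id -- substitute the actual repository from the model card.
MODEL_ID = "mistralai/<model-id>"

# Mistral cards typically recommend loading with the mistral-native
# tokenizer/config/weight formats.
llm = LLM(
    model=MODEL_ID,
    tokenizer_mode="mistral",
    config_format="mistral",
    load_format="mistral",
)

messages = [
    {"role": "user", "content": "Summarize sliding-window attention in one sentence."}
]
outputs = llm.chat(messages, SamplingParams(temperature=0.15, max_tokens=128))
print(outputs[0].outputs[0].text)
```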

For anyone looking this up: interleaved attention is being implemented for Mistral in transformers here: https://github.com/huggingface/transformers/pull/39799
It will probably land in v4.55.
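Once that PR is in a release, loading through the standard transformers API should work. A hedged sketch of what that would look like (the model id is a placeholder; on versions without the PR, sliding-window layers may not use the interleaved pattern, so long-context outputs could be wrong):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model id -- substitute the actual repository.
MODEL_ID = "mistralai/<model-id>"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights; adjust to the card
    device_map="auto",
)

messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    out = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens.
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```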
