pip install ninja
pip install flash_attn --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
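After running the commands above, a quick sanity check is to confirm the packages are importable before loading the model. This is a minimal sketch using only the Python standard library; the package names checked (`flash_attn`, `ninja`) are taken from the install commands above:

```python
import importlib.util

# Check that each package installed above is importable without actually importing it
# (find_spec returns None when the package is absent).
for pkg in ("ninja", "flash_attn"):
    available = importlib.util.find_spec(pkg) is not None
    print(f"{pkg}: {'ok' if available else 'missing'}")
```

If `flash_attn` reports missing, the most common cause is a failed CUDA extension build; rerunning the `pip install flash_attn --no-build-isolation` step with `ninja` already installed usually resolves it.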