kernels-community

Team

community

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

pcuenq new activity about 3 hours ago

kernels-community/vllm-flash-attn3:Not able to find the compatible kernel

marcsun13 published a model about 4 hours ago

kernels-community/triton_kernels

pcuenq new activity about 13 hours ago

kernels-community/vllm-flash-attn3:attention sinks & backward

pcuenq

in kernels-community/vllm-flash-attn3 about 3 hours ago

Not able to find the compatible kernel

#4 opened about 10 hours ago by

rom7

marcsun13

published a model about 4 hours ago

kernels-community/triton_kernels

Updated 1 day ago • 1

pcuenq

in kernels-community/vllm-flash-attn3 about 13 hours ago

attention sinks & backward

#3 opened 1 day ago by

acforvs

Support for sm120?

#2 opened 1 day ago by

Enigrand

drbh

updated a model 1 day ago

kernels-community/flash-attn

Updated 28 days ago • 11

pcuenq

published a model 1 day ago

kernels-community/vllm-flash-attn3

Updated 1 day ago • 10

marcsun13

updated a model 1 day ago

kernels-community/triton_kernels

Updated 1 day ago • 1

pcuenq

in kernels-community/vllm-flash-attn3 1 day ago

README for vllm-flash-attn3

#1 opened 1 day ago by

pcuenq

drbh

updated 2 models 1 day ago

kernels-community/activation

Updated 15 days ago • 4

kernels-community/adam-atan2

Updated 6 days ago

pcuenq

updated a model 1 day ago

kernels-community/vllm-flash-attn3

Updated 1 day ago • 10

drbh

published a model 6 days ago

kernels-community/adam-atan2

Updated 6 days ago

drbh

updated a model 7 days ago

kernels-community/mamba-ssm

Updated 7 days ago

danieldk

posted an update 22 days ago

Post

1957

kernels 0.8.0 is out: https://github.com/huggingface/kernels/releases/tag/v0.8.0

This release refines kernel selection in the kernelize function:

• You can now register kernels for certain CUDA capability ranges.
• Rather than requiring an exact match of modes, kernelize now falls back to other compatible modes. If you are kernelizing for inference but have only registered a training + torch.compile kernel, it will use that kernel, since it is compatible with inference as well.
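The two selection rules above can be illustrated with a small, self-contained sketch. This is not the kernels library's implementation or API: `Mode`, `Registration`, `is_compatible`, and `select` are hypothetical names, the repo id is made up, and encoding CUDA capability as an integer (for example 80 or 90) is an assumption made only for this illustration.

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import Optional, Sequence


class Mode(Flag):
    # Hypothetical mode flags, loosely mirroring the modes named in the post.
    INFERENCE = auto()
    TRAINING = auto()
    TORCH_COMPILE = auto()


@dataclass(frozen=True)
class Registration:
    # Hypothetical record: a kernel repo registered for a mode and an
    # inclusive CUDA capability range (integer encoding is an assumption).
    repo_id: str
    mode: Mode
    min_capability: int = 0
    max_capability: int = 10_000


def is_compatible(registered: Mode, requested: Mode) -> bool:
    # Assumed rule: a registered kernel can serve a request if it provides
    # every "extra" requirement of the request (training, torch.compile).
    # Plain inference has no extra requirement, so any kernel can serve it.
    needs = requested & (Mode.TRAINING | Mode.TORCH_COMPILE)
    return (registered & needs) == needs


def select(
    registrations: Sequence[Registration], requested: Mode, capability: int
) -> Optional[Registration]:
    # Keep only kernels whose capability range covers the current device.
    in_range = [
        r for r in registrations
        if r.min_capability <= capability <= r.max_capability
    ]
    # Prefer an exact mode match, then fall back to any compatible mode.
    for r in in_range:
        if r.mode == requested:
            return r
    for r in in_range:
        if is_compatible(r.mode, requested):
            return r
    return None


# Only a training + torch.compile kernel is registered, for capability 80..90.
regs = [
    Registration("example/train-compile-kernel",
                 Mode.TRAINING | Mode.TORCH_COMPILE, 80, 90),
]
chosen = select(regs, Mode.INFERENCE, capability=90)
```

Here `chosen` is the training + torch.compile registration, because it is compatible with an inference request. A device outside the registered capability range gets no kernel from `select`, and a caller could then keep the original layer implementation.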

1 reply

danieldk

posted an update 26 days ago

Post

419

You can get flash-attention 3 ⚡️ directly from the hub now using kernels!

kernels-community/flash-attn3

danieldk

posted an update 26 days ago

Post

350

Kernels 0.7.0 is out: https://github.com/huggingface/kernels/releases/tag/v0.7.0 🚀

This release makes it possible to register multiple kernels for a layer. Do you have a super-fast kernel for inference and another kernel for training? Register them both and kernelize will pick the kernel depending on whether you are going to do training or inference.

reach-vb

posted an update about 2 months ago

Post

3671

Excited to onboard FeatherlessAI on Hugging Face as an Inference Provider - they bring a fleet of 6,700+ LLMs on-demand on the Hugging Face Hub 🤯

Starting today, you'll be able to access all those LLMs (OpenAI compatible) on HF model pages and via OpenAI client libraries too! 💥

Go, play with it today: https://huggingface.co/blog/inference-providers-featherless

P.S. They're also bringing on more GPUs to support all your concurrent requests!

Narsil

posted an update about 2 months ago

Post

1836

Me: This function is too slow. Find a faster algorithm.
Cursor: Hold my beer.

Me: *Slacking off with colleagues*
Cursor: Ping.

Me: 🤯

danieldk

posted an update 2 months ago

Post

1783

We have been working on a project called kernels. kernels makes it possible to load compute kernels directly from the Hub! 🚀

We plan to give kernels a proper introduction soon. But for those who have been following along, we are happy to announce a new release:

- New layer API with torch.compile support.
- Experimental support for loading Apple Silicon Metal 🤘 Kernels.
- Generate wheels from Hub kernels for legacy deployments.

Full release notes here: https://github.com/huggingface/kernels/releases/tag/v0.6.0

reach-vb

posted an update 3 months ago

Post

4209

hey hey @mradermacher - VB from Hugging Face here, we'd love to onboard you over to our optimised xet backend! 💥

as you know we're in the process of upgrading our storage backend to xet (which helps us scale and offer blazingly fast upload/download speeds too): https://huggingface.co/blog/xet-on-the-hub and now that we are certain that the backend can scale with even big models like Llama 4/Qwen 3, we're moving to the next phase of inviting impactful orgs and users on the hub over. as you are a big part of the open source ML community, we would love to onboard you next and create some excitement about it in the community too!

in terms of actual steps - it should be as simple as one of the org admins joining hf.co/join/xet - we'll take care of the rest.

p.s. you'd need to have the latest hf_xet version of the huggingface_hub lib but everything else should be the same: https://huggingface.co/docs/hub/storage-backends#using-xet-storage

p.p.s. this is fully backwards compatible so everything will work as it should! 🤗

16 replies

Team members 15