Eni Grand

Enigrand

AI & ML interests

None yet

Recent Activity

new activity about 12 hours ago

kernels-community/vllm-flash-attn3:Support for sm120?

new activity 4 days ago

Qwen/Qwen3-32B:Will Qwen3-32B be updated just like Qwen3-235B-A22B?

upvoted a paper 5 days ago

Group Sequence Policy Optimization

Organizations

New activity in kernels-community/vllm-flash-attn3 about 12 hours ago

Support for sm120?

#2 opened about 12 hours ago by

Enigrand

New activity in Qwen/Qwen3-32B 4 days ago

Will Qwen3-32B be updated just like Qwen3-235B-A22B?

#40 opened 4 days ago by

Enigrand

upvoted 4 papers 5 days ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published 13 days ago • 267

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1 • 207

MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published 18 days ago • 122

MetaCLIP 2: A Worldwide Scaling Recipe

Paper • 2507.22062 • Published 8 days ago • 22

New activity in tiiuae/Falcon-H1-34B-Instruct 6 days ago

Hi, the licence of this model should be apache 2.0 according to the blog.

#11 opened 6 days ago by

Enigrand

liked a model 6 days ago

deepcogito/cogito-v2-preview-llama-109B-MoE

Image-Text-to-Text • 109B • Updated 6 days ago • 1.05k • 24

upvoted 2 papers 7 days ago

Geometric-Mean Policy Optimization

Paper • 2507.20673 • Published 9 days ago • 30

Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful

Paper • 2507.07101 • Published 28 days ago • 3

liked a model 9 days ago

Wan-AI/Wan2.2-TI2V-5B-Diffusers

Text-to-Video • Updated 9 days ago • 16.8k • 56

liked 4 models 10 days ago

liked 3 models 15 days ago

OmniSVG/OmniSVG

Text Generation • Updated 16 days ago • 3.34k • 142

Qwen/Qwen3-1.7B

Text Generation • 2B • Updated 11 days ago • 1.44M • 222

ibm-ai-platform/micro-g3.3-8b-instruct-1b

1B • Updated Jun 30 • 67.8k • 6

liked a model 18 days ago

MetaStoneTec/MetaStone-S1-32B

33B • Updated about 1 month ago • 35 • 24

commented a paper 18 days ago

Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs

Paper • 2507.07996 • Published 27 days ago • 31
