92 209

Arunkumar Venkataramanan

ArunkumarVR

https://arunkumarramanan.github.io

AI & ML interests

AGI Research: Reasoning, Safety & Alignment (Superalignment), Generative AI (GenAI), Multi-Modal Foundation Models (FMs), Large Language Models (LLMs), Transformers & Diffusion Models, Open LLM Training, Optimization & Finetuning, Serving & Inference

Recent Activity

liked a model about 1 month ago

xai-org/grok-2

upvoted an article about 2 months ago

Welcome GPT OSS, the new open-source model family from OpenAI!

upvoted an article about 2 months ago

Article

Welcome GPT OSS, the new open-source model family from OpenAI!

By … and 11 others • Aug 5 • 495

upvoted a collection about 2 months ago

gpt-oss

Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated Aug 7 • 355

upvoted 4 collections 3 months ago

Kimi-K2

Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 3 items • Updated 23 days ago • 125

Reasoning datasets

24 items • Updated May 22 • 5

SmolLM3 evaluation datasets

Datasets to decontaminate the post-training mixtures against. Use the subset and column values described per entry • 13 items • Updated Jul 8 • 5

SmolLM3 pretraining datasets

Datasets used in SmolLM3 pretraining • 15 items • Updated Aug 12 • 31

upvoted an article 3 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

By … and 22 others • Jul 8 • 685

upvoted a paper 5 months ago

BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published Apr 16 • 74

upvoted 2 collections 6 months ago

Llama 4

Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated Aug 21 • 49

Llama 4

Llama 4 release • 13 items • Updated Apr 29 • 626

upvoted a paper 6 months ago

Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26 • 164

upvoted a collection 7 months ago

Google's Gemma models family

328 items • Updated 15 days ago • 508

upvoted an article 7 months ago

Article

Open R1: Update #3

By … and 9 others • Mar 11 • 295

upvoted 3 collections 7 months ago

Gemma 3 Release

28 items • Updated Aug 11 • 508

QwQ

Qwen with Questions • 6 items • Updated Jul 21 • 98

Inference Optimized Checkpoints (with Model Optimizer)

A collection of generative models quantized and optimized for inference with TensorRT Model Optimizer. • 43 items • Updated about 23 hours ago • 35

upvoted a paper 7 months ago

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7 • 149

upvoted 3 collections 8 months ago

RLHFlow MATH Process Reward Model

This is a collection of datasets and models of process reward modeling. • 15 items • Updated Nov 9, 2024 • 11

Skywork-o1-Open

Skywork o1 open model collections • 3 items • Updated Jun 12 • 21

Qwen2.5-Math

Math-specific model series based on Qwen2.5 • 11 items • Updated Jul 21 • 84