Collections including paper arxiv:2504.03553

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 8 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 24
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 84
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 152
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 24

VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

Paper • 2504.05118 • Published Apr 7 • 26
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models

Paper • 2504.04718 • Published Apr 7 • 41
SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

Paper • 2504.03561 • Published Apr 4 • 18
Concept Lancet: Image Editing with Compositional Representation Transplant

Paper • 2504.02828 • Published Apr 3 • 17

Agentic Knowledgeable Self-awareness

Paper • 2504.03553 • Published Apr 4 • 27
Benchmarking LLMs' Swarm intelligence

Paper • 2505.04364 • Published May 7 • 20
Multi-Agent System for Comprehensive Soccer Understanding

Paper • 2505.03735 • Published May 6 • 23

R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization

Paper • 2503.10615 • Published Mar 13 • 17
UniGoal: Towards Universal Zero-shot Goal-oriented Navigation

Paper • 2503.10630 • Published Mar 13 • 6
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 35
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL

Paper • 2503.07536 • Published Mar 10 • 89

Collected AI papers of interest

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published Feb 20 • 193
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published Feb 20 • 106
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

Paper • 2502.14502 • Published Feb 20 • 91
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

Paper • 2502.14282 • Published Feb 20 • 27

a collection of algorithmic agents for user interfaces/interactions, program synthesis, and robotics

about 4 hours ago

End-to-End Goal-Driven Web Navigation

Paper • 1602.02261 • Published Feb 6, 2016
Learning Language Games through Interaction

Paper • 1606.02447 • Published Jun 8, 2016
Naturalizing a Programming Language via Interactive Learning

Paper • 1704.06956 • Published Apr 23, 2017
Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration

Paper • 1802.08802 • Published Feb 24, 2018 • 1

LangModels-Advances-2025

Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models

Paper • 2504.04823 • Published Apr 7 • 31
Agentic Knowledgeable Self-awareness

Paper • 2504.03553 • Published Apr 4 • 27
SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 199

To Read collection

interesting papers to read

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 63
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24 • 121
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 114
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 139

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 83
When an LLM is apprehensive about its answers -- and when its uncertainty is justified

Paper • 2503.01688 • Published Mar 3 • 21
Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 57
Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published Feb 25 • 50

zjunlp/KnowSelf-Llama3.1-8B-ALFWorld

8B • Updated Feb 22 • 6 • 1
zjunlp/KnowSelf-Llama3.1-8B-WebShop

8B • Updated Feb 22 • 6
zjunlp/KnowSelf-Gemma2-2B-ALFWorld

3B • Updated Feb 22 • 5
zjunlp/KnowSelf-Gemma2-2B-WebShop

3B • Updated Apr 4 • 6

