stjefan2's Collections
Articles
ChipNeMo: Domain-Adapted LLMs for Chip Design
Paper
•
2311.00176
•
Published
•
9
Language Models can be Logical Solvers
Paper
•
2311.06158
•
Published
•
23
JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models
Paper
•
2311.05997
•
Published
•
37
Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs
Paper
•
2311.05657
•
Published
•
32
JaxMARL: Multi-Agent RL Environments in JAX
Paper
•
2311.10090
•
Published
•
8
ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks
Paper
•
2311.09835
•
Published
•
11
Large Language Models for Mathematicians
Paper
•
2312.04556
•
Published
•
13
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
Paper
•
2312.11370
•
Published
•
20
Boundary Attention: Learning to Find Faint Boundaries at Any Resolution
Paper
•
2401.00935
•
Published
•
18
Teaching Large Language Models to Reason with Reinforcement Learning
Paper
•
2403.04642
•
Published
•
51
LLM Agent Operating System
Paper
•
2403.16971
•
Published
•
71
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper
•
2501.04519
•
Published
•
283
Evolving Deeper LLM Thinking
Paper
•
2501.09891
•
Published
•
116
AgentRxiv: Towards Collaborative Autonomous Research
Paper
•
2503.18102
•
Published
•
24
TTRL: Test-Time Reinforcement Learning
Paper
•
2504.16084
•
Published
•
120
Learning Adaptive Parallel Reasoning with Language Models
Paper
•
2504.15466
•
Published
•
43
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
Paper
•
2504.16078
•
Published
•
20
FlowReasoner: Reinforcing Query-Level Meta-Agents
Paper
•
2504.15257
•
Published
•
47
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Paper
•
2504.17192
•
Published
•
114
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset
Paper
•
2504.16891
•
Published
•
24
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning
Paper
•
2504.16656
•
Published
•
58
Flow-GRPO: Training Flow Matching Models via Online RL
Paper
•
2505.05470
•
Published
•
81
Measuring General Intelligence with Generated Games
Paper
•
2505.07215
•
Published
•
11
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
Paper
•
2505.19914
•
Published
•
44
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
Paper
•
2506.11928
•
Published
•
24
Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team
Paper
•
2506.14234
•
Published
•
40
ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering
Paper
•
2506.09050
•
Published
•
7
ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs
Paper
•
2506.15211
•
Published
•
36
SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence
Paper
•
2506.15672
•
Published
•
15
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification
Paper
•
2505.16938
•
Published
•
121
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
Paper
•
2506.20639
•
Published
•
29
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities
Paper
•
2507.13158
•
Published
•
24
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning
Paper
•
2507.14111
•
Published
•
22