rStar2-Agent-14B: Advanced Agentic Reasoning Model
Model Description
This is a reproduced version of rStar2-Agent, a 14B-parameter math reasoning model that achieves performance comparable to the 671B-parameter DeepSeek-R1 through pure agentic reinforcement learning. The model excels at planning, reasoning, and autonomously using coding tools to efficiently explore, verify, and reflect during complex problem-solving.
Usage
The following is an example of basic usage. To reproduce the math evaluation results in the technical report, please refer to @microsoft/rstar.
1. Start SGLang Server
First, serve the model using SGLang with the following command:
```shell
python -m sglang.launch_server \
    --model-path rstar2-reproduce/rstar2-agent \
    --port 30000 \
    --tensor-parallel-size 4 \
    --tool-call-parser qwen25
```
Parameters:
- `--model-path`: Path to the rStar2-Agent model
- `--port`: Server port (default: 30000)
- `--tensor-parallel-size`: Number of GPUs for parallel processing
- `--tool-call-parser`: Parser for tool calls (use `qwen25` for this model)
2. Use with OpenAI-compatible API
```python
from openai import OpenAI
import json

# Initialize an OpenAI client pointing to the SGLang server
client = OpenAI(
    base_url="http://localhost:30000/v1",  # SGLang server URL
    api_key="EMPTY"  # No API key required for a local server
)
```
```python
# Define the Python code execution tool for the model
tools = [
    {
        "type": "function",
        "function": {
            "name": "execute_python_code_with_standard_io",
            "description": "Execute Python code with standard input and capture standard output.\nThis function takes a Python code string and an input string, provides the input string\nthrough standard input (stdin) to the code, and captures and returns any output produced\nthrough standard output (stdout). If the executed code raises an exception, the error\nmessage will be captured and returned instead.",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "A string containing Python code to be executed. The code can read from standard input using the input() function."
                    },
                    "input": {
                        "type": "string",
                        "description": "A string that will be provided as standard input to the code when it calls input()."
                    }
                },
                "required": ["code", "input"]
            }
        }
    }
]
```
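The schema marks both `code` and `input` as required. As a minimal illustration of how the model's tool-call arguments arrive and are decoded (the raw arguments string below is made up for this sketch, not real model output):

```python
import json

# Hypothetical raw `tool_call.function.arguments` string, for illustration only
raw_arguments = '{"code": "print(sum([2, 3, 5, 7, 11, 13, 17, 19]))", "input": ""}'

# The server returns arguments as a JSON string; decode it before use
args = json.loads(raw_arguments)

# Both fields are marked "required" in the schema above
assert set(args.keys()) == {"code", "input"}
print(args["code"])
```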
```python
# Define the Python code execution function
def execute_python_code_with_standard_io(code, input_data):
    """
    Execute Python code with standard input and capture output.

    Args:
        code (str): Python code to execute
        input_data (str): Input data to provide to the code

    Returns:
        str: Output from the executed code or an error message
    """
    import subprocess
    import sys

    try:
        # Create a subprocess to execute the Python code
        process = subprocess.Popen(
            [sys.executable, "-c", code],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True
        )
        # Send input and collect output
        stdout, stderr = process.communicate(input=input_data)
        if stderr:
            return f"Error: {stderr}"
        return stdout.strip()
    except Exception as e:
        return f"Execution error: {str(e)}"
```
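Before wiring the executor into the conversation loop, it can be exercised directly. This self-contained sketch repeats the same subprocess pattern (the helper name `run_snippet` is hypothetical, chosen here to keep the example compact):

```python
import subprocess
import sys

def run_snippet(code, input_data):
    # Same subprocess pattern as execute_python_code_with_standard_io:
    # run the code string with `python -c` and feed input_data via stdin
    process = subprocess.Popen(
        [sys.executable, "-c", code],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    stdout, stderr = process.communicate(input=input_data)
    return f"Error: {stderr}" if stderr else stdout.strip()

# The child process reads "21" from stdin and doubles it
print(run_snippet("print(int(input()) * 2)", "21"))  # → 42
```

Any traceback the child writes to stderr comes back prefixed with `Error:`, which is what the model sees in the tool response.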
```python
# Example: create a math problem conversation
messages = [
    {
        "role": "user",
        "content": "You must put your answer inside <answer> </answer> tags, i.e., <answer> answer here </answer>. And your final answer will be extracted automatically by the \\boxed{} tag. Solve this math problem: Find the sum of all prime numbers less than 20."
    }
]

# Main conversation loop - handle tool calls until completion
turn_idx = 0
while True:
    print(f'========== Turn: {turn_idx} ==========')
    turn_idx += 1

    # Get the model's response with tool support
    response = client.chat.completions.create(
        model="rstar2-reproduce/rstar2-agent",
        messages=messages,
        tools=tools,
        tool_choice="auto",  # Let the model decide when to use tools
        temperature=0.6      # Adjust for creativity vs. consistency
    )

    # Add the assistant's response to the conversation history
    messages.append(response.choices[0].message)
    print(f'{response.choices[0].message.content}')

    # Check whether the model wants to use tools
    if response.choices[0].message.tool_calls:
        # Process each tool call
        for tool_call in response.choices[0].message.tool_calls:
            function_args = json.loads(tool_call.function.arguments)
            print(f">>> Executing Code:\n{function_args['code']}")
            input_text = function_args.get('input', '')
            print(f">>> With Input: {input_text if input_text else '(no input)'}")

            # Execute the Python code
            result = execute_python_code_with_standard_io(
                function_args["code"],
                function_args.get("input", "")
            )
            print(f">>> Tool result: {result}")

            # Add the tool response to the conversation
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result
            })
    else:
        # No more tool calls; the conversation is finished
        print("No more tool calls. Conversation finished.")
        break
```
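The user prompt above instructs the model to wrap its final answer in `<answer>` tags with a `\boxed{}` expression. A minimal sketch of extracting that answer from the final assistant message (the sample string below is illustrative, not real model output):

```python
import re

# Illustrative final assistant message, not real model output
final_message = (
    "The primes below 20 are 2, 3, 5, 7, 11, 13, 17, 19. "
    "<answer> The sum is \\boxed{77}. </answer>"
)

# Pull out the <answer>...</answer> span, then the \boxed{} payload inside it
answer_block = re.search(r"<answer>(.*?)</answer>", final_message, re.DOTALL)
boxed = re.search(r"\\boxed\{([^}]*)\}", answer_block.group(1))
print(boxed.group(1))  # → 77
```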
Citation
If you use this model in your research, please cite:
```bibtex
@misc{shang2025rstar2agentagenticreasoningtechnical,
      title={rStar2-Agent: Agentic Reasoning Technical Report},
      author={Ning Shang and Yifei Liu and Yi Zhu and Li Lyna Zhang and Weijiang Xu and Xinyu Guan and Buze Zhang and Bingcheng Dong and Xudong Zhou and Bowen Zhang and Ying Xin and Ziming Miao and Scarlett Li and Fan Yang and Mao Yang},
      year={2025},
      eprint={2508.20722},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.20722},
}
```
License
This model is released under the MIT License.