osmosis-ai/osmosis-mcp-4b

AndyGulp committed on May 8

Commit 7215f29 · verified · 1 Parent(s): 2b86fc5

Update README.md

Files changed (1)

README.md +1 -1

@@ -18,7 +18,7 @@ This requires the model to reason through multiple tool invocations (e.g., weath
 Our training pipeline leverages:
-- (**Dr. GRPO**)[https://arxiv.org/abs/2503.20783] for stable and sample-efficient reinforcement learning.
+- [**Dr. GRPO**](https://arxiv.org/abs/2503.20783) for stable and sample-efficient reinforcement learning.
 - **Synthetic multi-step MCP interactions** with strong tool chaining behavior, generated using our internal data engine.
 - **SGLang + VeRL** for efficient multi-turn rollout environments, built on top of Qwen3-4B for its function-calling capabilities.