Update README.md
README.md
CHANGED
@@ -43,7 +43,7 @@ On SWE-Bench Verified, **KAT-Dev-32B** achieves comparable performance with **62
   </td>
 </tr>
 <tr>
-  <td><strong>
+  <td><strong>3. Agentic RL Scaling</strong></td>
   <td>Scaling agentic RL hinges on three challenges: efficient learning over nonlinear trajectory histories, leveraging intrinsic model signals, and building scalable high-throughput infrastructure. We address these with a multi-level prefix caching mechanism in the RL training engine, an entropy-based trajectory pruning technique, and an inner implementation of SeamlessFlow[1] architecture that cleanly decouples agents from training while exploiting heterogeneous compute. These innovations together cut scaling costs and enable efficient large-scale RL.
   </td>
 </tr>
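The README names entropy-based trajectory pruning as one of the three scaling mechanisms but does not describe it further. Below is a minimal Python sketch of one plausible reading: score each step of a rollout by the entropy of the policy's output distribution and keep only the most uncertain steps before the RL update. The `keep_ratio` heuristic, the `Step` container, and all function names are hypothetical, not the released implementation.

```python
# Hypothetical sketch of entropy-based trajectory pruning.
# The keep_ratio heuristic and all names here are assumptions, not KAT-Dev internals.
import math
from dataclasses import dataclass
from typing import List

@dataclass
class Step:
    token_logprobs: List[float]  # log-probabilities over the vocabulary at this step
    reward: float

def step_entropy(logprobs: List[float]) -> float:
    """Shannon entropy (in nats) of one step's output distribution."""
    return -sum(math.exp(lp) * lp for lp in logprobs)

def prune_trajectory(steps: List[Step], keep_ratio: float = 0.5) -> List[Step]:
    """Keep only the highest-entropy steps, i.e. those where the policy is least
    certain and a gradient update is presumably most informative."""
    ranked = sorted(steps, key=lambda s: step_entropy(s.token_logprobs), reverse=True)
    k = max(1, int(len(steps) * keep_ratio))
    kept = {id(s) for s in ranked[:k]}
    # Preserve the original temporal order of the surviving steps.
    return [s for s in steps if id(s) in kept]
```

Whether the actual technique prunes individual steps, whole trajectories, or segments of the nonlinear history is not stated in the commit; the sketch only illustrates the general shape of using the model's own entropy signal to cut training cost.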