dslighfdsl/Qwen2.5-7B-Instruct-Baseline-SFT-webshop-DPO • Text Generation • 8B • Updated 20 days ago • 4
dslighfdsl/Qwen2.5-7B-Instruct-Baseline-SFT-sciworld-DPO • Text Generation • 8B • Updated 20 days ago • 4
dslighfdsl/Llama-3.1-8B-Instruct-Baselines-MPO-meta-planner-webshop • Text Generation • 8B • Updated Jun 24 • 4
dslighfdsl/Llama-3.1-8B-Instruct-Baselines-SFT-sciworld-ETO • Text Generation • 8B • Updated Jun 23 • 4
dslighfdsl/Llama-3.1-8B-Instruct-Baselines-SFT-sciworld-DPO • Text Generation • 8B • Updated Jun 23 • 4
dslighfdsl/Llama-3.1-8B-Instruct-Baselines-SFT-alfworld-DPO • Text Generation • 8B • Updated Jun 22 • 3
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-3-alfworld-stage3 • Text Generation • 8B • Updated Jun 19 • 3
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-3-alfworld-stage3_2 • Text Generation • 8B • Updated Jun 19 • 4
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-3-alfworld-stage1 • Text Generation • 8B • Updated Jun 17 • 4
dslighfdsl/Llama-3.1-8B-Instruct-SFT-CoT-short-full-3-alfworld-stage2 • Text Generation • 8B • Updated Jun 17 • 4