thomasjhuang/qwen2-rloo-countdown-step250

Text Generation

reinforcement-learning


1 contributor

History: 4 commits

Add model card with training details

08251c0 verified 2 months ago

File                     Size       LFS  Last commit                                                                     Updated
.gitattributes           1.57 kB         RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6  2 months ago
README.md                2.39 kB         Add model card with training details                                            2 months ago
added_tokens.json        80 Bytes        RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6  2 months ago
chat_template.jinja      327 Bytes       RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6  2 months ago
config.json              684 Bytes       RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6  2 months ago
generation_config.json   117 Bytes       RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6  2 months ago
merges.txt               1.67 MB         RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6  2 months ago
model.safetensors        1.98 GB    LFS  RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6  2 months ago
special_tokens_map.json  370 Bytes       RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6  2 months ago
tokenizer.json           11.4 MB    LFS  RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6  2 months ago
tokenizer_config.json    1.17 kB         RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6  2 months ago
vocab.json               2.78 MB         RLOO checkpoint at optimizer step 250 - Fixed prompt format, temp=0.1, lr=3e-6  2 months ago