README.md CHANGED

@@ -51,6 +51,7 @@ GPT-124M is a decoder-only transformer model based on OpenAI’s GPT-2 architect
 - **Paper:** [Training Compute-Optimal Large Language Models](https://arxiv.org/pdf/2203.15556)
 - **Video:** [Andrej Karpathy - Let's reproduce GPT-2 (124M)](https://youtu.be/l8pRSuU81PU?si=KAo1y9dHYQAGJmj5)
 - **Demo:** [GPT 124M Demo](https://huggingface.co/spaces/samkeet/GPT_124M)
+- **GitHub:** [SamkeetSangai/GPT_124M](https://github.com/SamkeetSangai/GPT_124M)

 ## Model Details
