JacobAndersson/gpt
GPT

A toy implementation of GPT-2, because language models are cool.

Due to compute constraints I cannot train the full-size GPT-2 model. The largest variant I could train is a 352M-parameter model (run with the train.sh script), which converged to a loss of 3.1, which is alright.
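For a sense of where the 352M figure comes from: a GPT-2 style transformer's size is dominated by 12·d_model² weights per block plus the embedding tables. The sketch below is a back-of-the-envelope estimate, assuming GPT-2's standard vocabulary (50257) and context length (1024); the exact config of this repo's 352M variant may differ slightly.

```python
def gpt2_param_count(n_layer, d_model, vocab_size=50257, n_ctx=1024):
    """Approximate parameter count for a GPT-2 style transformer.

    Each block contributes ~12 * d_model^2 weights (attention QKV + output
    projection = 4 * d^2, MLP up + down projections = 8 * d^2); the token
    and position embedding tables add (vocab_size + n_ctx) * d_model.
    Biases and layer norms are ignored, so this slightly undercounts.
    """
    embeddings = (vocab_size + n_ctx) * d_model
    blocks = 12 * n_layer * d_model ** 2
    return embeddings + blocks

# A 24-layer, 1024-dim config (GPT-2 medium shape) lands near 354M,
# in the same ballpark as the 352M variant mentioned above.
print(gpt2_param_count(n_layer=24, d_model=1024))  # → 354501632
```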


The total footprint of the code is quite small, so it is fairly easy to modify. train.py exposes a CLI for setting all of the model's hyperparameters, which makes it easy to train and iterate on the model.
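A hyperparameter CLI like the one train.py exposes is typically built on argparse. The sketch below shows the general shape; the flag names and defaults here are hypothetical illustrations, not the actual options of this repo's train.py.

```python
import argparse

def build_parser():
    # Hypothetical flag names and defaults -- the real train.py may differ.
    p = argparse.ArgumentParser(description="Train a GPT-2 style model")
    p.add_argument("--n-layer", type=int, default=12, help="transformer blocks")
    p.add_argument("--n-head", type=int, default=12, help="attention heads")
    p.add_argument("--d-model", type=int, default=768, help="embedding width")
    p.add_argument("--lr", type=float, default=3e-4, help="learning rate")
    p.add_argument("--batch-size", type=int, default=32, help="sequences per step")
    return p

# Parse an example command line overriding two of the defaults.
args = build_parser().parse_args(["--n-layer", "24", "--d-model", "1024"])
print(args.n_layer, args.d_model, args.lr)  # → 24 1024 0.0003
```

Keeping every hyperparameter behind a flag means each experiment is a one-line shell command, which is what makes quick iteration practical.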
