Toy implementation of GPT-2. Because language models are cool.
Due to compute constraints I cannot train the full-size GPT-2 model. The largest one I could train is a 352M-parameter variant (can be run with the train.sh
script); it converged to a loss of 3.1, which is alright.
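Reproducing that run should just be a matter of launching the script (this assumes the defaults baked into train.sh match the 352M setup):

```sh
# kick off the 352M training run with the script's default settings
./train.sh
```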
The total footprint of the code is quite small, so it is fairly easy to modify. train.py
exposes a CLI for setting all of the model's hyperparameters, which makes it easy to train and iterate on the model.
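As a sketch, a smaller custom run might look something like the command below. The flag names here are illustrative guesses, not the actual ones; check train.py (or `python train.py --help`) for the real argument names.

```sh
# hypothetical example: train a smaller model by overriding hyperparameters on the CLI
# (flag names are assumptions; see train.py for the actual options)
python train.py --n_layer 12 --n_head 12 --n_embd 768 --batch_size 8 --lr 3e-4
```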