Let's write our own Transformer model from scratch using PyTorch.
GPT-2 is a family of models released by OpenAI in 2019. At the smallest end, there is a 124M parameter model, and at the largest end, there is a 1.558B parameter model. These models are trained on a large corpus of text data and can generate human-like text.
In this repository, we are going to try to replicate the 124M parameter model, which has:
- 12 Transformer blocks
- 768 embedding dimensions (the size of each token's vector representation)
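For orientation, here is a minimal sketch of what that 124M configuration might look like as a config object. The name `GPTConfig` and the extra fields (context length, vocabulary size, number of heads) use the standard GPT-2 small values for illustration; check `train_gpt2.py` for the values this repository actually uses.

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    # Illustrative GPT-2 (124M) hyperparameters; the repository's own config may differ.
    block_size: int = 1024   # maximum context length in tokens
    vocab_size: int = 50257  # GPT-2 BPE vocabulary size
    n_layer: int = 12        # number of Transformer blocks
    n_head: int = 12         # attention heads per block
    n_embd: int = 768        # embedding dimension of each token
```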
This repository is modelled after the Hugging Face implementation of GPT-2, which is written in PyTorch instead of the original TensorFlow.
- Python 3.8, 3.9, 3.10, 3.11, 3.12
- PyTorch
  - Installation instructions: https://pytorch.org/get-started/locally/
  - Run `nvidia-smi` to figure out which CUDA version your card supports (see the verification snippet after this list)
- Shakespeare's texts
  - You can use any text you want, but Shakespeare's texts are included in `./data/shakespeare.txt`
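As referenced in the PyTorch requirement above, a quick way to confirm that your install can actually see the GPU is a check like this (illustrative snippet, not part of the repository):

```python
import torch

# Print whether PyTorch was built with CUDA support and can see a GPU.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("CUDA version PyTorch was built with:", torch.version.cuda)
    print("Detected GPU:", torch.cuda.get_device_name(0))
```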
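Loading the training text is just reading a plain file. A rough sketch, assuming the bundled Shakespeare file mentioned above (everything else is illustrative):

```python
# Read the raw training corpus; any plain-text file works here.
with open("./data/shakespeare.txt", "r", encoding="utf-8") as f:
    text = f.read()

print(f"Loaded {len(text):,} characters")
print(text[:100])  # peek at the opening lines
```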
- Fork the repository
- Clone the repository: `git clone https://github.com/<your-username>/transformer.git`
- Open `train_gpt2.py`. You will need `tiktoken` for tokenizing the text. You can install it using pip: `pip install tiktoken`
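As a quick sanity check that tiktoken is installed correctly, you can round-trip a string through the GPT-2 byte-pair encoding (illustrative snippet; the actual tokenization in `train_gpt2.py` may be set up differently):

```python
import tiktoken

# Load the byte-pair encoding used by GPT-2.
enc = tiktoken.get_encoding("gpt2")

tokens = enc.encode("To be, or not to be, that is the question:")
print(tokens)              # a list of integer token ids
print(enc.decode(tokens))  # decodes back to the original string
```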
- First hour of: https://youtu.be/l8pRSuU81PU
- Intuition behind variable names: https://youtu.be/eMlx5fFNoYc
- Intuition behind variable names: https://youtu.be/9-Jl0dxWQs8