Transformer

Let's write our own Transformer model from scratch using PyTorch.

GPT-2

GPT-2 is a family of models released by OpenAI in 2019. At the smallest end, there is a 124M parameter model, and at the largest end, there is a 1.558B parameter model. These models are trained on a large corpus of text data and can generate human-like text.

In this repository, we are going to replicate the 124M parameter model, which has (see the configuration sketch below):

  • 12 Transformer blocks
  • 768-dimensional embeddings (the size of each token's vector representation)
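
For concreteness, here is a minimal sketch of how that configuration might be expressed in code. The class and field names (GPTConfig, n_layer, n_embd, and so on) are illustrative assumptions rather than the names used in train_gpt2.py; the values themselves are the standard GPT-2 124M settings.

from dataclasses import dataclass

@dataclass
class GPTConfig:
    # Standard GPT-2 (124M) hyperparameters; the names here are illustrative.
    block_size: int = 1024   # maximum context length
    vocab_size: int = 50257  # GPT-2 BPE vocabulary size
    n_layer: int = 12        # number of Transformer blocks
    n_head: int = 12         # attention heads per block
    n_embd: int = 768        # embedding dimension of each token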

This repository is modelled after the Hugging Face implementation of GPT-2, which is written in PyTorch rather than the original TensorFlow.
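
If you want a reference to compare against while building the model, the pretrained 124M checkpoint can be loaded through the Hugging Face transformers library. This is an optional sanity check, not something this repository requires:

from transformers import GPT2LMHeadModel

# Load the pretrained 124M-parameter GPT-2 checkpoint ("gpt2") from Hugging Face.
model_hf = GPT2LMHeadModel.from_pretrained("gpt2")
print(sum(p.numel() for p in model_hf.parameters()))  # roughly 124M parameters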

Prerequisites

Installation

  1. Fork the repository
  2. Clone the repository:
git clone https://github.com/<your-username>/transformer.git
  3. Open train_gpt2.py

You will need tiktoken for tokenizing the text. You can install it using pip:

pip install tiktoken
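
Once installed, tiktoken gives you the same BPE tokenizer GPT-2 was trained with. A quick sanity check:

import tiktoken

enc = tiktoken.get_encoding("gpt2")    # GPT-2's BPE vocabulary
tokens = enc.encode("Hello, world!")   # text -> list of token ids
print(tokens)
print(enc.decode(tokens))              # token ids -> original text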

Credits
