
Commit

Merge pull request #586 from gordicaleksa/readme_debug_note
Add a debugging tip to README
karpathy authored Jun 13, 2024
2 parents 95cef79 + 2a6797b commit afcc0a4
Showing 1 changed file with 2 additions and 0 deletions: README.md
@@ -6,6 +6,8 @@ LLMs in simple, pure C/CUDA with no need for 245MB of PyTorch or 107MB of cPython

The best introduction to the llm.c repo today is reproducing the GPT-2 (124M) model. [Discussion #481](https://github.com/karpathy/llm.c/discussions/481) steps through this in detail. We can reproduce other models from the GPT-2 and GPT-3 series in both llm.c and in the parallel implementation of PyTorch. Have a look at the [scripts README](scripts/README.md).

debugging tip: when you run the `make` command to build the binary, modify it by replacing `-O3` with `-g` so you can step through the code in your favorite IDE (e.g. vscode).
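
For example, a minimal sketch of that edit (assuming a single `CFLAGS`-style line in the Makefile; the actual line in llm.c carries more flags, so this is illustrative only):

```makefile
# hypothetical Makefile excerpt, for illustration only
# before (optimized build, hard to step through in a debugger):
#   CFLAGS = -O3
# after (debug symbols, no optimization; works with gdb, lldb, or the vscode debugger):
CFLAGS = -g
```

After the change, rebuild so the new flags are actually applied (a clean rebuild, e.g. running the Makefile's clean target first if it provides one, guarantees nothing is reused from the optimized build).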

## quick start (1 GPU, fp32 only)

If you won't be training on multiple nodes, aren't interested in mixed precision, and are interested in learning CUDA, the fp32 (legacy) files might be of interest to you. These are files that were "checkpointed" early in the history of llm.c and frozen in time. They are simpler, more portable, and possibly easier to understand. Run the 1 GPU, fp32 code like this:
