Build Your Own GPT from Scratch in a Single Workshop
A new hands-on workshop guides participants through building a working GPT model from scratch, writing every component of the training pipeline themselves. Based on Andrej Karpathy's nanoGPT, which reproduces GPT-2 (124M parameters) in roughly 300 lines of PyTorch, the project strips the architecture down to a ~10 million parameter model that trains on a laptop in under an hour.

Participants build a character-level tokenizer, a transformer model (embeddings, attention, and feed-forward layers), a training loop (forward pass, loss, backpropagation, optimizer, and learning rate scheduling), and a text generator capable of producing Shakespeare-like text. No black-box libraries are used; everything is written in Python with PyTorch. The workshop is designed for anyone comfortable reading Python code, with no prior machine learning experience required.

Training automatically uses an Apple Silicon GPU (MPS), an NVIDIA GPU (CUDA), or the CPU, and the project also runs on Google Colab. Additional references include Karpathy's microgpt (200 lines of pure Python), nanochat (a ChatGPT clone), the original 'Attention Is All You Need' paper (2017), the GPT-2 paper (2019), and the TinyStories paper on training small models with curated data.
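Of the components listed above, the character-level tokenizer is the simplest: the vocabulary is just the set of distinct characters in the corpus. The sketch below illustrates the idea; the stand-in text and variable names are assumptions for illustration, not the workshop's code (on the full Shakespeare corpus the same mapping yields vocab_size = 65).

```python
# Minimal character-level tokenizer of the kind described above.
# The corpus here is a short stand-in string, not the full Shakespeare text.
text = "To be, or not to be, that is the question."

chars = sorted(set(text))                        # distinct characters form the vocabulary
vocab_size = len(chars)
stoi = {ch: i for i, ch in enumerate(chars)}     # char -> integer id
itos = {i: ch for i, ch in enumerate(chars)}     # integer id -> char

def encode(s: str) -> list[int]:
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)

assert decode(encode("to be or not")) == "to be or not"
```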
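The training loop likewise reduces to a few lines once a model is in place. The sketch below shows the structure named above (forward pass, loss, backprop, optimizer step, learning rate scheduling) with a trivial bigram lookup table standing in for the transformer; all names, data, and hyperparameters are placeholders rather than the workshop's settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder setup: random token ids stand in for the encoded Shakespeare text,
# and an embedding table stands in for the workshop's transformer.
vocab_size, block_size, batch_size = 65, 256, 32
data = torch.randint(0, vocab_size, (10_000,))

def get_batch():
    ix = torch.randint(0, len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])
    y = torch.stack([data[i + 1 : i + block_size + 1] for i in ix])
    return x, y

model = nn.Embedding(vocab_size, vocab_size)     # maps each token id to next-token logits
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000)

for step in range(1000):
    xb, yb = get_batch()
    logits = model(xb)                           # forward pass: (B, T, vocab_size)
    loss = F.cross_entropy(logits.view(-1, vocab_size), yb.view(-1))
    optimizer.zero_grad(set_to_none=True)
    loss.backward()                              # backpropagation
    optimizer.step()                             # optimizer update
    scheduler.step()                             # learning rate schedule
```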
Key facts
- Workshop builds a ~10M parameter GPT model from scratch
- Trains on a laptop in under an hour
- Uses character-level tokenization on Shakespeare (vocab_size=65, block_size=256)
- No black-box libraries; all code written in Python with PyTorch
- Based on Andrej Karpathy's nanoGPT project
- Runs on Apple Silicon GPU (MPS), NVIDIA GPU (CUDA), or CPU (see the device-selection sketch after this list)
- Also works on Google Colab
- References include 'Attention Is All You Need' (2017) and GPT-2 paper (2019)
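The automatic device selection noted above typically reduces to a short preference chain. The sketch below is an assumption about that logic, including the preference order, not the workshop's exact code.

```python
import torch

# Pick an available backend; the order shown is an assumed preference.
if torch.cuda.is_available():
    device = "cuda"                  # NVIDIA GPU
elif torch.backends.mps.is_available():
    device = "mps"                   # Apple Silicon GPU
else:
    device = "cpu"                   # fallback

print(f"training on {device}")
# The model and each batch are then moved with .to(device) before the forward pass.
```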
Entities
People
- Andrej Karpathy