Building a decoder-only GPT transformer following tutorials
This project follows the tutorial at brunomaga.github.io/GPT-lite to build a decoder-only GPT transformer and pretrain it on various datasets. None of the work is my own, apart from trivial changes and slight variations of the code. Work in progress.