JP van Oosten

NanoGPT

Jan 19, 2023

Everyone is still talking about ChatGPT! I can’t open Twitter or LinkedIn without immediately seeing someone talking about it. “🚀 You’ve been using ChatGPT wrong, here are 10 prompts you’ll also never need. Number 3️⃣ will amaze you!” and stuff like that.

🦸
In a previous post, I talked about my reservations regarding ChatGPT, which are mainly due to the closed nature of the OpenAI models. But there’s good news: not all heroes wear capes! Andrej Karpathy (not sure if he wears a cape or not…) has been working on nanoGPT, an open-source implementation of the technology behind GPT.

💻
You can train nanoGPT on an expensive machine (rented, or your own if you happen to have one), or train it on a smaller dataset on your laptop. Of course, if you train your own model on a laptop, it will not be as powerful as the things OpenAI puts out, but it might do the job just fine for some use cases. The biggest thing, though, is that anyone can use this as a basis for their own GPT implementation, trained on their own dataset, in different languages, etc. That puts it within reach of small and medium-sized businesses. So, what are you going to build?
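
To give a feel for what "trained on their own dataset" could look like at laptop scale, here is a minimal sketch of character-level data preparation: turning raw text into integer ids and splitting it into training and validation parts. This is my own illustration, not code from nanoGPT; the function names (`build_vocab`, `encode`, `decode`, `train_val_split`), the sample text, and the 90/10 split are all assumptions made for the example.

```python
# Hypothetical sketch: character-level data prep for a tiny GPT-style model.
# Not taken from nanoGPT; every name below is made up for illustration.

def build_vocab(text):
    """Map every distinct character to an integer id, and back again."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos

def encode(text, stoi):
    """Turn a string into a list of integer ids."""
    return [stoi[ch] for ch in text]

def decode(ids, itos):
    """Turn a list of integer ids back into a string."""
    return "".join(itos[i] for i in ids)

def train_val_split(ids, train_fraction=0.9):
    """Keep the first part for training and the rest for validation."""
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]

# Stand-in for your own dataset (assumed sample text).
text = "hello nanoGPT, hello laptop"
stoi, itos = build_vocab(text)
ids = encode(text, stoi)
train_ids, val_ids = train_val_split(ids)
```

A model trained on ids like these predicts the next character given the previous ones. Real setups often use a subword tokenizer instead, but characters keep the vocabulary tiny, which suits a small dataset.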

🧠
If you want to learn how to build your own GPT-like model, Andrej Karpathy has even published a YouTube series on neural networks. Happy hacking!

(Also posted on my LinkedIn feed)