Why it helps:

Once the loss is low, how do you know if the model is "smart"? Your PDF should include:
