
🧠 LLM Mini Project — Step-by-Step Checklist


📦 0. Setup Environment

  • Create a new project folder
  • Set up a virtual environment
  • Install core dependencies:
    • torch
    • transformers
    • datasets
    • accelerate
    • peft (for LoRA later)
    • bitsandbytes (for quantization later)
  • Confirm GPU is available (torch.cuda.is_available())
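
A quick sanity check for the environment (assumes torch is already installed):

```python
import torch

# Quick environment check: PyTorch version and GPU availability.
print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```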

🔍 1. Understand the Problem (don't skip this)

  • Write down in your own words:
    • What is a language model?
    • What does “predict next token” actually mean?
  • Manually inspect:
    • A sample sentence
    • Its tokenized form
  • Verify:
    • Input tokens vs target tokens (shifted by 1)
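
The input-vs-target shift can be checked by hand. A toy whitespace "tokenizer" with stand-in IDs (not real BPE) is enough to see it:

```python
# Toy illustration of next-token prediction targets: the targets are
# simply the input tokens shifted left by one position.
sentence = "the cat sat on the mat"
tokens = sentence.split()
token_ids = list(range(len(tokens)))  # stand-in IDs for illustration

input_ids = token_ids[:-1]   # the model sees these...
target_ids = token_ids[1:]   # ...and must predict these

for inp, tgt in zip(input_ids, target_ids):
    print(f"given {tokens[inp]!r:>7} -> predict {tokens[tgt]!r}")
```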

📚 2. Load Dataset

  • Choose dataset:
    • Start with WikiText-2
  • Load dataset using datasets
  • Print:
    • A few raw samples
  • Check:
    • Dataset size
    • Train/validation split

🔢 3. Tokenization

  • Load GPT-2 tokenizer
  • Tokenize dataset:
    • Apply truncation
    • Apply padding
  • Verify:
    • Shape of tokenized output
    • Decode tokens back to text (sanity check)
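
The mechanics of truncation, padding, and the decode-back sanity check can be illustrated with a toy character-level tokenizer (in the real project you would use `AutoTokenizer.from_pretrained("gpt2")` from transformers; `PAD_ID`, `encode`, and `decode` here are made-up names for the sketch):

```python
# Toy character-level tokenizer showing truncation, padding, and a
# decode round trip; transformers' GPT-2 tokenizer does the same job
# with BPE subwords instead of characters.
PAD_ID = 0
vocab = {ch: i + 1 for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz ")}
inv_vocab = {i: ch for ch, i in vocab.items()}

def encode(text, max_length=8):
    ids = [vocab[ch] for ch in text][:max_length]   # truncation
    ids += [PAD_ID] * (max_length - len(ids))       # padding
    return ids

def decode(ids):
    return "".join(inv_vocab[i] for i in ids if i != PAD_ID)

ids = encode("hello world")
print(ids)          # fixed-length ID sequence
print(decode(ids))  # "hello wo" — truncated at max_length
```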

🧱 4. Prepare Training Data

  • Convert dataset to PyTorch format
  • Create DataLoader:
    • Set batch size (start small, e.g. 8)
  • Confirm:
    • Batches load correctly
    • Tensor shapes are consistent
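
A minimal sketch of the DataLoader step, using fake tokenized data in place of the real dataset (shapes and batch size are illustrative):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Fake "tokenized" data: 100 sequences of length 32, standing in for
# the real tokenized WikiText-2 tensors.
input_ids = torch.randint(0, 50257, (100, 32))
dataset = TensorDataset(input_ids)
loader = DataLoader(dataset, batch_size=8, shuffle=True)

batch, = next(iter(loader))
print(batch.shape)  # torch.Size([8, 32]) — consistent tensor shapes
```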

🤖 5. Load Model

  • Load pretrained GPT-2 small
  • Move model to GPU (if available)
  • Print:
    • Model size (parameters)
  • Run a single forward pass to confirm:
    • No errors
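
Parameter counting and a smoke-test forward pass work the same for any `nn.Module`; a tiny stand-in model is used below so the snippet runs without downloading weights (for the real thing, load `GPT2LMHeadModel.from_pretrained("gpt2")`):

```python
import torch
import torch.nn as nn

# Tiny stand-in model; the counting and forward-pass pattern is
# identical for GPT-2 small (~124M parameters).
model = nn.Sequential(nn.Embedding(1000, 64), nn.Linear(64, 1000))
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
x = torch.randint(0, 1000, (2, 16), device=device)
out = model(x)       # single forward pass: confirms no errors
print(out.shape)     # (batch, seq_len, vocab)
```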

🔁 6. Build Training Loop (core understanding)

  • Write your own training loop (no Trainer API yet)
  • Include:
    • Forward pass
    • Loss calculation
    • Backpropagation
    • Optimizer step
  • Print:
    • Loss every few steps
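
The loop above can be sketched end to end. The model and data here are tiny stand-ins (deterministic "count upward" sequences, so the loss genuinely falls), but the skeleton — forward, loss, backward, step — is exactly what you will use with GPT-2:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny next-token model: embedding -> linear over the vocabulary.
vocab_size, seq_len = 50, 16
model = nn.Sequential(nn.Embedding(vocab_size, 32), nn.Linear(32, vocab_size))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# "Count upward" sequences: the next token is always (current + 1) % vocab.
data = (torch.arange(seq_len + 1)[None, :] + torch.arange(64)[:, None]) % vocab_size

losses = []
for step in range(50):
    batch = data[torch.randint(0, len(data), (8,))]
    inputs, targets = batch[:, :-1], batch[:, 1:]   # targets shifted by one
    logits = model(inputs)                          # forward pass
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()                                 # backpropagation
    optimizer.step()                                # optimizer step
    losses.append(loss.item())
    if step % 10 == 0:
        print(f"step {step:3d}  loss {loss.item():.3f}")
```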

📉 7. Observe Training Behaviour

  • Track:
    • Training loss over time
  • Answer:
    • Is loss decreasing?
    • Is it noisy or stable?
  • (Optional)
    • Plot loss curve

🧪 8. Evaluate Model

  • Generate text from model:
    • Before training
    • After training
  • Compare:
    • Coherence
    • Structure
  • Note:
    • Any overfitting signs (repetition, memorization)

⚖️ 9. Try LoRA Fine-Tuning

  • Add LoRA using peft
  • Freeze base model weights
  • Train only adapter layers
  • Compare vs full fine-tuning:
    • Speed
    • Memory usage
    • Output quality
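
The core LoRA idea can be sketched by hand: the frozen base weight gets a trainable low-rank update, output = Wx + (α/r)·BAx. In practice, peft's `LoraConfig` / `get_peft_model` wraps this around GPT-2's attention layers for you; the `LoRALinear` class below is a hypothetical name for the sketch:

```python
import torch
import torch.nn as nn

# Hand-rolled LoRA sketch: freeze the base linear layer, train only the
# low-rank factors A (r x in) and B (out x r). B starts at zero, so the
# adapted layer initially matches the base layer exactly.
class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r=4, alpha=8):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)      # freeze base weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(64, 64))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable {trainable} / total {total}")  # far fewer trainable params
```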

🧠 10. Understand Convergence

  • Identify:
    • When loss plateaus
  • Check validation loss:
    • Does it increase? (overfitting)
  • Write down:
    • What “good training” looks like
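
The overfitting signature can be checked mechanically on recorded loss histories (the numbers below are made up for illustration): training loss keeps falling while validation loss bottoms out and turns upward.

```python
# Toy convergence check on made-up per-epoch loss histories.
train_loss = [3.9, 3.1, 2.6, 2.2, 1.9, 1.7]
val_loss   = [3.9, 3.2, 2.9, 2.8, 2.9, 3.1]

best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
overfitting = val_loss[-1] > min(val_loss)
print(f"best epoch: {best_epoch}, overfitting: {overfitting}")
```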

⚙️ 11. Model Saving & Loading

  • Save:
    • Model weights
    • Tokenizer
  • Reload model
  • Confirm:
    • Outputs remain consistent
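
The save/reload round trip can be verified on a tiny model (with transformers you would call `model.save_pretrained(...)` and `tokenizer.save_pretrained(...)` instead of raw `state_dict` files):

```python
import os
import tempfile
import torch
import torch.nn as nn

# Save weights, reload them into a fresh model, and confirm the outputs
# are identical for the same input.
torch.manual_seed(0)
model = nn.Linear(8, 8)
x = torch.randn(3, 8)
before = model(x)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "model.pt")
    torch.save(model.state_dict(), path)
    reloaded = nn.Linear(8, 8)
    reloaded.load_state_dict(torch.load(path))

after = reloaded(x)
print(torch.allclose(before, after))  # True: outputs remain consistent
```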

🚀 PART 2 — Infrastructure & Serving


🧠 12. Understand Inference Flow

  • Write down:
    • Steps from input → output
  • Measure:
    • Time taken for a single generation
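
The input → output flow is a loop: forward pass, take the most likely next token, append it, repeat. A greedy-decoding sketch on a tiny stand-in model (GPT-2's `generate()` follows the same loop, plus KV caching and sampling):

```python
import time
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab_size = 50
model = nn.Sequential(nn.Embedding(vocab_size, 32), nn.Linear(32, vocab_size))

ids = torch.tensor([[1, 2, 3]])          # "prompt" token IDs
start = time.perf_counter()
with torch.no_grad():
    for _ in range(10):                   # generate 10 new tokens
        logits = model(ids)               # forward pass over the sequence
        next_id = logits[:, -1].argmax(-1)  # greedy: most likely next token
        ids = torch.cat([ids, next_id[:, None]], dim=1)
elapsed = time.perf_counter() - start
print(ids.shape, f"{elapsed * 1000:.1f} ms")
```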

⚡ 13. Optimize Inference

  • Test batching:
    • Multiple inputs at once
  • Compare:
    • Latency vs throughput
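
The tradeoff shows up even on a tiny stand-in model: 32 one-at-a-time forward passes versus a single batched pass over the same 32 inputs. Batching raises throughput, at the cost of each request waiting for its batch:

```python
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Embedding(50, 64), nn.Linear(64, 50))
inputs = torch.randint(0, 50, (32, 16))

with torch.no_grad():
    t0 = time.perf_counter()
    for row in inputs:              # 32 separate "requests"
        model(row[None, :])
    sequential = time.perf_counter() - t0

    t0 = time.perf_counter()
    model(inputs)                   # one batched request
    batched = time.perf_counter() - t0

print(f"sequential {sequential * 1000:.2f} ms, batched {batched * 1000:.2f} ms")
```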

🧮 14. Apply Quantization

  • Load model in:
    • 8-bit
    • (Optional) 4-bit
  • Compare:
    • Memory usage
    • Speed
    • Output quality
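
The core idea behind 8-bit loading can be sketched as naive symmetric int8 quantization of a single weight tensor: store int8 values plus one scale, dequantize on use. (bitsandbytes does this per-block with extra outlier handling; this is only the basic mechanism.)

```python
import torch

# Quantize a float32 weight tensor to int8 + scale, then dequantize
# and measure the round-trip error and memory saving.
torch.manual_seed(0)
w = torch.randn(256, 256)
scale = w.abs().max() / 127.0
w_int8 = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
w_deq = w_int8.float() * scale

error = (w - w_deq).abs().max().item()
memory_ratio = w_int8.element_size() / w.element_size()
print(f"max abs error {error:.4f}, memory {memory_ratio:.0%} of fp32")
```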

🖥️ 15. Simulate Real-World Usage

  • Pretend you have:
    • Multiple users hitting your model
  • Think through:
    • How would you queue requests?
    • When would you batch?
    • When would you scale?
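
A toy request queue makes the thought experiment concrete: requests accumulate, and each serving step drains up to a maximum batch size. Real serving stacks add timeouts (flush a partial batch after N ms) and multiple workers; `submit` and `serve_step` are hypothetical names for the sketch:

```python
from collections import deque

queue = deque()
MAX_BATCH_SIZE = 4

def submit(prompt):
    queue.append(prompt)

def serve_step():
    # Drain up to MAX_BATCH_SIZE pending requests into one batch.
    batch = [queue.popleft() for _ in range(min(MAX_BATCH_SIZE, len(queue)))]
    return batch  # in a real server: run one batched model call here

for i in range(6):
    submit(f"request-{i}")

print(serve_step())  # first 4 requests batched together
print(serve_step())  # remaining 2
```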

☁️ 16. Understand Infra Concepts

  • Research:
    • GPU provisioning
    • Autoscaling
    • Model warm starts
  • Understand:
    • Why loading time matters
    • Why GPUs shouldn't sit idle

🧬 17. (Bonus) DICOM Exploration

  • Learn:
    • What DICOM files are
  • Think:
    • How LLMs could be used with medical data
  • Note:
    • Privacy + domain challenges

✍️ 18. Write Your Blog

Structure

  • Introduction:
    • What is an LLM really?
  • Training:
    • Tokenization
    • Training loop
    • Loss behaviour
  • Fine-tuning:
    • Full vs LoRA
  • Challenges:
    • What went wrong
  • Infrastructure:
    • Serving challenges
    • Batching
    • Quantization
  • Key Learnings:
    • What surprised you
    • What actually matters

Final Deliverables

  • Working training script
  • LoRA vs full fine-tune comparison
  • Basic inference script
  • Blog post (clear + honest)
  • Notes showing your understanding

⚠️ Keep Yourself Honest

  • Can you explain the training loop without looking?
  • Do you understand why loss decreases?
  • Can you explain batching vs latency tradeoffs?
  • Do you know what would break at scale?