Language Modeling from Scratch
Language models are the cornerstone of modern natural language processing (NLP) applications and open up a new paradigm in which a single general-purpose system addresses a range of downstream tasks. As the fields of artificial intelligence (AI), machine learning (ML), and NLP continue to grow, a deep understanding of language models becomes essential for scientists and engineers alike. This course gives students a comprehensive understanding of language models by walking them through the entire process of developing their own. Drawing inspiration from operating systems courses that build an entire operating system from scratch, we lead students through every aspect of language model creation: data collection and cleaning for pre-training, transformer model construction, model training, and evaluation before deployment.
Course Website: https://stanford-cs336.github.io/
Course Features
- Lectures: 15
- Quizzes: 0
- Duration: 15 hours
- Skill level: All levels
- Language: English
- Students: 7
- Assessments: Yes
Curriculum
- 1 Section
- 15 Lessons
- 10 Weeks
- Language Modeling from Scratch (15 lessons)
- 1.1 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 1: Overview and Tokenization
- 1.2 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lec. 2: Pytorch, Resource Accounting
- 1.3 Stanford CS336 Lang. Modeling from Scratch | Spring 2025 | Lec. 3: Architectures, Hyperparameters
- 1.4 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 4: Mixture of experts
- 1.5 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 5: GPUs
- 1.6 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 6: Kernels, Triton
- 1.7 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 7: Parallelism 1
- 1.8 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 8: Parallelism 2
- 1.9 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 9: Scaling laws 1
- 1.10 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 10: Inference
- 1.11 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 11: Scaling laws 2
- 1.12 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 12: Evaluation
- 1.13 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 13: Data 1
- 1.14 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 14: Data 2
- 1.15 Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 15: Alignment – SFT/RLHF






