This is an introductory-level course in supervised learning, with a focus on regression and classification methods. The syllabus includes: linear and polynomial regression, logistic regression, and linear discriminant analysis; cross-validation and the bootstrap; model selection and regularization methods (ridge and lasso); nonlinear models, splines, and generalized additive models; tree-based methods, random forests, and boosting; support vector machines; neural networks and deep learning; survival models; and multiple testing. Some unsupervised learning methods are also discussed: principal components and clustering (k-means and hierarchical).
This is not a math-heavy class, so we try to describe the methods without heavy reliance on formulas and complex mathematics, focusing on what we consider the important elements of modern data science. Computing is done in Python: dedicated lectures teach Python from the ground up, progressing to more detailed sessions that implement the techniques in each chapter.
The lectures cover all the material in An Introduction to Statistical Learning, with Applications in Python by James, Witten, Hastie, Tibshirani and Taylor (Springer, 2023). The PDF of the book is available for free on the book's website.
What You’ll Learn:
- Overview of statistical learning
- Linear regression
- Classification
- Resampling methods
- Linear model selection and regularization
- Moving beyond linearity
- Tree-based methods
- Support vector machines
- Deep learning
- Survival modeling
- Unsupervised learning
- Multiple testing
Course Features
- Lectures: 108
- Quizzes: 0
- Duration: 25 hours
- Skill level: Intermediate
- Language: English
- Students: 25
- Assessments: Yes
Curriculum
- Sections: 13
- Lessons: 108
- Weeks: 10
- Statistical Learning with Python (4 lessons)
- Regression Models (8 lessons)
- 2.1 Introduction to Regression Models
- 2.2 Dimensionality and Structured Models
- 2.3 Model Selection and the Bias-Variance Tradeoff
- 2.4 Classification
- 2.Py Setting Up Python (2023)
- 2.Py Data Types, Arrays, and Basics (2023)
- 2.Py Graphics (2023)
- 2.Py Indexing and Dataframes (2023)
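To give a flavor of the 2.Py sessions above, here is a minimal sketch of the Python groundwork they cover: arrays, a DataFrame, and a basic plot. The library choices (NumPy, pandas, matplotlib) mirror the course stack, though the exact lab code differs.

```python
# Arrays, a DataFrame, and a plot -- the kind of groundwork the 2.Py sessions cover.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 50)                  # a NumPy array of 50 evenly spaced points
df = pd.DataFrame({"x": x, "y": np.sin(x)})  # a small pandas DataFrame

print(df.head())                            # indexing and inspecting a DataFrame
plt.plot(df["x"], df["y"])                  # basic graphics
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.show()
```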
- Linear Regression (8 lessons)
- 3.1 Simple Linear Regression
- 3.2 Hypothesis Testing and Confidence Intervals
- 3.3 Multiple Linear Regression
- 3.4 Some Important Questions
- 3.5 Extensions of the Linear Model
- 3.Py Linear Regression and the statsmodels Package (2023)
- 3.Py Multiple Linear Regression (2023)
- 3.Py Interactions, Qualitative Predictors, and Other Details (2023)
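A minimal statsmodels sketch of the regression workflow the 3.Py labs above walk through; the data here is simulated purely for illustration (the labs use the book's own datasets).

```python
# Simple and multiple linear regression with statsmodels, with t-tests and
# confidence intervals in the summary output. Data is simulated for illustration.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({"TV": rng.uniform(0, 300, 200),
                   "radio": rng.uniform(0, 50, 200)})
df["sales"] = 3 + 0.05 * df["TV"] + 0.1 * df["radio"] + rng.normal(0, 1, 200)

X = sm.add_constant(df[["TV", "radio"]])   # add the intercept column
model = sm.OLS(df["sales"], X).fit()
print(model.summary())                     # coefficients, t-tests, confidence intervals
```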
- Classification Problems (12 lessons)
- 4.1 Introduction to Classification Problems
- 4.2 Logistic Regression
- 4.3 Multivariate Logistic Regression
- 4.4 Logistic Regression: Case-Control Sampling and Multiclass
- 4.5 Discriminant Analysis
- 4.6 Gaussian Discriminant Analysis (One Variable)
- 4.7 Gaussian Discriminant Analysis (Many Variables)
- 4.8 Generalized Linear Models
- 4.9 Quadratic Discriminant Analysis and Naive Bayes
- 4.Py Logistic Regression (2023)
- 4.Py Linear Discriminant Analysis (LDA) (2023)
- 4.Py K-Nearest Neighbors (KNN) (2023)
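A sketch of the three classifiers from the 4.Py labs above, fit with scikit-learn on synthetic data; the labs themselves work with the book's datasets via the companion ISLP package.

```python
# Logistic regression, LDA, and KNN -- the three classifiers from the 4.Py labs --
# fit on synthetic data with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=4, random_state=0)

for clf in (LogisticRegression(),
            LinearDiscriminantAnalysis(),
            KNeighborsClassifier(n_neighbors=5)):
    clf.fit(X, y)
    print(type(clf).__name__, "training accuracy:", round(clf.score(X, y), 3))
```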
- Cross-Validation (7 lessons)
- 5.1 Cross-Validation
- 5.2 K-Fold Cross-Validation
- 5.3 Cross-Validation: The Wrong and Right Way
- 5.4 The Bootstrap
- 5.5 More on the Bootstrap
- 5.Py Cross-Validation (2023)
- 5.Py The Bootstrap (2023)
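The 5.Py labs above cover both resampling ideas; here is a compact scikit-learn/NumPy rendition (not the lab code itself) of K-fold cross-validation and a bootstrap standard error.

```python
# K-fold cross-validation of test MSE, and a bootstrap estimate of the
# standard error of a regression coefficient.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=3, noise=10, random_state=0)

# 10-fold CV estimate of test MSE
mse = -cross_val_score(LinearRegression(), X, y,
                       cv=10, scoring="neg_mean_squared_error")
print("CV MSE:", mse.mean())

# Bootstrap: resample the rows with replacement and recompute the statistic
rng = np.random.default_rng(0)
coefs = [LinearRegression().fit(X[idx], y[idx]).coef_[0]
         for idx in (rng.integers(0, len(y), len(y)) for _ in range(1000))]
print("Bootstrap SE of first coefficient:", np.std(coefs))
```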
- Best Subset Selection (12 lessons)
- 6.1 Introduction and Best Subset Selection
- 6.2 Stepwise Selection
- 6.3 Backward Stepwise Selection
- 6.4 Estimating Test Error
- 6.5 Validation and Cross-Validation
- 6.6 Shrinkage Methods and Ridge Regression
- 6.7 The Lasso
- 6.8 Tuning Parameter Selection
- 6.9 Dimension Reduction Methods
- 6.10 Principal Components Regression and Partial Least Squares
- 6.Py Stepwise Regression (2023)
- 6.Py Ridge Regression and the Lasso (2023)
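In the spirit of the 6.Py lab above, a sketch of ridge and lasso fits with the tuning parameter chosen by cross-validation; standardizing the predictors first matters for shrinkage methods.

```python
# Ridge and lasso with cross-validated tuning parameters, on standardized inputs.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10, random_state=0)

ridge = make_pipeline(StandardScaler(), RidgeCV(alphas=np.logspace(-3, 3, 50)))
lasso = make_pipeline(StandardScaler(), LassoCV(cv=10))
for model in (ridge, lasso):
    model.fit(X, y)
    print(type(model[-1]).__name__, "R^2:", round(model.score(X, y), 3))
```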
- Polynomials and Step Functions (7 lessons)
- 7.1 Polynomials and Step Functions
- 7.2 Piecewise Polynomials and Splines
- 7.3 Smoothing Splines
- 7.4 Generalized Additive Models and Local Regression
- 7.Py Polynomial Regression and Step Functions (2023)
- 7.Py Splines (2023)
- 7.Py Generalized Additive Models (GAMs) (2023)
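A small scikit-learn sketch of the basis expansions covered in the 7.Py labs above: polynomial features and a cubic spline basis. (The course's own labs use the ISLP package's helpers; these transformers are stand-ins.)

```python
# Polynomial and spline basis expansions feeding an ordinary linear fit.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures, SplineTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 200)).reshape(-1, 1)
y = np.sin(x).ravel() + rng.normal(0, 0.3, 200)

poly = make_pipeline(PolynomialFeatures(degree=4), LinearRegression()).fit(x, y)
spline = make_pipeline(SplineTransformer(degree=3, n_knots=7),
                       LinearRegression()).fit(x, y)
print("degree-4 polynomial R^2:", round(poly.score(x, y), 3))
print("cubic spline R^2:", round(spline.score(x, y), 3))
```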
- Tree-Based Methods (7 lessons)
- 8.1 Tree-Based Methods
- 8.2 More Details on Trees
- 8.3 Classification Trees
- 8.4 Bagging
- 8.5 Boosting
- 8.6 Bayesian Additive Regression Trees
- 8.Py Tree-Based Methods (2023)
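A sketch of the two ensemble workhorses from the 8.Py lab above: a random forest (bagging with decorrelated trees) and gradient boosting, compared by cross-validation.

```python
# Random forest vs. gradient boosting on a regression problem, compared by 5-fold CV.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=10, noise=5, random_state=0)

forest = RandomForestRegressor(n_estimators=500, max_features="sqrt", random_state=0)
boost = GradientBoostingRegressor(n_estimators=500, learning_rate=0.01, max_depth=3)
for model in (forest, boost):
    r2 = cross_val_score(model, X, y, cv=5).mean()
    print(type(model).__name__, "CV R^2:", round(r2, 3))
```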
- Optimal Separating Hyperplane (6 lessons)
- 9.1 Optimal Separating Hyperplane
- 9.2 Support Vector Classifier
- 9.3 Feature Expansion and the SVM
- 9.4 Example and Comparison with Logistic Regression
- 9.Py Support Vector Machines (2023)
- 9.Py ROC Curves (2023)
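A sketch matching the 9.Py labs above: a support vector machine with a radial kernel, then an ROC curve computed from its held-out scores.

```python
# An SVM with a radial (RBF) kernel, plus an ROC curve and AUC on held-out data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import roc_curve, roc_auc_score

X, y = make_classification(n_samples=400, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svm = SVC(kernel="rbf", C=1.0, probability=True).fit(X_tr, y_tr)
scores = svm.predict_proba(X_te)[:, 1]       # scores for the positive class
fpr, tpr, _ = roc_curve(y_te, scores)        # points on the ROC curve
print("test AUC:", round(roc_auc_score(y_te, scores), 3))
```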
- Neural Networks (11 lessons)
- 10.1 Introduction to Neural Networks
- 10.2 Convolutional Neural Networks
- 10.3 Document Classification
- 10.4 Recurrent Neural Networks
- 10.5 Time Series Forecasting
- 10.6 Fitting Neural Networks
- 10.7 Interpolation and Double Descent
- 10.Py Single-Layer Model: Hitters Data (2023)
- 10.Py Multilayer Model: MNIST Digit Data (2023)
- 10.Py Convolutional Neural Network: CIFAR Image Data (2023)
- 10.Py Document Classification and Recurrent Neural Networks (2023)
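The 10.Py labs above build their networks in a dedicated deep learning framework; purely for flavor, here is a single-hidden-layer fit with scikit-learn's MLPRegressor, a stand-in that shows the shape of the workflow rather than the lab's own approach.

```python
# A single-hidden-layer neural network (50 units) on scaled inputs,
# in the spirit of the single-layer Hitters lab.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

X, y = make_regression(n_samples=500, n_features=19, noise=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

net = make_pipeline(StandardScaler(),
                    MLPRegressor(hidden_layer_sizes=(50,), max_iter=2000,
                                 random_state=0))
net.fit(X_tr, y_tr)
print("test R^2:", round(net.score(X_te, y_te), 3))
```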
- Survival Data and Censoring (6 lessons)
- 11.1 Introduction to Survival Data and Censoring
- 11.2 The Proportional Hazards Model
- 11.3 Estimation of the Cox Model, with Examples
- 11.4 Model Evaluation and Further Topics
- 11.Py Cox Model: Brain Cancer Data (2023)
- 11.Py Cox Model: Publication Data (2023)
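A sketch of fitting a Cox proportional hazards model like the ones in the 11.Py labs above. It uses the lifelines package on simulated data; the package choice and the dataset are assumptions for illustration, not the lab code.

```python
# A Cox proportional hazards fit on a small simulated dataset,
# using the lifelines package (an assumed stand-in for the lab's tooling).
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.uniform(40, 80, n),
    "treatment": rng.integers(0, 2, n),
})
df["time"] = rng.exponential(10 / (1 + 0.5 * df["treatment"]), n)  # survival times
df["event"] = rng.integers(0, 2, n)        # 1 = event observed, 0 = censored

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")
cph.print_summary()                        # hazard ratios with confidence intervals
```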
- Principal Components (9 lessons)
- 12.1 Principal Components
- 12.2 Higher-Order Principal Components
- 12.3 K-Means Clustering
- 12.4 Hierarchical Clustering
- 12.5 Matrix Completion
- 12.6 Breast Cancer Example
- 12.Py Principal Components (2023)
- 12.Py Clustering (2023)
- 12.Py Application: NCI60 Data (2023)
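A compact scikit-learn rendition of the core of the 12.Py labs above: principal components, then k-means and hierarchical clustering on the same scaled data.

```python
# PCA plus k-means and hierarchical (agglomerative) clustering on scaled data.
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.preprocessing import StandardScaler

X, _ = make_blobs(n_samples=300, centers=4, n_features=10, random_state=0)
X = StandardScaler().fit_transform(X)      # scale before PCA/clustering

pca = PCA(n_components=2).fit(X)
print("variance explained:", pca.explained_variance_ratio_)

km = KMeans(n_clusters=4, n_init=20, random_state=0).fit(X)
hc = AgglomerativeClustering(n_clusters=4, linkage="complete").fit(X)
print("k-means labels:", km.labels_[:10])
print("hierarchical labels:", hc.labels_[:10])
```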
- Hypothesis Testing (11 lessons)
- 13.1 Introduction to Hypothesis Testing I
- 13.1 Introduction to Hypothesis Testing II
- 13.2 Introduction to Multiple Testing and the Family-Wise Error Rate
- 13.3 The Bonferroni Method for Controlling the FWER
- 13.4 Holm's Method for Controlling the FWER
- 13.5 The False Discovery Rate and the Benjamini-Hochberg Method
- 13.6 Resampling Approaches I
- 13.6 Resampling Approaches II
- 13.Py Multiple Testing (2023)
- 13.Py The False Discovery Rate (2023)
- 13.Py Multiple Testing and Resampling (2023)
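A sketch of the multiple-testing corrections from the 13.Py labs above: Bonferroni and Holm control the FWER, Benjamini-Hochberg controls the FDR, all applied to a batch of t-tests via statsmodels.

```python
# Bonferroni, Holm, and Benjamini-Hochberg corrections applied to
# one-sample t-tests on 110 features (100 null, 10 with a real shift).
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
data = rng.normal(0, 1, (110, 50))
data[100:] += 0.7                          # the last 10 features have a real shift

pvals = np.array([stats.ttest_1samp(row, 0).pvalue for row in data])
for method in ("bonferroni", "holm", "fdr_bh"):
    reject, _, _, _ = multipletests(pvals, alpha=0.05, method=method)
    print(method, "rejections:", reject.sum())
```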