Introduction to Statistical Learning - PISE
Ca’ Foscari University of Venice
This unit covers Principal Components Analysis (PCA)
PCA produces a low-dimensional representation of a dataset
It finds a sequence of linear combinations of the variables that have maximal variance and are mutually uncorrelated
Apart from producing derived variables for use in supervised learning problems, PCA also serves as a tool for data visualization
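As an illustrative sketch (assumed synthetic data, not an example from the notes), the low-dimensional representation can be computed by projecting the centred data matrix onto its leading right singular vectors:

```python
import numpy as np

# Assumed synthetic dataset: 100 observations of 3 correlated variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) @ np.array([[2.0, 0.3, 0.1],
                                          [0.3, 1.0, 0.2],
                                          [0.1, 0.2, 0.5]])
Xc = X - X.mean(axis=0)          # centre each variable

# Rows of Vt are the principal component loading vectors.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Scores on the first two components: a 2-D representation of the data,
# suitable for a scatterplot visualization.
Z = Xc @ Vt[:2].T
print(Z.shape)
```

The two score columns are mutually uncorrelated, as required of principal components.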
The first principal component of the features \(X_1, X_2, \ldots, X_p\) is the normalized linear combination \[ Z_1 = \phi_{11}X_1 + \phi_{21}X_2 + \cdots + \phi_{p1}X_p \] that has the largest variance
Normalization constraint: \[ \sum_{j=1}^{p} \phi_{j1}^2 = 1 \]
The coefficients \(\phi_{11}, \ldots, \phi_{p1}\) are called the loadings
These form the principal component loading vector: \[ \phi_1 = (\phi_{11}, \phi_{21}, \ldots, \phi_{p1})^T \]
This constraint prevents the loadings from growing arbitrarily large, which would otherwise inflate the variance of \(Z_1\) artificially
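The definition above can be checked numerically. A minimal sketch (assumed random data): the first loading vector \(\phi_1\) is the top right singular vector of the centred data matrix, its squared loadings sum to 1, and the resulting scores have the largest variance among the components:

```python
import numpy as np

# Assumed synthetic data: 200 observations of p = 4 variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
Xc = X - X.mean(axis=0)

_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
phi1 = Vt[0]                 # loading vector (phi_11, ..., phi_p1)

print(np.sum(phi1**2))       # ~ 1.0: the normalization constraint holds

# First principal component scores; their variance is maximal
# among all normalized linear combinations of the variables.
z1 = Xc @ phi1
print(z1.var() >= (Xc @ Vt[1]).var())
```

Singular vectors are returned with unit norm, so the normalization constraint is satisfied by construction.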
Video SL 3.1 Simple Linear Regression - 13:02
Video SL 3.3 Multiple Linear Regression - 15:38
Video SL 3.4 Some Important Questions - 14:52
Video SL 3.5 Extensions of the Linear Model - 14:17
Video SL 6.1 Introduction and Best Subset Selection - 13:45
Video SL 6.3 Backward Stepwise Selection - 5:27