Coefficient of Determination
R^2
indicates the proportion of variance in the dependent variable that is predictable from the independent variable
 is close to 1 when the fitted line matches the data well

r^2
is the square of the sample correlation coefficient
 r = 0.7 gives r^2 = 0.49, implying 49% of the variability in y is accounted for by its linear relationship with x, so 51% is unaccounted for
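A minimal sketch of the two definitions above, assuming a simple one-variable least-squares line. The function names and the data are made up for illustration. For this simple case the squared sample correlation and R^2 = 1 - SS_res / SS_tot should agree up to rounding.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_squared_of_line(xs, ys):
    """R^2 = 1 - SS_res / SS_tot for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx             # slope
    a = mean_y - b * mean_x   # intercept
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Illustrative data only (not from any real study).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(xs, ys)
print("r^2 from correlation:", r ** 2)
print("R^2 from residuals:  ", r_squared_of_line(xs, ys))
```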
Ordinary Least Squares (OLS)
Method for estimating the unknown parameters in linear regression
 dataset is n observations {yi, xi}, i from 1 to n, where yi is a scalar value corresponding to some p-dimensional vector xi:
 yi = xi^{T}β + εi
 stacking the n observations gives Y = Xβ + ε; β is a p x 1 vector, X is an n x p matrix whose rows are the xi^{T}, and Y and ε are n x 1 vectors
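A rough sketch of estimating β in Y = Xβ + ε. It assumes X^{T}X is invertible and solves the normal equations (X^{T}X)β = X^{T}Y with a small Gaussian elimination. This is one textbook route, not necessarily what a given library does. The helper names and the toy data are hypothetical.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product of two list-of-rows matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a x = b for square a by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def ols(X, Y):
    """Least-squares estimate of beta from the normal equations."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)                                           # p x p
    XtY = [sum(x * y for x, y in zip(col, Y)) for col in Xt]      # length p
    return solve(XtX, XtY)

# Toy data: n = 4 observations, p = 2 (a column of ones plus one feature).
# Y was generated exactly as 1 + 2*x, so beta should come out near [1, 2].
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Y = [1.0, 3.0, 5.0, 7.0]
beta = ols(X, Y)
print(beta)
```

The normal equations are easy to read but can be numerically fragile when columns of X are nearly collinear, which is one reason decomposition-based solvers are often preferred in practice.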
Generalized Linear Model
for a p-dimensional vector x, the prediction function is yhat(w,x) = w0 + w1x1 + ... + wpxp, where w = (w1,...,wp) are the coefficients and w0 is the intercept
 linear regression fits a linear model to minimize the residual sum of squares between the observed targets and the values predicted by the model
 computed using the SVD of X; for n p-dimensional vectors (n x p matrix), ordinary least squares costs O(np^2), assuming n >= p
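To make the objective concrete, here is a small sketch of the prediction function yhat(w,x) and the residual sum of squares it is fitted to minimize. The function names and the data are illustrative assumptions. The targets were generated from w0 = 1, w = (2, 3), so those values give a residual sum of squares of zero.

```python
def predict(w0, w, x):
    """yhat(w, x) = w0 + w1*x1 + ... + wp*xp for one p-dimensional vector x."""
    return w0 + sum(wj * xj for wj, xj in zip(w, x))

def residual_sum_of_squares(w0, w, X, y):
    """Sum of squared differences between observed targets and predictions."""
    return sum((yi - predict(w0, w, xi)) ** 2 for xi, yi in zip(X, y))

# Toy data: n = 3 observations, p = 2 features.
X = [[1.0, 2.0], [2.0, 0.0], [3.0, 1.0]]
y = [9.0, 5.0, 10.0]

print(residual_sum_of_squares(1.0, [2.0, 3.0], X, y))  # the generating weights
print(residual_sum_of_squares(0.0, [2.0, 3.0], X, y))  # wrong intercept, larger error
```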
Ridge Regression
Ridge regression imposes a penalty on the size of the Ordinary Least Squares coefficients; the ridge coefficients minimize a penalized residual sum of squares
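 shrinkage: a fitted relationship tends to perform less well on a new data set than on the data used for fitting; the ridge penalty counters this by shrinking the coefficients toward zero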
Regression algorithm for high-dimensional data
 numerically efficient when p >> n, for an n x p matrix (n p-dimensional vectors)
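A hedged illustration of the ridge penalty, restricted to the simplest case: one feature and no intercept. Minimizing sum((y - w*x)^2) + alpha*w^2 over w gives w = sum(x*y) / (sum(x^2) + alpha). The name alpha for the penalty strength, the function name, and the data are assumptions made for this sketch.

```python
def ridge_1d(xs, ys, alpha):
    """Ridge coefficient for a single feature with no intercept.

    Minimizes sum((y - w*x)^2) + alpha * w^2; alpha = 0 recovers OLS.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)

# Toy data lying exactly on y = 2x, so the OLS coefficient is 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

for alpha in (0.0, 1.0, 10.0, 100.0):
    # The coefficient moves toward zero as the penalty grows.
    print(alpha, ridge_1d(xs, ys, alpha))
```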
Logistic Regression
Linear model for classification rather than regression. Can be used for binary or multinomial classification
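A minimal sketch of the binary case: a linear score w0 + w.x is passed through the sigmoid to give a probability, and the weights are fitted by plain batch gradient descent on the mean log-loss. The learning rate, epoch count, function names, and toy data are arbitrary choices for illustration; real solvers typically use more robust optimizers and regularization.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w0, w, x):
    """Estimated probability that x belongs to class 1."""
    return sigmoid(w0 + sum(wj * xj for wj, xj in zip(w, x)))

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit intercept w0 and coefficients w by batch gradient descent."""
    n, p = len(X), len(X[0])
    w0, w = 0.0, [0.0] * p
    for _ in range(epochs):
        grad0, grad = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            err = predict_proba(w0, w, xi) - yi   # gradient of log-loss w.r.t. the score
            grad0 += err
            for j in range(p):
                grad[j] += err * xi[j]
        w0 -= lr * grad0 / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w0, w

# Toy one-feature data: negative x is class 0, positive x is class 1.
X = [[-3.0], [-2.0], [-1.0], [1.0], [2.0], [3.0]]
y = [0, 0, 0, 1, 1, 1]

w0, w = fit_logistic(X, y)
labels = [1 if predict_proba(w0, w, x) >= 0.5 else 0 for x in X]
print(labels)
```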