Regression Analysis
VI. MULTIPLE REGRESSION
Preface. You can profit from what you already know about simple regression (one predictor) to help understand multiple regression (more than one predictor). Therefore, in this section of the course notes, I mimic the earlier course notes on simple regression. I especially call attention to what is new or different. Essentially, the new or different material results from the manifold ways that the variables can interact. For example, with one predictor, there is only one interaction: Y with X. But with two predictors, Y can interact with X1, Y with X2, Y with X1 and X2 jointly, and X1 with X2. A key to understanding how multiple regression differs from simple regression is the ceteris paribus concept, which will be explained.
Multiple regression tries to explain and/or predict one variable (the response variable, often called the dependent variable) in terms of other variables (the predictor variables, often called the independent variables). Multiple regression = regression with more than one predictor.
Ex: RENT = 143.67 + 0.3875*AREA + 89.93*BATHROOMS explains/predicts monthly apartment rent in terms of the area of the apartment in square feet and the number of bathrooms in the apartment.
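To make the example concrete, here is a minimal sketch that evaluates the regression equation for a single apartment. The function name predict_rent and the apartment used (800 square feet, 1 bathroom) are my own illustrative assumptions, not data from the course.

```python
# Coefficients copied from the example regression equation above.
INTERCEPT = 143.67
B_AREA = 0.3875        # change in monthly rent per additional square foot
B_BATHROOMS = 89.93    # change in monthly rent per additional bathroom


def predict_rent(area_sqft, bathrooms):
    """Predicted monthly rent from the example regression equation."""
    return INTERCEPT + B_AREA * area_sqft + B_BATHROOMS * bathrooms


# Hypothetical apartment (an assumption for illustration): 800 sq ft, 1 bathroom.
rent_one_bath = predict_rent(800, 1)

# Same area, one more bathroom: the prediction changes by exactly B_BATHROOMS.
rent_two_bath = predict_rent(800, 2)

print(round(rent_one_bath, 2))
print(round(rent_two_bath - rent_one_bath, 2))
```

Holding AREA fixed and changing only BATHROOMS moves the prediction by the BATHROOMS coefficient alone, which is a small preview of the ceteris paribus idea mentioned in the preface.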
A multiple regression model may have either time series data or cross-sectional data.
There is one column of data for the response variable and one column for each predictor variable.
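The column layout described above might be sketched as follows. All of the numbers are invented purely for illustration, and the variable names (data, response, predictors) are assumptions of this sketch rather than anything prescribed by the course.

```python
# Hypothetical data set (values invented for illustration only).
# Each key is one column; position i across the lists is observation i.
data = {
    "RENT":      [520, 610, 745, 890],     # response variable Y
    "AREA":      [650, 800, 1000, 1200],   # predictor X1
    "BATHROOMS": [1, 1, 2, 2],             # predictor X2
}

response = "RENT"
predictors = [name for name in data if name != response]

# Every column must contain the same number of observations.
n_obs = len(data[response])
if any(len(column) != n_obs for column in data.values()):
    raise ValueError("all columns must have the same length")

# Row view: one tuple (Y, X1, X2) per observation.
rows = list(zip(*data.values()))

print(predictors)
print(rows[0])
```

The same layout applies whether the rows are time periods (time series data) or different units observed at one time (cross-sectional data).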
Symbols: Y denotes the response; X1, X2, …, Xk denote the predictors.
The General Statistical Model applied to Multiple Regression Models
Recall the General Statistical Model:
ACTUAL = FIT + RESIDUAL
(The same general form applies to the random sample, random walk, and simple regression models.)
Applied to multiple regression models:
Actual is the value of the response: Y
Fit is a linear combination of the predictors, called the predicted value or Ŷ:
Ŷ = a + b1*X1 + b2*X2 + … + bk*Xk
This is also called the regression equation.
Residual is the difference: Y – Ŷ
ACTUAL = a + b1*X1 + b2*X2 + … + bk*Xk + RESIDUAL
The Fit is the value that the model says the Actual should be. Regression allows the Fit to vary from one observation to another, depending upon the particular characteristics of the observation.
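As a rough illustration of ACTUAL = FIT + RESIDUAL, the sketch below reuses the coefficients from the rent example and applies them to three made-up apartments. The observations and the function name fit are assumptions of this sketch, not course data, and the coefficients are simply taken as given rather than estimated here.

```python
def fit(area_sqft, bathrooms):
    """Fit from the rent example: a + b1*AREA + b2*BATHROOMS."""
    return 143.67 + 0.3875 * area_sqft + 89.93 * bathrooms


# Hypothetical observations (invented for illustration):
# (actual monthly rent, area in square feet, number of bathrooms)
observations = [
    (520.0, 650, 1),
    (610.0, 800, 1),
    (745.0, 1000, 2),
]

results = []
for actual, area, baths in observations:
    fitted = fit(area, baths)     # what the model says the Actual should be
    residual = actual - fitted    # Residual = Actual - Fit
    results.append((actual, fitted, residual))
    print(f"actual={actual:.2f}  fit={fitted:.2f}  residual={residual:+.2f}")
```

Note that the Fit differs from one apartment to the next because the predictors differ, while Fit plus Residual always reproduces the Actual.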
The Assumptions of the Multiple Regression Model [Ref: Albright 12.2]
Before