Day 13: Multiple Regression in SPSS – Predicting with Multiple Variables
Welcome to Day 13 of your 50-day SPSS learning journey! Today, we’ll explore Multiple Regression, a powerful statistical technique for predicting a dependent variable based on two or more independent variables. This method allows you to build more comprehensive models by incorporating multiple predictors.
What is Multiple Regression?
Multiple Regression extends Simple Linear Regression by including multiple independent variables. The relationship is expressed by the equation:
Y = a + b₁X₁ + b₂X₂ + b₃X₃ + … + e
Where:
- Y = Dependent variable (outcome).
- X₁, X₂, X₃ = Independent variables (predictors).
- a = Intercept (value of Y when all X’s are 0).
- b₁, b₂, b₃ = Regression coefficients (impact of each predictor on Y).
- e = Error term (unexplained variance).
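The equation above is just a weighted sum plus an intercept. As a minimal sketch (all coefficient values here are made up purely for illustration), the arithmetic looks like this in Python:

```python
# Sketch of Y = a + b1*X1 + b2*X2 + ... (error term omitted: this is a point prediction)
def predict_y(a, coefs, xs):
    # a: intercept; coefs: regression coefficients b1..bk; xs: predictor values X1..Xk
    return a + sum(b * x for b, x in zip(coefs, xs))

# Hypothetical model: intercept 5, coefficients 2.0 and 0.5
y_hat = predict_y(a=5.0, coefs=[2.0, 0.5], xs=[3.0, 10.0])
print(y_hat)  # 5 + 2*3 + 0.5*10 = 16.0
```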
When to Use Multiple Regression?
Use Multiple Regression when:
- You want to predict an outcome using multiple factors.
- The dependent variable is continuous, and the predictors are continuous or dichotomous (binary).
- You have reason to expect a linear relationship between each predictor and the dependent variable.
How to Perform Multiple Regression in SPSS
Step 1: Open Your Dataset
For this example, use the following dataset:
| ID | Hours_Studied | Attendance | Test_Score |
|----|---------------|------------|------------|
| 1  | 2             | 60         | 50         |
| 2  | 4             | 70         | 60         |
| 3  | 6             | 80         | 70         |
| 4  | 8             | 90         | 80         |
| 5  | 10            | 95         | 90         |
- Test_Score: Dependent variable (Y).
- Hours_Studied and Attendance: Independent variables (X₁, X₂).
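SPSS runs this fit through menus, but for readers who want to cross-check the result outside SPSS, here is a NumPy sketch of the same ordinary least squares fit on the dataset above (an optional aside, not part of the SPSS workflow):

```python
import numpy as np

# The tutorial's example dataset
hours = np.array([2, 4, 6, 8, 10], dtype=float)
attendance = np.array([60, 70, 80, 90, 95], dtype=float)
score = np.array([50, 60, 70, 80, 90], dtype=float)

# Design matrix: a column of ones (intercept) plus the two predictors
X = np.column_stack([np.ones_like(hours), hours, attendance])

# Ordinary least squares: finds a, b1, b2 minimizing the squared errors
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
a, b1, b2 = coef

# R-squared: 1 - (residual sum of squares / total sum of squares)
pred = X @ coef
ss_res = np.sum((score - pred) ** 2)
ss_tot = np.sum((score - score.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
print(f"intercept={a:.2f}, b1={b1:.2f}, b2={b2:.2f}, R²={r_squared:.3f}")
```

On this tiny dataset Test_Score tracks Hours_Studied almost exactly, so expect an R² very close to 1 — real data will be much noisier.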
Step 2: Access the Regression Tool
- Go to Analyze > Regression > Linear.
- A dialog box will appear.
Step 3: Select Variables
- Move Test_Score to the Dependent box.
- Move Hours_Studied and Attendance to the Independent(s) box.
Step 4: Customize Options
- Click Statistics and check:
- Estimates: To see regression coefficients.
- Model Fit: To view R-squared and ANOVA.
- Collinearity Diagnostics: To check for multicollinearity (if predictors are highly correlated).
- Click OK to run the analysis.
Interpreting the Output
The SPSS output includes the following key sections:
1. Model Summary Table
- R-Square: Proportion of variance in the dependent variable explained by the predictors.
  - Example: If R-Square = 0.90, 90% of the variation in Test_Score is explained by Hours_Studied and Attendance.
- Adjusted R-Square: Adjusted for the number of predictors in the model.
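Adjusted R-Square applies a penalty for each extra predictor. The standard formula is 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the sample size and k the number of predictors; a quick sketch:

```python
def adjusted_r_squared(r2, n, k):
    # r2: R-Square from the model; n: sample size; k: number of predictors
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# With R-Square = 0.90, n = 50 cases and k = 2 predictors:
print(round(adjusted_r_squared(0.90, 50, 2), 4))  # 0.8957 — slightly below R-Square
```

Because of the penalty, Adjusted R-Square is always at or below R-Square, which is why it is the better yardstick when comparing models with different numbers of predictors.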
2. ANOVA Table
- Tests whether the regression model is statistically significant.
- Look at the Sig. value: If p < 0.05, the model is significant.
3. Coefficients Table
- Shows the regression equation:
  - Unstandardized Coefficients (B): Used to construct the equation.
    - Example: Test_Score = 10 + 5 * Hours_Studied + 0.3 * Attendance
      - For every additional hour studied, the test score increases by 5 points.
      - For every 1% increase in attendance, the test score increases by 0.3 points.
  - Standardized Coefficients (Beta): Compare the relative importance of predictors.
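Plugging values into the example equation shows how predictions are formed (the student profile below is hypothetical):

```python
def predict_score(hours_studied, attendance):
    # Coefficients taken from the example equation in the text
    return 10 + 5 * hours_studied + 0.3 * attendance

# Hypothetical student: 6 hours of study, 80% attendance
print(predict_score(6, 80))  # 10 + 30 + 24 = 64.0
```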
4. Collinearity Diagnostics
- Variance Inflation Factor (VIF): If VIF > 10, multicollinearity may be an issue.
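With only two predictors, VIF can be computed by hand from their correlation: VIF = 1 / (1 − r²). A sketch on this tutorial's own dataset:

```python
import numpy as np

hours = np.array([2, 4, 6, 8, 10], dtype=float)
attendance = np.array([60, 70, 80, 90, 95], dtype=float)

# With two predictors, each one's VIF is 1 / (1 - r^2),
# where r is the correlation between them
r = np.corrcoef(hours, attendance)[0, 1]
vif = 1 / (1 - r ** 2)
print(f"r = {r:.3f}, VIF = {vif:.1f}")
```

Here r comes out around 0.99, giving a VIF far above the common cutoff of 10 — the toy dataset is itself a good example of the multicollinearity this diagnostic flags.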
Practice Example: Build a Multiple Regression Model
Use the following dataset:
| ID | Income | Education_Level | Work_Experience | Satisfaction |
|----|--------|-----------------|-----------------|--------------|
| 1  | 30000  | 16              | 2               | 7            |
| 2  | 40000  | 18              | 4               | 8            |
| 3  | 50000  | 20              | 6               | 9            |
| 4  | 60000  | 18              | 8               | 8            |
| 5  | 70000  | 16              | 10              | 9            |
- Perform a Multiple Regression with Satisfaction as the dependent variable.
- Use Income, Education_Level, and Work_Experience as predictors.
- Interpret the R-squared value, coefficients, and significance levels.
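Before fitting, it is worth screening the practice predictors for collinearity — notice that in this small dataset Income rises in lockstep with Work_Experience. A quick correlation check (outside SPSS, as an aside):

```python
import numpy as np

income = np.array([30000, 40000, 50000, 60000, 70000], dtype=float)
education = np.array([16, 18, 20, 18, 16], dtype=float)
experience = np.array([2, 4, 6, 8, 10], dtype=float)

# Pairwise correlations among the predictors
r_inc_exp = np.corrcoef(income, experience)[0, 1]
r_inc_edu = np.corrcoef(income, education)[0, 1]
print(f"Income vs Experience: r = {r_inc_exp:.3f}")
print(f"Income vs Education:  r = {r_inc_edu:.3f}")
```

When two predictors are perfectly (or near-perfectly) correlated, SPSS's collinearity diagnostics will flag them, and one of the pair should usually be dropped from the model.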
Common Mistakes to Avoid
- Ignoring Multicollinearity: If predictors are highly correlated, it can distort the model. Check VIF values in the output.
- Overfitting: Adding too many predictors can lead to overfitting. Use Adjusted R-Square to evaluate model performance.
- Assuming Causation: Regression shows relationships, not causation. Consider the context of your data.
Key Takeaways
- Multiple Regression predicts a dependent variable using multiple predictors.
- Use R-squared, p-values, and coefficients to assess model fit and interpret relationships.
- Always check for multicollinearity to ensure your model is reliable.
What’s Next?
In Day 14 of your 50-day SPSS learning journey, we’ll explore Assumptions of Regression Analysis in SPSS. You’ll learn how to test for normality, linearity, homoscedasticity, and multicollinearity to ensure your regression models are valid.