Day 13: Multiple Regression in SPSS – Predicting with Multiple Variables

Welcome to Day 13 of your 50-day SPSS learning journey! Today, we’ll explore Multiple Regression, a powerful statistical technique for predicting a dependent variable based on two or more independent variables. This method allows you to build more comprehensive models by incorporating multiple predictors.


What is Multiple Regression?

Multiple Regression extends Simple Linear Regression by including multiple independent variables. The relationship is expressed by the equation:

Y = a + b₁X₁ + b₂X₂ + b₃X₃ + … + e

Where:

  • Y = Dependent variable (outcome).
  • X₁, X₂, X₃ = Independent variables (predictors).
  • a = Intercept (value of Y when all X’s are 0).
  • b₁, b₂, b₃ = Regression coefficients (impact of each predictor on Y).
  • e = Error term (unexplained variance).

When to Use Multiple Regression?

Use Multiple Regression when:

  1. You want to predict an outcome using multiple factors.
  2. The dependent variable is continuous, and the predictors are continuous or dichotomous (binary).
  3. You expect a roughly linear relationship between each predictor and the dependent variable (and have checked this, e.g., with scatterplots).

How to Perform Multiple Regression in SPSS

Step 1: Open Your Dataset

For this example, use the following dataset:

ID   Hours_Studied   Attendance   Test_Score
1    2               60           50
2    4               70           60
3    6               80           70
4    8               90           80
5    10              95           90
  • Test_Score: Dependent variable (Y).
  • Hours_Studied and Attendance: Independent variables (X₁, X₂).

Step 2: Access the Regression Tool

  1. Go to Analyze > Regression > Linear.
  2. A dialog box will appear.

Step 3: Select Variables

  1. Move Test_Score to the Dependent box.
  2. Move Hours_Studied and Attendance to the Independent(s) box.

Step 4: Customize Options

  1. Click Statistics and check:
    • Estimates: To see regression coefficients.
    • Model Fit: To view R-squared and ANOVA.
    • Collinearity Diagnostics: To check whether predictors are highly correlated with one another (multicollinearity).
  2. Click OK to run the analysis.
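
The dialog steps above can also be run as SPSS syntax — clicking Paste instead of OK in the dialog generates it for you. A sketch matching this example's variable names:

```
REGRESSION
  /STATISTICS COEFF R ANOVA COLLIN TOL
  /DEPENDENT Test_Score
  /METHOD=ENTER Hours_Studied Attendance.
```

Saving syntax like this makes the analysis easy to rerun or adapt when the dataset changes.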

Interpreting the Output

The SPSS output includes the following key sections:

1. Model Summary Table

  • R-Square: Proportion of variance in the dependent variable explained by the predictors.
    • Example: If R-Square = 0.90, 90% of the variation in Test_Score is explained by Hours_Studied and Attendance.
  • Adjusted R-Square: Adjusted for the number of predictors in the model.
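
Adjusted R-Square applies the standard penalty for each extra predictor. A quick sketch of the formula, plugged in with this lesson's example values (R² = 0.90, n = 5 cases, k = 2 predictors):

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# With only 5 cases and 2 predictors, the penalty is substantial:
print(adjusted_r_squared(0.90, n=5, k=2))  # ≈ 0.80
```

This is why Adjusted R-Square, not R-Square, is the better number to compare models with different numbers of predictors.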

2. ANOVA Table

  • Tests whether the regression model is statistically significant.
  • Look at the Sig. value: If p < 0.05, the model is significant.

3. Coefficients Table

  • Shows the regression equation:
    • Unstandardized Coefficients (B): Used to construct the equation.
    • Example:
      Test_Score = 10 + 5 * Hours_Studied + 0.3 * Attendance
      • For every additional hour studied, the test score increases by 5 points.
      • For every 1% increase in attendance, the test score increases by 0.3 points.
  • Standardized Coefficients (Beta): Compare the relative importance of predictors.
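
Plugging values into the example equation above can be sketched in a couple of lines (the coefficients 10, 5, and 0.3 are the illustrative values from this section, not actual SPSS output):

```python
def predict_test_score(hours_studied, attendance):
    """Example equation: Test_Score = 10 + 5 * Hours_Studied + 0.3 * Attendance."""
    return 10 + 5 * hours_studied + 0.3 * attendance

# A student who studied 6 hours with 80% attendance:
print(predict_test_score(6, 80))  # 10 + 30 + 24 = 64.0
```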

4. Collinearity Diagnostics

  • Variance Inflation Factor (VIF): If VIF > 10, multicollinearity may be an issue.
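
With exactly two predictors, VIF can be checked by hand from their Pearson correlation: VIF = 1 / (1 − r²). A sketch in plain Python, run on the Hours_Studied and Attendance columns of the example dataset (which, as it happens, are almost perfectly correlated — a useful illustration of the problem):

```python
def vif_two_predictors(x1, x2):
    """VIF for a two-predictor model: 1 / (1 - r²), r = Pearson correlation."""
    n = len(x1)
    mean1, mean2 = sum(x1) / n, sum(x2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(x1, x2))
    ss1 = sum((a - mean1) ** 2 for a in x1)
    ss2 = sum((b - mean2) ** 2 for b in x2)
    r2 = cov ** 2 / (ss1 * ss2)
    return 1 / (1 - r2)

hours = [2, 4, 6, 8, 10]
attendance = [60, 70, 80, 90, 95]
print(vif_two_predictors(hours, attendance))  # ≈ 82, far above 10
```

In a dataset this collinear, the individual coefficients are unstable even when the overall model fits well — exactly what the Collinearity Diagnostics output is there to flag.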

Practice Example: Build a Multiple Regression Model

Use the following dataset:

ID   Income   Education_Level   Work_Experience   Satisfaction
1    30000    16                2                 7
2    40000    18                4                 8
3    50000    20                6                 9
4    60000    18                8                 8
5    70000    16                10                9
  1. Perform a Multiple Regression with Satisfaction as the dependent variable.
  2. Use Income, Education_Level, and Work_Experience as predictors.
  3. Interpret the R-squared value, coefficients, and significance levels.
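
If you want to sanity-check the coefficients SPSS reports, ordinary least squares can be sketched in pure Python via the normal equations (X′X)b = X′y. Run on the Day 13 example data, the fit works out to Test_Score = 40 + 5·Hours_Studied + 0·Attendance (the scores happen to track study hours exactly):

```python
def ols(X_cols, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved with Gaussian elimination. Returns [intercept, b1, b2, ...]."""
    X = [[1.0] + [col[i] for col in X_cols] for i in range(len(y))]  # intercept column
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]  # X'X
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]              # X'y
    for i in range(k):  # forward elimination with partial pivoting
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, k):
            f = A[r][i] / A[i][i]
            for c in range(i, k):
                A[r][c] -= f * A[i][c]
            b[r] -= f * b[i]
    coefs = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        coefs[i] = (b[i] - sum(A[i][j] * coefs[j] for j in range(i + 1, k))) / A[i][i]
    return coefs

hours = [2, 4, 6, 8, 10]
attendance = [60, 70, 80, 90, 95]
scores = [50, 60, 70, 80, 90]
print(ols([hours, attendance], scores))  # ≈ [40.0, 5.0, 0.0]
```

The same function works for the practice dataset — pass Income, Education_Level, and Work_Experience as the predictor columns and Satisfaction as y, then compare the result to SPSS's Coefficients table.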

Common Mistakes to Avoid

  1. Ignoring Multicollinearity: If predictors are highly correlated, it can distort the model. Check VIF values in the output.
  2. Overfitting: Adding too many predictors can lead to overfitting. Use Adjusted R-Square to evaluate model performance.
  3. Assuming Causation: Regression shows relationships, not causation. Consider the context of your data.

Key Takeaways

  • Multiple Regression predicts a dependent variable using multiple predictors.
  • Use R-squared, p-values, and coefficients to assess model fit and interpret relationships.
  • Always check for multicollinearity to ensure your model is reliable.

What’s Next?

In Day 14 of your 50-day SPSS learning journey, we’ll explore Assumptions of Regression Analysis in SPSS. You’ll learn how to test for normality, linearity, homoscedasticity, and multicollinearity to ensure your regression models are valid.