Thursday, March 12, 2009

Don't standardize interaction/moderator effects in multiple regression

This is another quick blog entry related to a query I had today.

In my moderated multiple regression workshop a while back I wrote (on slide 15) "don't use standardized regression coefficients". In the talk itself I briefly mentioned why and directed people to the relevant section of the (excellent) explanation by Kris Preacher:


I'd like to clarify why standardization is a particularly bad idea in this case. My dislike of 'standardization' is fairly well known, and it goes without saying (I hope) that one reason not to use standardized regression coefficients relates to this. I hope to write about this in detail soon (but check out my BJP article on effect size for further details).

However, there are additional reasons why standardizing predictors will cause trouble in moderated multiple regression. Standardization involves centering the predictors and scaling them in terms of their sample SDs. Centering is very often a useful thing to do in moderated multiple regression. However, statistics packages such as SPSS will standardize all the predictors - including the product terms - in moderated multiple regression. This is because they have no way of knowing that the product term is not a 'regular' predictor (similarly, if anyone were foolish enough to do a stepwise regression ... the software would not know to keep in X1 and X2 for each X1.X2 product term). This means that the X1.X2 product term will itself be standardized, just like X1 and X2, rather than being computed (correctly) as the product of the two standardized predictors (i.e., Zx1 and Zx2 multiplied together).
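To make the distinction concrete, here is a minimal sketch in Python with NumPy (not SPSS). It assumes simulated data, and the variable names, the seed and the small `ols` helper are purely illustrative. It fits the same moderated regression twice: once with the product term standardized as if it were an ordinary predictor (roughly what a package does when it reports standardized coefficients), and once with the product of the two standardized predictors.

```python
import numpy as np

def zscore(a):
    # Center and scale by the sample SD (ddof=1).
    return (a - a.mean()) / a.std(ddof=1)

def ols(y, *cols):
    # Ordinary least squares with an intercept; returns coefficients and SEs.
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

# Simulated (hypothetical) data with a genuine X1.X2 interaction.
rng = np.random.default_rng(2009)
n = 200
x1 = rng.normal(50, 10, n)
x2 = rng.normal(5, 2, n)
y = 1 + 0.3 * x1 + 2 * x2 + 0.05 * x1 * x2 + rng.normal(0, 5, n)

zy, z1, z2 = zscore(y), zscore(x1), zscore(x2)
z_of_product = zscore(x1 * x2)  # product term standardized like any other predictor
product_of_z = z1 * z2          # product of the standardized predictors

b_pkg, se_pkg = ols(zy, z1, z2, z_of_product)
b_ok, se_ok = ols(zy, z1, z2, product_of_z)

# Index 3 is the interaction term in both fits.
t_pkg = b_pkg[3] / se_pkg[3]
t_ok = b_ok[3] / se_ok[3]

print("interaction coefficient, standardized product: ", b_pkg[3])
print("interaction coefficient, product of z-scores:  ", b_ok[3])
print("interaction t statistics:", t_pkg, t_ok)
```

In a run like this the two interaction coefficients differ while the t statistic for the interaction is the same in both fits, which is the pattern discussed in the next paragraph. The sketch says nothing about whether the standard errors from the second fit are appropriate for interval estimates.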

That's clearly a problem (the t test should be OK but the value of the coefficient and simple slopes will be wrong). However, the clincher is that even if the correct standardization is carried out (e.g., computing the standardized predictors yourself and then taking the product of the relevant standardized predictors and entering it into the regression) the standard errors will be incorrect (which is problematic for constructing confidence and prediction intervals).

In summary, you can't rely on the software to get the standardization correct so use the unstandardized regression coefficients and standard errors! (Also, did I mention that standardized coefficients are generally a bad idea anyway?).



