Normality Assumption Violated in Multiple Regression


Viewing 10 posts - 1 through 10 (of 10 total)

    I ran a normality test (the KS test) and found that two DVs and one IV are not normally distributed. Someone suggested I transform only the DVs to a normal distribution using the Box-Cox transformation (available in Stata), but I am only familiar with SPSS. I’ll be grateful if anyone can suggest how to transform a non-normal distribution to normal in SPSS.


    Many thanks in advance…..


    Hi Shazia,

    First of all, make sure you’re testing the normality of residuals, not the DV. The IVs have no normality assumptions.

    Second, the KS test is over-sensitive for regression. You are better off with a normal probability plot (a Q-Q plot; it’s under Descriptives in later versions of SPSS and under Graphs in earlier ones).

    And what Box-Cox will tell you is which transformation to use: square root, inverse, log, etc. You can just try a few if you can’t run a Box-Cox. Also, I could advise better if I knew how the distribution is non-normal; transformations only fix skew.
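    For anyone following along outside SPSS, here is a minimal sketch of what Box-Cox does, in Python with SciPy (the skewed sample data is invented for illustration):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    # Made-up right-skewed positive data standing in for a skewed DV.
    y = rng.lognormal(mean=1.0, sigma=0.7, size=200)

    # Box-Cox estimates the power (lambda) that best normalizes the data:
    # lambda near 0 suggests a log, near 0.5 a square root, near -1 an inverse.
    y_transformed, lam = stats.boxcox(y)
    print(f"estimated lambda: {lam:.2f}")

    # The transformation should remove most of the skew.
    print(f"skew before: {stats.skew(y):.2f}, after: {stats.skew(y_transformed):.2f}")
    ```

    The estimated lambda is what tells you which of the standard transformations (log, square root, inverse) to apply to the raw variable.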

    Here are some blog posts that you might find helpful:



    Thank you so much, Karen. I’ll follow your suggestions and see what I come up with.


    Note that recent research has shown that the normality assumption for the residuals is relatively unimportant in OLS. Whatever the error distribution, you almost always get the best linear unbiased estimate (BLUE).

    I believe you should pay more attention to the assumption that the regressors are uncorrelated with the error term (the conditional independence, or unconfoundedness, assumption).

    Note: if the residuals are highly non-normal, you may want to transform the dependent variable. Try a logarithm transformation (which requires positive values); Box-Cox is another option.
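    To illustrate the BLUE point above, here is a small Python simulation (all numbers invented for the sketch) showing that OLS slope estimates stay centred on the true value even when the errors are strongly skewed:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # Simulated model: y = 2 + 3*x + eps, with skewed (exponential) errors.
    true_slope = 3.0
    n, reps = 100, 2000
    slopes = []
    for _ in range(reps):
        x = rng.uniform(0.0, 1.0, n)
        eps = rng.exponential(1.0, n) - 1.0  # heavily skewed, mean-zero errors
        y = 2.0 + true_slope * x + eps
        # Closed-form simple-OLS slope: cov(x, y) / var(x).
        slopes.append(np.cov(x, y, bias=True)[0, 1] / np.var(x))

    # Despite the non-normal errors, the average estimate sits near 3.
    print(f"mean estimated slope: {np.mean(slopes):.3f}")
    ```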


    It depends upon the shape of the distribution. What does it look like?


    It’s a normal distribution now; we used Box-Cox to normalize the data. Thank you all for your replies.

    Tammy Khabo

    Hi Everyone,

    I have a question which I think is about the same thing.

    I have a sample of 193 people. I’m also doing a multiple regression analysis, and my 2 IVs and 4 DVs aren’t normally distributed. The KS test was significant, and the skewness and kurtosis statistics suggest the same thing. A log transformation did not manage to bring the variables to normality. I was wondering whether this matters, since it’s a regression analysis? If it doesn’t matter, could anyone give me a reference I can cite? I have looked in Tabachnick and Fidell and cannot seem to find anything on it.

    I did the residual checks by plotting the standardized residuals (ZRESID) against the standardized predicted values (ZPRED). The residuals were linear, homoscedastic, normal, and uncorrelated. Is that all that matters?
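    For readers without SPSS, the same residual checks can be sketched in Python with statsmodels (the data below is simulated; the actual variables from the post are not available):

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)

    # Simulated stand-in for a sample of 193 cases with 2 IVs.
    n = 193
    X = rng.normal(size=(n, 2))
    y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(size=n)

    model = sm.OLS(y, sm.add_constant(X)).fit()

    # Standardized residuals vs. standardized predicted values:
    # look for curvature (nonlinearity) or a fan shape (heteroscedasticity).
    std_resid = model.resid / model.resid.std()
    std_pred = (model.fittedvalues - model.fittedvalues.mean()) / model.fittedvalues.std()

    print(f"n residuals: {len(std_resid)}")
    print(f"max |standardized residual|: {np.abs(std_resid).max():.2f}")
    ```

    These are the same diagnostics SPSS produces from the regression plots dialog, just computed by hand.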


    Tammy Khabo

    I have now found something in Tabachnick and Fidell that discusses it.


    A distribution that is not normal is not “abnormal”; it is simply a different distribution. Most robustness studies have shown that linear regression is quite robust. One should be concerned, as you clearly are, about the distribution of the dependent variables and about heteroscedasticity. If you can determine the nature of the distribution obtained, you might consider using nonlinear regression, which is also available in SPSS. If your problem is only a matter of “normalizing” the dependent variable, then use a simple transformation such as a square root or power transformation.


    I just came across this thread while facing a similar problem. I have a large data set (n = 1700) with three dependent variables and lots of independent variables. I know from one of my supervisors, who used structural equation modelling in his research, that for large sample sizes (N > 1000) the assumption of normality is not that important, as it does not introduce much bias in the results. My first question is: would the same apply to regression? Any references in this regard would be greatly appreciated.


    Secondly, I do understand that only the residuals need to meet the normality requirement. I tested mine and looked at the histograms and P-P plots produced by the linear regression procedure, and they don’t look that bad to me. However, when I save the unstandardized residuals and use the ‘Explore’ command in SPSS to run the KS test, it fails to satisfy the assumption of normality. Any thoughts?


    Also, which is the better visual test for normality: the P-P plot of residuals or the Q-Q plot? Can someone please explain why?
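    A rough Python sketch (with invented heavy-tailed residuals) shows the practical difference between the two plots: a Q-Q plot compares quantiles, so tail departures show up strongly, while a P-P plot compares cumulative probabilities, which are squeezed into [0, 1], so tail problems look mild. That generally makes the Q-Q plot the more useful diagnostic for regression residuals, where tail behaviour matters most.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)

    # Invented heavy-tailed "residuals" (t distribution with 3 df),
    # standardized to mean 0 and sd 1.
    z = rng.standard_t(df=3, size=500)
    z = (z - z.mean()) / z.std()
    z_sorted = np.sort(z)
    probs = (np.arange(1, z.size + 1) - 0.5) / z.size

    # Q-Q: sample quantiles vs. theoretical normal quantiles.
    qq_dev = np.abs(z_sorted - stats.norm.ppf(probs)).max()
    # P-P: empirical cumulative probabilities vs. normal CDF values.
    pp_dev = np.abs(stats.norm.cdf(z_sorted) - probs).max()

    # The same heavy tails produce a large Q-Q deviation but only a
    # modest P-P deviation, because CDF values are bounded.
    print(f"max Q-Q deviation: {qq_dev:.2f}")
    print(f"max P-P deviation: {pp_dev:.2f}")
    ```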


