Normality Assumption Violated in Multiple Regression
This topic has 9 replies, 7 voices, and was last updated 9 years, 4 months ago by Karen Grace-Martin.
28th April 2010 at 3:21 am #4689
Shazia Nauman
Member
I ran a normality test (the KS test) and found that two DVs and one IV are not normally distributed. Someone suggested that I transform only the DVs to a normal distribution using the Box-Cox transformation (available in Stata), but I am only familiar with SPSS. I'd be grateful if anyone can suggest how to transform a non-normal distribution to normal in SPSS.
Many thanks in advance.
28th April 2010 at 3:10 pm #4698
Karen Grace-Martin
Member
Hi Shazia,
First of all, make sure you're testing the normality of the residuals, not the DV. The IVs have no normality assumptions.
Second, the KS test is over-sensitive for regression. You are better off with a normal probability plot (a QQ plot; it's under Descriptives in later versions of SPSS and under Graphs in earlier ones).
And what Box-Cox tells you is which transformation to use: square root, inverse, log, etc. You can just try a few if you can't run a Box-Cox. I could also advise better if I knew in what way the distribution is non-normal; transformations only help with skew.
Here are some blog posts that you might find helpful:
http://www.analysisfactor.com/statchat/?p=486
http://www.analysisfactor.com/statchat/?p=688
http://www.analysisfactor.com/statchat/?p=858
Karen
2nd May 2010 at 9:48 am #4697
Shazia Nauman
Member
Thank you so much, Karen. I'll follow your suggestions and see what I come up with.
4th May 2010 at 8:00 am #4696
Kristian Karlson
Member
Note that recent research has shown that the normality assumption for the residuals is relatively unimportant in OLS. Whatever the error distribution, OLS still gives you the best linear unbiased estimator (BLUE), provided the other Gauss-Markov assumptions hold.
I believe you should pay more attention to the assumption that the x's are uncorrelated with the error term (the conditional independence or unconfoundedness assumption).
Note: if the residuals are highly non-normal, you may want to transform the dependent variable. Try a log transformation (requires positive values); Box-Cox is another option.
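Kristian's BLUE point is easy to check by simulation: even with heavily skewed (exponential) errors, the OLS slope estimate stays unbiased. A small sketch with hypothetical data and a made-up true slope of 2.0:

```python
import numpy as np

# Monte Carlo check: OLS remains unbiased under skewed, mean-zero errors
rng = np.random.default_rng(7)
beta_true = 2.0
slopes = []
for _ in range(2000):
    x = rng.normal(size=100)
    e = rng.exponential(size=100) - 1.0   # exponential errors, centered to mean 0
    y = 1.0 + beta_true * x + e
    X = np.column_stack([np.ones(100), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    slopes.append(b[1])

print("mean estimated slope:", np.mean(slopes))  # close to the true 2.0
```

The average of the 2,000 slope estimates lands very near the true value, even though the errors are strongly right-skewed; non-normality here costs efficiency of inference, not unbiasedness.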
7th June 2010 at 3:38 am #4695
Reggie Taylor, Ph.D.
Member
It depends on the shape of the distribution. What does it look like?
10th June 2010 at 5:31 am #4694
Shazia Nauman
Member
It's a normal distribution now; we used Box-Cox to normalize the data. Thank you all for your replies.
13th August 2010 at 6:30 pm #4693
Tammy Khabo
Member
Hi Everyone,
I have a question which I think is about the same thing.
I have a sample of 193 people. I'm also doing a multiple regression analysis, and my 2 IVs and 4 DVs aren't normally distributed. The KS test was significant, and skewness and kurtosis suggest the same thing. A log transformation did not manage to bring the variables to normality. I was wondering whether it matters, since it's a regression analysis? If it doesn't matter, could anyone give me a reference I can cite? I have looked in Tabachnick and Fidell and cannot seem to find anything on it.
I did the residual checks using the standardised residuals and standardised predicted values. The residuals were linear, homoscedastic, normal, and uncorrelated. Is that all that matters?
Tammy
13th August 2010 at 7:32 pm #4692
Tammy Khabo
Member
I have now found something in Tabachnick and Fidell that discusses it.
7th March 2011 at 7:15 pm #4691
Joseph Lovett
Member
A distribution that is not normal is not "abnormal"; it is simply a different distribution. Most studies of the robustness of statistical methods have shown that linear regression is quite robust. One should be concerned, as you clearly are, about the normality of the distribution of the dependent variables and about heteroscedasticity of the variances. If you can determine the nature of the distribution obtained, you might consider using nonlinear regression, which is also available in SPSS. If your problem is only a case of "normalizing" the dependent variable, then use a simple transformation such as a square root or a power transformation.
30th September 2011 at 11:47 am #4690
ssamdani
Member
I just came across this thread while facing a similar problem. I have a large data set (n = 1700) with three dependent variables and many independent variables. I know from one of my supervisors, who used structural equation modelling in his research, that for large sample sizes (N > 1000) the assumption of normality is not that important, as it does not introduce much bias into the results. My first question is: would the same apply to regression? Any references in this regard would be greatly appreciated.
Secondly, I understand that only the residuals need to meet the normality requirement. I tested mine and looked at the histograms and P-P plots in the linear regression output, and they don't look that bad to me. However, when I save the unstandardized residuals and use the Explore command in SPSS to run the KS test, it fails to satisfy the assumption of normality. Any thoughts?
Also, which is the better visual test for normality: the P-P plot of the residuals or the Q-Q plot? Can someone please explain why?
Sarah
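Sarah's observation, that a formal KS test rejects at large n even when the plots look acceptable, is easy to reproduce. A hypothetical sketch with moderately skewed simulated "residuals" (gamma-distributed, chosen only to illustrate the point):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 1700  # large n, as in the post

# Moderately skewed "residuals" that still look roughly bell-shaped in a histogram
resid = rng.gamma(shape=5.0, size=n)

# Standardize and run a KS test against the normal; note that plugging in
# the estimated mean/sd makes this test anti-conservative (the Lilliefors issue)
z = (resid - resid.mean()) / resid.std(ddof=1)
ks_stat, ks_p = stats.kstest(z, "norm")
print("KS p-value:", ks_p)               # typically very small at this n
print("sample skew:", stats.skew(resid))  # only moderate skew
```

At n = 1700 the KS test has enough power to flag departures from normality that are modest in absolute terms, which is why visual checks (Q-Q or P-P plots) are usually more informative than the test's p-value at large sample sizes.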
The forum 'Default Forum' is closed to new topics and replies.