8th February 2013 at 6:18 am #1732
I have received mixed information regarding the need to screen for univariate outliers when conducting a multiple regression analysis using data obtained from Likert scales.
Can someone help clarify whether I need to screen for univariate outliers in this case?
Geoff
8th February 2013 at 5:19 pm #1736
Dave Collingridge
Participant
A value that appears as an outlier in a simple scatterplot, with a predictor variable on the x-axis and the outcome variable on the y-axis, may not be a problem in a multiple regression. What really matters in a multiple regression are the residuals (the differences between observed and predicted scores). You could check for residual outliers with standardized residuals (z-scores): if the residuals are roughly normal, about 95% of them should be less than 2.0 in absolute value. You could also create a scatterplot with standardized residuals on the y-axis and standardized predicted values on the x-axis. This plot allows you to visually check for residual outliers, nonlinearity, and heteroscedasticity.
11th February 2013 at 1:49 am #1735
Great, thanks for that Dave!
Geoff
13th June 2013 at 6:57 am #1734
Hi Dave, hoping you might know the answer to this:
I want to conduct several regression analyses, and I am unsure what the data-screening procedure should be. My main concern is my sample: do I report my overall sample regardless of data screening and then remove cases as needed for each model? The issue with this is that the models would not really reflect the sample that was reported. Or do I keep screening until I am left with a sample that fits all the regression models (i.e., no outliers etc. that need to be deleted)?
21st June 2013 at 3:40 pm #1733
Dave Collingridge
Participant
I would begin by checking the residual plots and normal probability plots for each model. If you are just looking for relationships while controlling for other variables, this should suffice. If you want to develop predictive models, you should also check for multicollinearity and run other regression diagnostics.
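For anyone who wants to run the standardized-residual check mentioned earlier in this thread outside their stats package, here is a minimal Python sketch. It assumes you already have the observed scores and the predicted scores from your fitted model, and that the model includes an intercept. The function names, the example numbers, and the cutoff of 2.0 are illustrative assumptions, not output from any real dataset. Residuals are divided by the residual standard error, which is one common definition of a standardized residual; your software may use a slightly different one.

```python
import math


def standardized_residuals(observed, predicted, n_predictors):
    """Divide each residual (observed - predicted) by the residual standard error.

    Assumes the model has an intercept, so the error degrees of freedom
    are n - n_predictors - 1.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
    df_error = len(residuals) - n_predictors - 1
    if df_error <= 0:
        raise ValueError("not enough cases for this many predictors")
    sse = sum(r * r for r in residuals)
    if sse == 0:
        raise ValueError("all residuals are zero; nothing to standardize")
    std_error = math.sqrt(sse / df_error)
    return [r / std_error for r in residuals]


def flag_outliers(z_scores, cutoff=2.0):
    """Return the positions of cases whose |z| exceeds the cutoff."""
    return [i for i, z in enumerate(z_scores) if abs(z) > cutoff]


# Hypothetical scale scores and model predictions (made up for illustration).
observed = [3, 4, 2, 5, 4, 3, 4, 2, 5, 1]
predicted = [2.5, 4.5, 1.5, 5.5, 3.5, 3.5, 3.5, 2.5, 4.5, 4.0]

z = standardized_residuals(observed, predicted, n_predictors=2)
print(flag_outliers(z))  # positions of cases worth a closer look
```

A flagged case is only a prompt to look at that case more closely; it is not by itself a reason to delete it.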
If you remove outliers you should provide a rationale for doing so (e.g., data entry error, an unusual case that does not fit your original sample or population well, etc.). Don’t just drop data because you don’t like it. You may also consider applying data transformations to bring outliers within an acceptable range.
I see no need to report results before and after removing outliers unless it provides useful information.