I am confused about the difference between R-squared and adjusted R-squared. My model has an R-squared of 11% and an adjusted R-squared of 2%. It has one dependent and two independent variables. How can I interpret and differentiate R-squared and adjusted R-squared for my model?
The problem with R square is that it’s biased upwards as a population estimate. Imagine that the true (population) correlation is exactly zero. Now sample data from that population. You won’t get a correlation of zero – you’ll sometimes get a correlation that’s positive (too high) and sometimes negative (too low). However, if you were to sample lots of times, on average you would get a correlation of zero. So the correlation is an unbiased estimate of the population correlation.
If the population correlation is 0, then the population R-squared is also zero. But what happens when we sample and get a positive correlation? We get a positive R-squared. And if we sample and get a negative correlation, we square it, so we get a positive R-squared again. R-squared can never come out below zero, so on average it is positive even when the true R-squared is zero. Hence R-squared is biased upwards.
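If you want to see this for yourself, here is a small simulation sketch. The sample size of 20, the number of repetitions and the function names are my own choices for illustration, not anything from your data. It draws two unrelated variables many times and averages both the correlation and its square.

```python
import random


def sample_correlation(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=20, reps=20000, seed=0):
    """Average r and r^2 over many samples where the true correlation is 0."""
    rng = random.Random(seed)
    rs = []
    for _ in range(reps):
        # x and y are drawn independently, so the population correlation is 0
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        rs.append(sample_correlation(x, y))
    mean_r = sum(rs) / reps
    mean_r2 = sum(r * r for r in rs) / reps
    return mean_r, mean_r2


mean_r, mean_r2 = simulate()
print(f"mean r   = {mean_r:.4f}")
print(f"mean r^2 = {mean_r2:.4f}")
```

The average correlation should land very close to zero, while the average R-squared should land near 1/(n - 1), which is about 0.05 for samples of 20, even though the true value is exactly zero.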
To remove this bias, you can use adjusted R-squared. This makes R-squared smaller, and the adjustment is larger when you have more predictor variables and when you have fewer people. If you only have two predictor variables and R-squared is going down that much, you don't have many people in your sample.
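To put rough numbers on that, here is a sketch that assumes your software uses the usual formula, adjusted R-squared = 1 - (1 - R-squared)(n - 1)/(n - p - 1), with n people and p predictors. I'm guessing that is what produced your 2%; check your package's documentation. The function names are just mine. Solving the formula backwards gives the sample size your two figures imply.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors
    (assumes the usual 1 - (1 - R^2)(n - 1)/(n - p - 1) formula)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def implied_n(r2, adj_r2, p):
    """Solve the same formula for n, given both R-squared values."""
    ratio = (1 - adj_r2) / (1 - r2)  # equals (n - 1) / (n - p - 1)
    return (ratio * (p + 1) - 1) / (ratio - 1)


# The figures from the question: R-squared 11%, adjusted 2%, two predictors
n_estimate = implied_n(0.11, 0.02, p=2)
print(f"implied sample size: about {n_estimate:.0f}")

# The same R-squared with a much larger sample barely changes
print(f"adjusted R-squared at n = 1000: {adjusted_r2(0.11, 1000, 2):.3f}")
```

Your percentages are rounded, so the implied sample size is only approximate, but it comes out somewhere in the low twenties. With the same R-squared and 1,000 people, the adjusted value would stay close to 11%.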
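Report the ordinary R-squared as your R-squared: in your sample, the two predictors account for 11% of the variance in the dependent variable. Don't worry about adjusted R-squared unless you're going to do precise prediction.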