Pearson and Spearman's correlations
 This topic has 19 replies, 5 voices, and was last updated 6 years, 7 months ago by Dave Collingridge.


23rd July 2014 at 1:58 pm · #935 · J (Member)
I have run both Pearson's and Spearman's correlations on the same set of data (too late… no turning back), and I'm wondering if anybody can weigh in on how I might defend/explain using both methods. It is purely exploratory in nature, with no a priori hypotheses. In particular, could somebody comment on the following 2 situations? By the way, assume that the data are at least interval level or higher, and I'm looking at bivariate correlations between variables in a sample size of n = 18:
1) statistically significant (p < .01) Pearson's, but nonsignificant Spearman's
2) statistically significant (p < .01) Spearman's, but nonsignificant Pearson's
Any input appreciated.
Cheers
23rd July 2014 at 7:32 pm · #954 · Dave Collingridge (Participant)
It is fine to run a Spearman correlation on continuous data if there is some doubt about whether the data are interval or ratio, or doubt about whether the distribution is bivariate normal and the relationship is linear. Pearson checks for a linear relationship, while Spearman checks for a monotonic relationship, which will be either increasing (i.e., as x increases, y never decreases) or decreasing (i.e., as x increases, y never increases). The results of a Pearson and a Spearman are often similar. If the results differ, then check the assumptions. If they are not met, you are probably better off going with the Spearman.
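A quick way to see the linear-vs-monotonic distinction described above is to compute both coefficients on a monotonic but nonlinear relationship. This is a minimal sketch using numpy only (the rank trick assumes no ties; `scipy.stats.pearsonr`/`spearmanr` would also work and handle ties properly):

```python
import numpy as np

def pearson(x, y):
    return np.corrcoef(x, y)[0, 1]

def spearman(x, y):
    # Spearman is simply Pearson computed on the ranks (assuming no ties)
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return pearson(rx, ry)

x = np.arange(1.0, 19.0)   # n = 18, as in the question
y = x ** 3                 # monotonic but clearly nonlinear

print(spearman(x, y))      # perfect monotonic association
print(pearson(x, y))       # noticeably below 1: the relation is not linear
```

The two coefficients diverge here precisely because the relationship is monotonic without being linear, which is the situation where Collingridge's advice applies.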
24th July 2014 at 11:45 am · #953 · Stephen Gorard (Participant)
Agree with first reply.
But remember that correlations provide an 'effect' size, giving you an estimate of the strength of the proportionate covariation. This is much more useful than significance (and easier to judge) even in the unlikely event that you have a random sample (which I doubt). Work with strength, not dichotomies.
25th July 2014 at 12:10 am · #952 · J (Member)
Thanks to you both! Didn't expect to have such distinguished academics answer my question.
That goes a long way actually in helping me wrap my head around understanding and explaining it.
My master’s defence is actually in 2 weeks, and so I want to be prepared from the statistics angle.
An added question of interest…Prof. Gorard, when you had mentioned having a random sample, did you mean a sample
that adequately reflects the population which I am investigating….or that it is unlikely to be random given that it is generally rare to collect a truly random sample?
Best…Cheers!
Jacob
25th July 2014 at 8:49 am · #951 · Stephen Gorard (Participant)
Well, 18 cases is not many, whatever the population is.
But I meant more the second. A random sample is a very simple but also a very precise thing. I have never had one in social science, nor seen or read about one. I can imagine one. For example, perhaps the state provides a full list of hospitals, you pick 18, and the data involved comes from existing material (their corporate documents perhaps) which is publicly available. There is no nonresponse. But in reality even when I have worked like that some documents are missing or do not contain the information required (and this is like nonresponse if we had to ask the hospitals for permission). Below is an extract from something I wrote:
A ‘random sample’ is a subset of cases from the known complete population, selected by chance in such a way as to be completely unpredictable. The chance element can be produced by specialist software or even Office, via a random number table, or a mechanical process like a card shuffling machine. These are strictly pseudorandom numbers because they are caused by a process of some kind, but they are all that is possible in reality, even if based on radioactive decay. A true random sample should clearly permit the occurrence of the same case more than once (like drawing a card from a pack of 52, replacing the card and drawing another at random). The chance of such repetition depends upon what proportion or fraction of the population is in the sample. If the population is large in relation to the sample this issue of sampling with replacement may not matter much in practice, but it is important to remember that sampling without replacement (of the card or whatever) is not really random sampling. It is important because compromises like this, however small, mount up. So far, we must accept that a ‘random’ sample is really pseudorandom, and that if it was drawn without replacement then it is slightly less random again (if there is such a thing).
More importantly, a random sample must be complete in two senses. It must include every case that was selected from the population by chance. There can be no nonresponse, refusal or dropout. And every case selected must have a known measurement or value of the characteristic (height, number of rooms or whatever) that is used to calculate the SE. There must be no missing values. As before, these issues are assumed in the definition and computation of the SE, and are a matter of mathematical necessity (they are, in fact, tautologies), not somehow a matter of opinion or evidence. Imagine the process of dealing a hand of 13 playing cards. If the cards are properly shuffled, we can say that the hand is random. If the deal involved picking 15 cards, then returning two low cards to the pack, the resulting hand is not random, even though it has 13 randomly dealt cards. Similarly, dealing 13 cards, then replacing two low cards by two others drawn randomly from the pack would not yield a random hand. This is so, even though all 13 cards in the eventual hand had been drawn randomly. A random sample, despite being an ideal, is a simple thing and any deviation from random sampling must lead to a sample that is not random.
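The point above about sampling with replacement can be made concrete: the probability of drawing the same case at least twice is the classic birthday-problem calculation, and it shrinks as the sampling fraction shrinks. A sketch in plain Python (the function name is mine):

```python
import math

def p_repeat(N, k):
    """Probability of selecting the same case at least twice when
    drawing k times with replacement from a population of N cases."""
    return 1 - math.perm(N, k) / N ** k

# Drawing 13 cards with replacement from a 52-card pack:
print(p_repeat(52, 13))       # a repeat is very likely
# A tiny sampling fraction makes repetition rare in practice:
print(p_repeat(100000, 18))   # close to zero
```

This is why sampling without replacement "may not matter much in practice" when the population is large relative to the sample, while still not being strictly random sampling.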
27th July 2014 at 6:18 pm · #950 · J (Member)
Ah, I see.
Thanks very much for the clarification Prof Gorard.
I wonder if I can ask yet another question, with apologies in advance for your trouble…
Is there a practical way of dealing with (or getting around) the fact that a sample is not random?
27th July 2014 at 11:19 pm · #949 · Stephen Gorard (Participant)
Yes. It's easy to deal with. Just do not do any analysis predicated on having a random sample (and hence a standard error). So no confidence intervals, significance testing of any kind, multilevel modelling, etc.
Stick to effect sizes – based on R for example in your question. And use your judgement. And explain your judgement. Distinguish clearly between statements about the cases in your study, and the much less viable statements about other cases (generalisation).
Try reading:
Gorard, S. (2013) Research Design: Robust approaches for the social sciences, London: SAGE, ISBN 9781446249024, 218 pages
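In code, Gorard's advice to "stick to effect sizes" amounts to reporting r (and r², the shared variance) with no asterisks attached. A minimal sketch with hypothetical simulated data:

```python
import numpy as np

def effect_size_summary(x, y):
    """Report the correlation as a strength, not a dichotomy."""
    r = float(np.corrcoef(x, y)[0, 1])
    return {"r": r, "shared_variance": r ** 2}

rng = np.random.default_rng(0)
x = rng.normal(size=18)                # n = 18, as in J's study
y = 0.5 * x + rng.normal(size=18)      # made-up relationship

print(effect_size_summary(x, y))
```

The judgement about whether that strength matters is then substantive, not a p-value cut-off.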
28th July 2014 at 1:09 am · #948 · Anonymous (Inactive)
if we throw away multilevel models on the grounds of not having a random sample, why don't we throw away OLS multiple regression as well? multiple regression is, after all, a restricted version of linear mixed effect models… yet you don't seem to have any issues advocating for it in previous posts. and the logical follow-up from that would be, of course, to throw away the correlation coefficient, since it can also be conceived as an even more restricted version of the linear model.
28th July 2014 at 7:48 am · #947 · Stephen Gorard (Participant)
Well, I am here to help those who ask (rather than argue with the mindless button-pushers). The underlying theme is the (estimated) standard error. Its definition depends entirely on a random sample (or allocation). Thus, with no random sample there can be no standard error (actually the estimated SE is nonsense anyway even with a random sample, but that is a different point). CIs, sig tests and so on are predicated on the SE. Correlation and linear, logistic and many other kinds of regression are not. Ignore the asterisks and the R is simply a measure of the strength of the (linear) relationship. It is an effect size. The rationale for the bizarre notion (because it only considers nested groups etc.) of multilevel modelling (or HLM initially) was that it was needed to overcome a problem with the SE. The claim was that unless we took account of the data structure then we would underestimate the SE and so obtain significance too easily. Without an SE, as where the sample is not random, the claim, such as it is, falls to the ground. Of course, there were always simpler ways to deal with clustering anyway, such as robust estimates or just raising the alpha level.
PS. A few notes might also help. In real life, MLM produces the same substantive result as OLS anyway. People rarely have a clustered sample with even attempted randomisation at more than one level, and often use levels in MLM that have not been randomised, making the argument about deflated SEs clearly nonsense. People even use it with population data! Their defence is the 'superpopulation', by which time we are in la-la land.
28th July 2014 at 9:39 am · #946 · Anonymous (Inactive)
the issue i took with your criticism is that MLM was not only devised to handle the shrinkage of the SEs. as you correctly pointed out, there are myriad ways (including something as simple as conveniently-specified dummy codes, which is favoured by econometricians) to take care of that. but none of them allow one to model different sources of variance due to nesting… and if we have to be precise (which is why i refer to MLM as linear mixed models), they also correct the fixed effects for *crossed* random effects (so no need for 'levels' of any type). even though this particular design is not usually explored in the social sciences, it speaks to the fact that nesting is not the only data structure that can be modelled. i do realize that people like to sell linear mixed effects on the grounds that it corrects the type 1 error rate and that's it. but they do a lot more than that, if used properly. plus they allow one to use likelihood-based criteria for model comparison, which is a lot more sensible, in my opinion, than the standard chi-square test of fit.
28th July 2014 at 10:03 am · #945 · Stephen Gorard (Participant)
As I said, my purpose in interacting on this site is to assist those who are seeking help. I am not trying to prevent you using MLM any more than I would try to stop a religious person believing in miracles or a child in fairies. It's not worth the grief.
But these are facts. If you read the original 1980s papers on MLM and HLM, it was the SE that concerned them. The purported modelling advantages came post hoc (a cynic would say in order to boost sales of MLwiN). And these advantages can also be obtained more easily, and can include non-nested groupings (such as sex, class or ethnicity) which abound in social science but have to become covariates or something in MLM. More specifically, you talk of 'likelihood' and chi-square (as a worse alternative). Both of these are predicated absolutely on randomisation (as a mathematical necessity). We are talking about the situation that J has (of no randomisation). There are no likelihoods! Try not to mislead, please.
28th July 2014 at 5:33 pm · #944 · Anonymous (Inactive)
i'd also like to ask you then not to mislead as well.
likelihood-based information approaches (much like Bayesian analysis) do not require random sampling, for they rely on a different conceptualization of probability than the one defended by the frequentists. likelihoods ARE NOT probabilities and depend on a different set of assumptions than randomization. the Kullback–Leibler divergence measure, for instance, in no way requires a random sample to be valid, and it's rather easy to calculate. it does depend, however, on some of the asymptotic properties of the estimator, and if these hold there is no problem in using it.
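for concreteness, the KL divergence between discrete distributions really is easy to compute. a minimal sketch (made-up distributions, function name mine):

```python
import math

def kl_divergence(p, q):
    """D(p || q) for discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(kl_divergence(p, q))  # nonnegative; zero only when p equals q
```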
i can see that you're probably only familiar with the approach to linear mixed models (hence using its social-science-specific names MLM/HLM) as explained by Raudenbush, Bryk, Goldstein, etc., which does emphasize correct standard errors over anything else. but if you look at the work of the original statisticians who helped develop these models (Lairde, Ware, Searle, Casella, etc.) you will see that correct standard errors are more a fortunate byproduct than the primary reason why they exist. i showed you in the previous thread where we debated this that they include a whole new term specifically to account for different sources of variance. that 'Zu' term from the other thread, designed to capture the random effects, is the main purpose for which they exist. if this was hijacked by other authors to change the emphasis of the purpose of these models, that's not an issue of the modeling technique but of the people who did so.
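the 'Zu' point can be illustrated with a small simulation (numpy only; all names and numbers here are mine): generate clustered data with random intercepts, remove the fixed effect, and the residuals still vary much more across groups than within them. that extra between-group variance component is exactly what the random-effects term is there to model:

```python
import numpy as np

rng = np.random.default_rng(1)
n_groups, n_per = 10, 20
group = np.repeat(np.arange(n_groups), n_per)

X = rng.normal(size=n_groups * n_per)      # fixed-effect predictor
u = rng.normal(scale=2.0, size=n_groups)   # random intercepts: the 'Zu' term
e = rng.normal(size=n_groups * n_per)      # residual error
y = 1.5 * X + u[group] + e

# residuals after removing the (known) fixed effect still cluster by group
resid = y - 1.5 * X
within_group_var = np.mean([resid[group == g].var() for g in range(n_groups)])
total_var = resid.var()
print(within_group_var, total_var)  # within-group variance is much smaller
```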
honestly, i don't mind if you wish to crusade against traditional null hypothesis testing; be my guest. i do mind, however, when you throw under the bus whole families of techniques that have proven useful in other areas of science just because you don't agree with some aspects of them.
28th July 2014 at 5:54 pm · #943 · Stephen Gorard (Participant)
If you want to continue, perhaps contact me directly away from this site. None of this is helping J.
The chi-squared test you cited as an alternative IS a test of significance. The KL measure you present IS based on probability distributions. As is the first paper by Laird (not 'Lairde'). These are facts and easily checked.
As I said, I am not seeking to persuade you – but to assist those like J with basic questions. My answer stands, by mathematical definition. If you do not have a true random sample do not use any form of analysis computed using randomisation, probability distributions, standard errors and so on.
28th July 2014 at 7:23 pm · #942 · Anonymous (Inactive)
i believe it is important to keep this kind of discussion out in the open so that not only J but everyone else who wishes to look at it can be exposed to different points of view.
now, in my 2nd post i did not advocate for the chi-square test of fit. i mentioned that information-based approaches derived from the likelihood (like entropy or the KL divergence measure) are much more sensible than the chi-square test of fit. information criteria in no way assume a random sample, but rather the comparison of models as posed by a probability distribution. do they rely on probability distributions? absolutely. does that imply they're also null hypothesis tests with an associated p-value? absolutely not. to my knowledge, there is no known sampling distribution of any information criterion or entropy-based measure, and they're not compared on the basis of statistically significant differences, but rather on the assumption of how much "information" the sample contains.
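one such likelihood-based criterion is AIC. a bare-bones sketch for Gaussian OLS models (hypothetical data, numpy only; not a full treatment of mixed models), just to show that the comparison is a number, not a p-value:

```python
import numpy as np

def aic_linear(X, y):
    """AIC of an OLS fit with Gaussian errors (k counts coefficients
    plus the error variance)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n = len(y)
    sigma2 = resid @ resid / n            # ML estimate of error variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = X.shape[1] + 1
    return 2 * k - 2 * loglik

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 2 * x + rng.normal(size=50)
noise = rng.normal(size=50)               # an irrelevant extra predictor

print(aic_linear(x, y))                                 # simpler model
print(aic_linear(np.column_stack([x, noise]), y))       # penalised extra term
```

the model with the lower AIC is preferred: fit is traded off against complexity with no significance threshold involved.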
sorry about the extra 'e', but i'm not referring to the significance testing approach for the fixed or random effects in linear mixed effects models. the advantages i'm citing are the ability to create more complete models that are more accurate representations of reality. even if you ignore the process of significance testing, you can still end up with more information with which to describe your sample more effectively. we've been down this road before on the previous thread, and i did keep on wondering why a more flexible method should be ignored simply because one is not convinced about one of the assumptions. you still end up with measures of variability that could be very helpful when talking about aspects like reliability or change over time.
29th July 2014 at 12:23 am · #941 · Anonymous (Inactive)
after reading your reply to J more carefully, i think we are both trying to achieve the same thing, but from different perspectives.
neither of us is completely satisfied with the way data analysis is being carried out today. the approach that you seem to favour (and please correct me if i've misunderstood you) is one where statistics and other effect-size-type measures are mostly interpreted just within the framework of the study at hand.
the approach i'm advocating is changing the current perspective of probability from the frequentist/null hypothesis testing approach in the Neyman-Pearson tradition to something more along the lines of Fisher's (failed) fiducial inference/"likelihoodism" and, ultimately, Bayesian analysis. mostly because within the approach you suggest i don't see room to say something about the uncertainty or the variability of the numbers we're calculating (again, i may have missed where you expand on this, so i may need to correct that).
we're living through the "replicability crisis" in Psychology prompted by Diederik Stapel's fraud, and its ripples are beginning to expand towards other social sciences. part of the solution being championed is the use of stricter, more sophisticated statistical controls, not fewer of them. this is why i try to present the idea that measuring the uncertainty of our data is doable; we just need to change the framework of how it is being done.
