23rd July 2014 at 1:58 pm #935
I have run both Pearson’s and Spearman’s correlations on the same set of data (too late… no turning back), and I’m wondering if anybody can weigh in on how I might defend/explain using both methods. It is purely exploratory in nature, with no a priori hypotheses. In particular, could somebody comment on the following two situations? By the way, assume that the data are at least interval level or higher, and I’m looking at bivariate correlations between variables in a sample of n = 18:
1) statistically significant (p < .01) Pearson’s, but non-significant Spearman’s
2) statistically significant (p < .01) Spearman’s, but non-significant Pearson’s
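(For concreteness, here is a minimal way to run both correlations side by side with scipy; the data below are made up purely to illustrate how the two scenarios can arise, not taken from the actual study.)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 18

# Scenario 1: one extreme point inflates Pearson's r, while the
# rank-based Spearman statistic stays modest.
x1 = np.append(rng.uniform(0, 1, n - 1), 50.0)
y1 = np.append(rng.uniform(0, 1, n - 1), 50.0)
r1, p1 = stats.pearsonr(x1, y1)
rho1, q1 = stats.spearmanr(x1, y1)

# Scenario 2: a strongly non-linear but perfectly monotonic relationship
# gives Spearman's rho = 1 while Pearson's r is lower.
x2 = np.linspace(0, 10, n)
y2 = np.exp(x2 / 2)
r2, p2 = stats.pearsonr(x2, y2)
rho2, q2 = stats.spearmanr(x2, y2)

print(f"Scenario 1: r = {r1:.2f} (p = {p1:.3g}), rho = {rho1:.2f} (p = {q1:.3g})")
print(f"Scenario 2: r = {r2:.2f} (p = {p2:.3g}), rho = {rho2:.2f} (p = {q2:.3g})")
```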
Any input appreciated.
Cheers

23rd July 2014 at 7:32 pm #954 - Dave Collingridge
It is fine to run a Spearman correlation on continuous data if there is some doubt about whether the data are interval or ratio, or about whether the distribution is bivariate normal and the relationship is linear. Pearson measures a linear relationship, while Spearman measures a monotonic relationship, which will be either increasing (i.e., as x increases, y never decreases) or decreasing (i.e., as x increases, y never increases). The results of Pearson and Spearman are often similar. If they differ, check the assumptions; if those are not met, you are probably better off going with Spearman.

24th July 2014 at 11:45 am #953
Agree with first reply.
But remember that correlations provide an ‘effect’ size, giving you an estimate of the strength of the proportionate covariation. This is much more useful than significance (and easier to judge), even in the unlikely event you have a random sample (which I doubt). Work with strength, not dichotomies.

25th July 2014 at 12:10 am #952
Thanks to you both! Didn’t expect to have such distinguished academics answer my question.
That goes a long way actually in helping me wrap my head around understanding and explaining it.
My master’s defence is actually in 2 weeks, and so I want to be prepared from the statistics angle.
An added question of interest… Prof. Gorard, when you mentioned having a random sample, did you mean a sample that adequately reflects the population I am investigating, or that my sample is unlikely to be random given that it is generally rare to collect a truly random one?
Jacob

25th July 2014 at 8:49 am #951
Well, 18 cases is not many, whatever the population is.
But I meant more the second. A random sample is a very simple but also a very precise thing. I have never had one in social science, nor seen or read about one. I can imagine one. For example, perhaps the state provides a full list of hospitals, you pick 18, and the data involved come from existing material (their corporate documents, perhaps) which is publicly available. There is no non-response. But in reality, even when I have worked like that, some documents are missing or do not contain the information required (and this is like the non-response we would face if we had to ask the hospitals for permission). Below is an extract from something I wrote:
A ‘random sample’ is a subset of cases from the known complete population, selected by chance in such a way as to be completely unpredictable. The chance element can be produced by specialist software or even Office, via a random number table, or by a mechanical process like a card-shuffling machine. These are strictly pseudo-random numbers because they are caused by a process of some kind, but they are all that is possible in reality, even if based on radioactive decay.

A true random sample should clearly permit the occurrence of the same case more than once (like drawing a card from a pack of 52, replacing the card and drawing another at random). The chance of such repetition depends upon what proportion or fraction of the population is in the sample. If the population is large in relation to the sample, this issue of sampling with replacement may not matter much in practice, but it is important to remember that sampling without replacement (of the card or whatever) is not really random sampling. It is important because compromises like this, however small, mount up. So far, we must accept that a ‘random’ sample is really pseudo-random, and that if it was drawn without replacement then it is slightly less random again (if there is such a thing).
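The with/without-replacement point can be illustrated directly in code. The sketch below uses a 52-card “population” and a sample of 18, mirroring the card example above; the numbers are purely illustrative.

```python
import random

random.seed(1)
population = list(range(1, 53))   # stand-in for a 52-card pack

# With replacement: the same case can be selected more than once.
with_repl = random.choices(population, k=18)

# Without replacement: no repeats are possible, so strictly speaking
# this is not fully random sampling in the sense described above.
without_repl = random.sample(population, k=18)

# With 18 draws from only 52 cases, the difference is far from negligible:
# estimate the chance that sampling with replacement repeats at least one case.
trials = 20000
repeats = sum(
    len(set(random.choices(population, k=18))) < 18 for _ in range(trials)
)
print(f"P(at least one repeat) ~ {repeats / trials:.2f}")   # roughly 0.96
```

So when the sample is a large fraction of the population, as here, drawing without replacement departs noticeably from true random sampling.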
More importantly, a random sample must be complete in two senses. It must include every case that was selected from the population by chance: there can be no non-response, refusal or dropout. And every case selected must have a known measurement or value of the characteristic (height, number of rooms or whatever) that is used to calculate the standard error (SE). There must be no missing values. As before, these requirements are assumed in the definition and computation of the SE, and are a matter of mathematical necessity (they are, in fact, tautologies), not somehow a matter of opinion or evidence. Imagine the process of dealing a hand of 13 playing cards. If the cards are properly shuffled, we can say that the hand is random. If the deal involved picking 15 cards, then returning two low cards to the pack, the resulting hand is not random, even though it contains 13 randomly dealt cards. Similarly, dealing 13 cards, then replacing two low cards with two others drawn randomly from the pack, would not yield a random hand, even though all 13 cards in the eventual hand had been drawn randomly. A random sample, despite being an ideal, is a simple thing, and any deviation from random sampling must lead to a sample that is not random.

27th July 2014 at 6:18 pm #950
Ah I see.
Thanks very much for the clarification Prof Gorard.
I wonder if I can ask yet another question, with apologies in advance for your trouble…
Is there a practical way of dealing with (or getting around) the fact that a sample is not random?

27th July 2014 at 11:19 pm #949
Yes. It’s easy to deal with. Just do not do any analysis predicated on having a random sample (and hence a standard error). So no confidence intervals, significance testing of any kind, multi-level modelling, etc.
Stick to effect sizes – based on R for example in your question. And use your judgement. And explain your judgement. Distinguish clearly between statements about the cases in your study, and the much less viable statements about other cases (generalisation).
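In code, that advice amounts to reporting and judging the coefficient itself, with no significance machinery attached. A minimal sketch (all data below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=18)
y = 0.6 * x + rng.normal(size=18)   # a hypothetical moderate relationship

r = float(np.corrcoef(x, y)[0, 1])   # the effect size itself
shared_variance = r ** 2             # proportion of variance in common

# Report and judge the strength directly: no p-value, no asterisks.
print(f"r = {r:.2f}; the variables share about {shared_variance:.0%} of their variance")
```

The reader then weighs whether a relationship of that strength matters in the context of the study, rather than whether an asterisk appears.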
Gorard, S. (2013) Research Design: Robust Approaches for the Social Sciences, London: SAGE, ISBN 978-1446249024, 218 pages.

28th July 2014 at 1:09 am #948 - Anonymous
if we throw away multilevel models on the grounds of not having a random sample, why don’t we throw away OLS multiple regression as well? multiple regression is, after all, a restricted version of linear mixed effects models… yet you don’t seem to have any issues advocating for it in previous posts. and the logical follow-up from that would be, of course, to throw away the correlation coefficient, since it can also be conceived as an even more restricted version of the linear model.

28th July 2014 at 7:48 am #947
Well, I am here to help those who ask (rather than argue with the mindless button-pushers). The underlying theme is the (estimated) standard error. Its definition depends entirely on a random sample (or random allocation). Thus, with no random sample there can be no standard error (actually the estimated SE is nonsense anyway even with a random sample, but that is a different point). CIs, significance tests and so on are predicated on the SE. Correlation and linear, logistic and many other kinds of regression are not. Ignore the asterisks and the R is simply a measure of the strength of the (linear) relationship. It is an effect size.

The rationale for the bizarre notion (because it only considers nested groups etc.) of multi-level modelling (or HLM initially) was that it was needed to overcome a problem with the SE. The claim was that unless we took account of the data structure we would underestimate the SE and so obtain significance too easily. Without an SE, as where the sample is not random, the claim, such as it is, falls to the ground. Of course, there were always simpler ways to deal with clustering anyway, such as robust estimates or just raising the alpha level.
PS. A few further notes might help. MLM produces the same substantive result in real life as OLS anyway. People rarely have a clustered sample with even attempted randomisation at more than one level, and often use levels in MLM that have not been randomised, making the argument about deflated SEs clearly nonsense. People even use it with population data! Their defence is the ‘super-population’, by which time we are in la-la land.

28th July 2014 at 9:39 am #946 - Anonymous
the issue i took with your criticism is that MLM was not only devised to handle the shrinkage of the SEs. as you correctly pointed out, there are myriad ways (including something as simple as conveniently-specified dummy codes, which is favoured by econometricians) to take care of that. but none of them allows one to model different sources of variance due to nesting… and if we have to be precise (which is why i refer to MLM as linear mixed models), they also correct the fixed effects for *crossed* random effects (so no need for ‘levels’ of any type). this particular design is not usually explored in the social sciences, but it speaks to the fact that nesting is not the only data structure that can be modeled. i do realize that people like to sell linear mixed effects models on the grounds that they correct the type 1 error rate and that’s it. but they do a lot more than that, if used properly. plus they allow one to use likelihood-based criteria for model comparison, which is a lot more sensible, in my opinion, than the standard chi-square test of fit.

28th July 2014 at 10:03 am #945
As I said, my purpose in interacting on this site is to assist those who are seeking help. I am not trying to prevent you using MLM any more than I would try to stop a religious person believing in miracles or a child in fairies. It’s not worth the grief.
But these are facts. If you read the original 1980s papers on MLM and HLM, it was the SE that concerned them. The purported modelling advantages came post hoc (a cynic would say in order to boost sales of MLwiN). And these advantages can also be obtained more easily, and can include non-nested groupings (such as sex, class or ethnicity) which abound in social science but have to become covariates or something in MLM. More specifically, you talk of ‘likelihood’ and chi-square (as a worse alternative). Both of these are predicated absolutely on randomisation (as a mathematical necessity). We are talking about the situation that J has (no randomisation). There are no likelihoods! Try not to mislead, please.

28th July 2014 at 5:33 pm #944 - Anonymous
i’d also like to ask you then not to mislead as well.
likelihood-based information approaches (much like Bayesian analysis) do not require random sampling, for they rely on a different conceptualization of probability than the one defended by the frequentists. likelihoods ARE NOT probabilities and depend on a different set of assumptions than randomization. the Kullback–Leibler divergence measure, for instance, in no way requires a random sample to be valid, and it’s rather easy to calculate. it does depend, however, on some of the asymptotic properties of the estimator, and if these hold there is no problem in using it.
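as a toy illustration of that last point, the KL divergence between two discrete distributions is a direct calculation with no sampling distribution attached (the distributions below are made up for the example):

```python
import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

p = [0.5, 0.3, 0.2]   # hypothetical "true" distribution
q = [0.4, 0.4, 0.2]   # hypothetical candidate model

d = kl_divergence(p, q)
print(f"D(p || q) = {d:.4f}")   # a descriptive divergence, not a test statistic
```

the result is zero when the two distributions coincide and grows as the candidate model diverges from the reference; no p-value or random-sampling assumption enters the computation itself.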
i can see that you’re probably only familiar with the approach to linear mixed models (hence your use of its social-science-specific names MLM/HLM) as explained by Raudenbush, Bryk, Goldstein, etc., which does emphasize correct standard errors over anything else. but if you look at the work of the original statisticians who helped develop these models (Lairde, Ware, Searle, Casella, etc.) you will see that correct standard errors are more a fortunate by-product than the primary reason why they exist. i showed you in the previous thread where we debated this that they include a whole new term specifically to account for different sources of variance. that ‘Zu’ term from the other thread, designed to capture the random effects, is the main purpose for which they exist. if other authors hijacked this to change the emphasis of these models, that’s not an issue of the modeling technique but of the people who did so.
honestly, i don’t mind if you wish to crusade against traditional null hypothesis testing; be my guest. i do mind, however, when you throw under the bus whole families of techniques that have proven useful in other areas of science just because you don’t agree with some aspects of them.

28th July 2014 at 5:54 pm #943
If you want to continue perhaps contact me directly away from this site. None of this is helping J.
The chi-squared test you cited as an alternative IS a test of significance. The KL measure you present IS based on probability distributions. As is the first paper by Laird (not ‘Lairde’). These are facts and easily checked.
As I said, I am not seeking to persuade you – but to assist those like J with basic questions. My answer stands, by mathematical definition. If you do not have a true random sample, do not use any form of analysis computed using randomisation, probability distributions, standard errors and so on.

28th July 2014 at 7:23 pm #942 - Anonymous
i believe it is important to keep this kind of discussion out in the open so that not only J but everyone else who wishes to look at it can be exposed to different points of view.
now, in my 2nd post i did not advocate for the chi-square test of fit. i mentioned that information-based approaches derived from the likelihood (like entropy or the KL divergence measure) are much more sensible than the chi-square test of fit. information criteria in no way assume a random sample, only that the models being compared are posed as probability distributions. do they rely on probability distributions? absolutely. does that imply they’re also null hypothesis tests with an associated p-value? absolutely not. to my knowledge, there is no known sampling distribution for any information criterion or entropy-based measure, and they’re not compared on the basis of statistically significant differences, but rather on how much “information” the sample contains.
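to make the model-comparison idea concrete, here is a minimal sketch using AIC under a Gaussian error model (variable names and data are invented for the example): two candidate models are fitted by least squares and ranked by their information criterion, with no significance test anywhere.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50
x = rng.uniform(0, 10, n)
y = 1.5 * x + rng.normal(0, 2, n)   # hypothetical data with a genuine linear trend

def gaussian_aic(y_obs, y_hat, k):
    """AIC for a least-squares fit with k parameters, assuming Gaussian errors."""
    m = len(y_obs)
    sigma2 = np.mean((y_obs - y_hat) ** 2)            # ML estimate of error variance
    log_lik = -0.5 * m * (np.log(2 * np.pi * sigma2) + 1)
    return 2 * k - 2 * log_lik

# Compare an intercept-only model with a straight-line model; lower AIC wins.
aic_null = gaussian_aic(y, np.full(n, y.mean()), k=2)    # mean + error variance
slope, intercept = np.polyfit(x, y, 1)
aic_line = gaussian_aic(y, slope * x + intercept, k=3)   # slope, intercept, variance

print(f"AIC(intercept only) = {aic_null:.1f}, AIC(line) = {aic_line:.1f}")
```

the comparison is purely relative: the AIC numbers have no significance threshold attached, only an ordering of how well each model trades fit against complexity.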
sorry about the extra ‘e’. but i’m not referring to the significance-testing approach for the fixed or random effects in linear mixed effects models. the advantage i’m citing is the ability to create more complete models that are more accurate representations of reality. even if you ignore the process of significance testing, you still end up with more information with which to describe your sample more effectively. we’ve been down this road before on the previous thread, and i kept wondering why a more flexible method should be ignored simply because one is not convinced about one of its assumptions. you still end up with measures of variability that can be very helpful when talking about aspects like reliability or change over time.

29th July 2014 at 12:23 am #941 - Anonymous
after reading more carefully your reply to J, i think we both are trying to achieve the same thing but from different perspectives.
neither of us is completely satisfied with the way data analysis is being carried out today. the approach that you seem to favour (and please correct me if i’ve misunderstood you) is one where statistics and other effect-size-type measures are interpreted mostly within the framework of the study at hand.
the approach i’m advocating is changing the current perspective of probability from the frequentist/null hypothesis testing approach in the Neyman-Pearson tradition to something more along the lines of Fisher’s (failed) fiducial inference/”likelihoodism” and, ultimately, Bayesian analysis. mostly because within the approach you suggest i don’t see room to say something about the uncertainty or the variability of the numbers we’re calculating (again, i may have missed where you expand on this so i may need to correct that).
we’re living through the “replicability crisis” in Psychology prompted by Diederik Stapel’s fraud, and its ripples are beginning to expand towards other social sciences. part of the solution being championed is the use of stricter, more sophisticated statistical controls, not fewer of them. this is why i try to present the idea that measuring the uncertainty of our data is doable; we just need to change the framework of how it is done.