Mixed Methods researchers

McNemar/Bowker references

Viewing 12 posts - 1 through 12 (of 12 total)
  • #831
    Tessa Mearns
    Member

    Following a lot of searching through books and journals to find an alternative to Chi squared for in-sample testing, I eventually resorted to the internet and found a lot of recommendations for McNemar or Bowker. I ended up using these (well, Bowker, as I always have more than 2 variables), and this seemed to work well.

    My problem now is that I don’t have a decent reference to defend my use of this test in my thesis. I can’t find either test in any of my books, and searching for papers hasn’t done much good either. From the information I’ve found, I’m fairly certain I’ve chosen the right test, but I don’t think that the sources I’ve used would look great in my references.

    I tried writing to my stats contact at my university (I’m a distance student) but she took 3 months to reply and then didn’t even attempt to answer my questions about my choice of test, so I’m kind of on my own with this. Does anyone know of a book or article that would justify or even just discuss the use of these tests?

    My field is education and my research is mixed methods. I’m using the test to measure the significance of differences between responses from the same group regarding different school subjects, and to compare responses from the same respondents during two data collection periods. The data are mostly Likert scale, which I am treating as nominal rather than scale. I am using SPSS and have used Chi squared elsewhere to compare two different groups of respondents.
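
    For readers who have not met the test, here is a minimal sketch of Bowker's test of symmetry for a square table of paired responses, written in Python rather than SPSS. The function names and the counts are invented for illustration and are not taken from the thread. The statistic sums (n_ij - n_ji)^2 / (n_ij + n_ji) over the pairs of cells above and below the diagonal. Skipping pairs that are both zero, and not counting them in the degrees of freedom, is a choice made for this sketch.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def bowker_symmetry(table):
    """Return (statistic, df, p) for a k x k table of paired counts.

    table[i][j] counts respondents who gave category i on the first
    occasion and category j on the second.
    """
    k = len(table)
    stat, df = 0.0, 0
    for i in range(k):
        for j in range(i + 1, k):
            total = table[i][j] + table[j][i]
            if total == 0:
                continue  # assumption of this sketch: empty pairs are skipped
            stat += (table[i][j] - table[j][i]) ** 2 / total
            df += 1
    p = chi2_sf(stat, df) if df > 0 else 1.0
    return stat, df, p


# Hypothetical counts: rows are Round 1 responses, columns are Round 2.
example = [
    [20, 10, 5],
    [4, 30, 12],
    [3, 6, 25],
]
stat, df, p = bowker_symmetry(example)
print(f"statistic={stat:.3f}, df={df}, p={p:.3f}")
```

    With a 2 x 2 table the same calculation reduces to McNemar's test without a continuity correction.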

    #842
    Stephen Gorard
    Participant

    Have not heard of this much. You can try: Computational Statistics & Data Analysis, 51(9), pp. 4124–4142.

    But please do not do this. It does not make sense (why on earth would you want the probability that this ‘test’ generates?). Just convert the difference to ‘effect’ sizes for between subject or between episodes comparisons. 

    #841
    Tessa Mearns
    Member

    Thanks for this very quick response. I had actually found this article too (it was the only one I found!) but I am interested in the rest of your comment.

    Could you explain a bit more of what you mean about converting effect sizes? I’m completely self-taught when it comes to stats and SPSS so there may be quite a few gaps in my knowledge that I’m not aware of. Often it’s just a question of terminology.

    Thanks again,

    Tessa

    #840
    Stephen Gorard
    Participant

    Well, what is it you are trying to decide here? You have your ordinal responses for each subject and at each time period. You can look at them as scatters, or compute the percentage at or above a certain response level in each group, or a range of options. You could present V or more usefully observed-expected. 

    Take the percentage option. One subject group either has a higher percentage than the other or not. The difference in points is the result. What else would you need to know? Or put another way, what would chi-squared or any other sig test tell you additionally? 
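
    The percentage option can be sketched in a few lines of Python. The responses below are invented 1 to 5 Likert codes, and the function name and the cut-off of 4 are assumptions made only for illustration.

```python
def pct_at_or_above(responses, level):
    """Percentage of responses at or above the given response level."""
    if not responses:
        raise ValueError("no responses supplied")
    hits = sum(1 for r in responses if r >= level)
    return 100.0 * hits / len(responses)


# Hypothetical Likert responses (1 = strongly disagree, 5 = strongly agree).
group_a = [1, 2, 3, 4, 5, 4, 4, 5]
group_b = [1, 2, 2, 3, 4, 3, 5, 2]

pct_a = pct_at_or_above(group_a, 4)
pct_b = pct_at_or_above(group_b, 4)

# The result is simply the difference in percentage points.
difference = pct_a - pct_b
print(f"{pct_a:.1f}% vs {pct_b:.1f}%: difference of {difference:.1f} points")
```

    The same function works for the two data collection rounds: pass the Round 1 and Round 2 responses in place of the two groups.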

    #839
    Tessa Mearns
    Member

    That was my original approach but one of my supervisors told me that wasn’t good enough. He wanted me to use t-tests and treat the data as continuous but I was dead against that. Chi squared was my compromise, but it can’t be used for comparisons within one group of respondents. I found lots of recommendations for McNemar-Bowker online, but nothing in the actual literature.

    I have found it quite useful to use these tests as I have a lot of data from a large sample. I identified the general trends based on percentages at first and then went back to see what Chi squared showed to be significant or not in order to narrow my findings slightly. Mostly, the differences I had identified were statistically significant, and if they weren’t it usually just confirmed my own doubts about whether I should include them. I’ve also kept in a couple of differences that showed up in the percentages and weren’t significant according to Chi squared, so I wasn’t completely reliant on tests.

    Does that make any sense as a justification?

    I like the idea of using scatters or looking at percentages in the way you suggest. I hadn’t thought of either of those so will need to look into them a bit more.

    #838
    Stephen Gorard
    Participant

    Ok good. But mainly misses my point. Whether you used t, chi or something more obscure they all try to generate a probability. Why do you want this really rather weird probability and what does it tell you? Put another way, when you say you looked at what was ‘significant’ what does this mean? Remember the probability of being a US senator if one is an American is very small, but the probability of being an American if one is a US senator is 100%. 

    #837
    Tessa Mearns
    Member

    It’s really great to actually have a conversation about this! I wish I had posted this question earlier.

    What I was trying to find out was whether the differences I had identified were just random or whether they seemed to be connected to the two different educational models (CLIL-type Bilingual Education or ‘normal’) that I had used to group the respondents. In terms of probability, I suppose that means I wanted to know whether it was more probable that a pupil studying bilingually would respond more positively (or negatively) to certain items than a pupil studying only in their native language. From my reading, it looked like Chi squared was the ‘softest’ test I could run for that, as I didn’t want to get into using means to understand Likert scale data as had been suggested to me.

    With the other type of test, I wanted to find out whether certain responses were more probable among, for example, the bilingual learners, at either the beginning or end of the school year. I used the same approach when looking at differences between responses from bilingual learners in relation to lessons taught in either English or the first language.

    #836
    Tessa Mearns
    Member

    I just realised that I’ve misunderstood the probability thing – I think.

    When you say it’s about probability, do you actually mean the difference between what the data says and what the probability is that it will say this? I remember reading that and it making sense to me.

    If that’s the case, it’s a lot clearer. I wanted to know whether the differences in the responses given by each group differed from each other to a degree that was more than probable if they had just been two random groups rather than groups separated by a specific criterion.

    In the within-group comparisons, I wanted to know whether the differences in the Round 1 and Round 2 responses, or in the responses regarding English-medium or ‘normal’ lessons, differed more than might otherwise be expected.

    I hope that makes more sense.

    #835
    Stephen Gorard
    Participant

    But none of the tests mentioned (or indeed any sig test) can give you that probability. How could it? It’s not magic. Just think about it clearly. 

    They will all give you the probability of obtaining data as different as you obtained if there was really no difference between the two sets of scores (groups). This is not what you ask for (above). You want, quite properly, the probability of there being no difference between the two sets of scores given the data you actually obtained. These probabilities are completely different and the first cannot be used to compute the second. Hence the tests are useless to you. Hence you should stick to what you originally planned. Do real analysis not push-button pseudo-magical nonsense.  
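
    The gap between the two probabilities can be illustrated with Bayes' rule. This is only a sketch: every number below is hypothetical, and the function name is an assumption. It shows that the same probability of the data given no difference leads to very different probabilities of no difference given the data, depending on two quantities that a significance test does not supply: the prior probability of no difference, and the probability of the data if there is a real difference.

```python
def posterior_no_difference(p_data_given_null, p_data_given_alt, prior_null):
    """P(no difference | data) from Bayes' rule for two competing hypotheses."""
    numerator = p_data_given_null * prior_null
    denominator = numerator + p_data_given_alt * (1.0 - prior_null)
    return numerator / denominator


# Hypothetical inputs: the data are fairly improbable if there is no difference,
# and much more probable if there is a real difference.
p_data_given_null = 0.05
p_data_given_alt = 0.50

# Same data, different prior beliefs about 'no difference'.
for prior_null in (0.5, 0.9, 0.99):
    post = posterior_no_difference(p_data_given_null, p_data_given_alt, prior_null)
    print(f"prior={prior_null:.2f} -> P(no difference | data)={post:.3f}")
```

    None of the printed values equals 0.05, and they change with the prior even though the data do not.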

    #834
    Tessa Mearns
    Member

    Thanks for your input. Lots of food for thought. I’ll talk to my supervisors about it and have a think about my next steps.

    Thanks again for your time,

    Tessa

    #833
    Pat Bazeley
    Participant

    The point of asking for probabilities, as I understand it, is to make it possible to predict, with some degree of assurance, beyond the current sample to a broader population of which this sample is representative. Exploring actual differences as they exist for this sample will tell you only about this sample. The probability is about confidence in prediction, not how much difference the intervention made – the effect size can tell you something about the latter.

    #832
    Stephen Gorard
    Participant

    If ‘exploring actual differences… for this sample will only tell you about this sample’ then what can be used instead for the kind of generalisation you suggest above? It does not make sense.

     

    To summarise: if the sample achieved is not complete (that is, with no missing cases or values) or not randomly selected, then the kinds of test the supervisor proposed must not be used.

     

    In the very unlikely event that Tessa has an appropriate sample (and I have never seen or read about one in 20 years) then the probability the test will generate is that of finding a difference as large as observed (which is why we focus on the difference in the sample) assuming that the difference is only caused by random sampling. Clearly this probability is of no practical use to anyone. What Tessa would want is the probability of the difference being caused only by random sampling given the size of the observed difference. This is a very different probability and cannot be deduced from the useless test result (even though many people pretend that it can). Hence the US senators.
