26th November 2015 at 4:26 am #527 Simon Kiss, Member
I’m writing out of a bit of a sense of desperation. A colleague and I have collected survey responses from a decent sample of Canadian journalists and parliamentarians. We asked respondents to rank seven competing definitions of “open government” from most preferred to least preferred. We chose ranking over rating the definitions as Likert items in part because we thought that differences might only emerge beneath a surface level of consensus, so we wanted to force choices.
Of course, we did this without a clear idea of the methodological problems involved in analyzing ranked data. I think I had visions of doing a straight principal component analysis to uncover the correlations, but, as I understand it, this won’t work on correlations derived from ranked items, because each respondent’s ranks sum to the same constant and the items are therefore negatively correlated by construction.
So, I’ve done some basic non-parametric analysis testing differences in medians among groups (Kruskal-Wallis test) and initial hypotheses are borne out, but I’d like the analysis to go further.
However, I’ve had a hard time finding a user-friendly guide and software package for this kind of analysis. I have read this paper (Allison, P D, and N A Christakis. “Logit Models for Sets of Ranked Items.” Sociological Methodology, 1994.) and it is promising, particularly as I’m quite familiar with event history analysis. However, I don’t quite understand how to set the data up properly to conduct such an analysis. The code underlying the paper’s analysis is also in Stata, and I speak almost strictly R.
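One tentative reading of the setup, which I am not sure is right: each ranking is "exploded" into a sequence of choices, so that the top-ranked definition is chosen from all of them, the second from those remaining, and so on. The sketch below is written in plain Python purely to show the row structure; the respondents, the three definitions A to C, and all field names are hypothetical, not our actual data. Each stratum would then be treated like a risk set in a stratified conditional logit or Cox model.

```python
# Hypothetical wide-format data: one record per respondent,
# with the rank given to each definition (1 = most preferred).
wide = [
    {"id": 1, "group": "journalist", "ranks": {"A": 1, "B": 3, "C": 2}},
    {"id": 2, "group": "parliamentarian", "ranks": {"A": 2, "B": 1, "C": 3}},
]


def explode(rows):
    """Return one record per (respondent, choice stage, item still available)."""
    long = []
    for row in rows:
        # Items ordered from most to least preferred.
        ordered = sorted(row["ranks"], key=row["ranks"].get)
        # The final stage has a single item left, so it carries no information.
        for stage in range(len(ordered) - 1):
            for item in ordered[stage:]:
                long.append({
                    "id": row["id"],
                    "group": row["group"],
                    # One stratum per respondent and stage (the "risk set").
                    "stratum": f'{row["id"]}-{stage + 1}',
                    "item": item,
                    # 1 for the item picked at this stage, 0 otherwise.
                    "chosen": int(item == ordered[stage]),
                })
    return long


long_rows = explode(wide)
```

With seven definitions, this layout would give 7 + 6 + 5 + 4 + 3 + 2 = 27 rows per respondent, with exactly one chosen item in each stratum. Covariates such as group or ideology would presumably enter as interactions with the item indicators, since they do not vary within a stratum.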
I’ve also found this R package (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3665468/) but I find the package extremely poorly documented. There’s also this package https://cran.r-project.org/web/packages/Rankcluster/vignettes/Rankcluster.pdf but it relies on Bayesian clustering methods, which are far beyond my statistical training.
I could also fit a multinomial logistic regression model, modelling the odds that each person picked a definition as rank 1, but that throws away a lot of potentially interesting information about what definitions people ranked second, third or fourth and what definitions tend to go together in the different groups.
Ultimately, I’d like to know reliably whether the different groups rank the different definitions differently. Beyond this, I have a series of hypotheses (and corresponding covariates) about, for example, the relationship between ideology, age and definitions of open government. To start to get at those, I need some form of regression or factor analysis that is compatible with ranked data.
So my specific questions are:
1. Is there anyone who has done the kind of event history analysis of ranked data that Allison and Christakis advise that could walk me through how to set the data up properly?
2. Am I overthinking this problem? I’d like to find a balance between analytic rigour and sharing a reasonably thorough analysis of these findings.
3. Does anyone know of any user-friendly packages, particularly in R, that are meant for ranked data?
Thank you for your time.