Methodspace - home of the Research Methods community

Dear Methodspace users, I am a student currently analyzing data collected for my thesis. I need your help with the following issues: 1. Is it legitimate to use Likert scale or Likert-type item data in parametric tests such as the t-test or ANOVA? 2. Is it legitimate to combine categories (strongly disagree + disagree, neutral, agree + strongly agree) and report them as disagree, neutral, and agree? Thank you in advance.

Replies to This Discussion

The Likert scale with SD, D, N, A, and SA is ordinal. It is not continuous data, although many people treat it as such. Because it is ordinal, you should analyze individual questions with nonparametric statistics such as the Mann-Whitney U test (for two independent groups) or the Wilcoxon signed-rank test (for paired data).
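To make the nonparametric suggestion concrete, here is a minimal sketch of how the Mann-Whitney U statistic could be computed for one Likert item answered by two independent groups. The responses and the function names (`midranks`, `mann_whitney_u`) are made up for illustration, and the coding SD=1 through SA=5 is an assumption. It only computes the U statistics; for a p-value you would normally rely on a statistics package.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of tied values starting at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples of ordinal scores."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical responses to one question, coded SD=1 ... SA=5.
group_a = [1, 2, 2, 3]
group_b = [3, 4, 4, 5, 5]
print(mann_whitney_u(group_a, group_b))  # (0.5, 19.5)
```

Likert data produce many ties, which is why the sketch assigns average ranks to tied responses rather than arbitrary ones.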

There is one exception.  It is acceptable to analyze data from an ordinal Likert scale with parametric statistics if you aggregate questions (i.e., analyze the mean of two or more questions). However, before aggregating it is a good idea to make sure that the questions are similar or related to the same underlying factor. Sometimes it is pretty obvious when two or more questions measure the same underlying factor. When in doubt you should check with principal components analysis before aggregating.

I think it is fine to collapse across categories like combining SA and A into "agree" and SD and D into "disagree". Make sure you justify this approach in your write-up (i.e., explain why you collapsed). When you collapse you are definitely working with an ordinal scale and should analyze individual questions with nonparametric statistics.
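As a rough illustration of collapsing, the sketch below maps five response options onto three and counts them. It assumes the numeric coding SD=1, D=2, N=3, A=4, SA=5; the responses and the names `COLLAPSE` and `collapse_counts` are hypothetical.

```python
from collections import Counter

# Assumed coding: SD=1, D=2, N=3, A=4, SA=5.
COLLAPSE = {1: "disagree", 2: "disagree", 3: "neutral", 4: "agree", 5: "agree"}


def collapse_counts(responses):
    """Count responses after merging SD with D and A with SA."""
    counts = Counter(COLLAPSE[r] for r in responses)
    # Report the categories in their ordinal order, including empty ones.
    return {label: counts.get(label, 0) for label in ("disagree", "neutral", "agree")}


# Hypothetical responses to a single question.
responses = [1, 2, 2, 3, 4, 4, 5, 5, 5]
print(collapse_counts(responses))  # {'disagree': 3, 'neutral': 1, 'agree': 5}
```

Keeping the original five-point responses in the data file and collapsing only for reporting makes it easy to show both versions in the write-up.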

Good luck. 

Dr. Dave

Thank you very much for your important suggestions.

I am not clear about aggregating questions. Does it mean calculating the grand mean of the items under one construct (factor)? Is there a procedure for aggregating questions, for example in SPSS?

Thank you 


Aggregating questions involves summing scores on 2 or more similar questions for each participant, then dividing by the number of questions you aggregated. It is an average of scores. You can then run a test on the aggregated variable representing 2 or more questions. Here is an example. 

Assume that questions 1, 2, and 3 cover the same sort of issue and you use the following numeric coding for the response options: SD (1), D (2), N (3), A (4), SA (5).

Participant   Question 1   Question 2   Question 3   Average (aggregated variable)
1             4            3            5            4
2             3            1            2            2
3             3            4            2            3
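The same calculation can be sketched in a few lines of Python, using the three participants from the example above. The variable names are only illustrative.

```python
# Scores on questions 1-3 for each participant, coded SD=1 ... SA=5.
scores = {
    1: [4, 3, 5],
    2: [3, 1, 2],
    3: [3, 4, 2],
}

# Aggregated variable: the mean of each participant's item scores.
aggregated = {participant: sum(items) / len(items)
              for participant, items in scores.items()}

print(aggregated)  # {1: 4.0, 2: 2.0, 3: 3.0}
```

The aggregated values are what you would then feed into a t-test or ANOVA, one value per participant.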

You don't say if you're using a computer to analyse your data, but even if you don't have access to one, or to SPSS, you might have a look at the tutorials on my website to see some worked examples of what Dr Collingridge is talking about.  Technically he is right, but most researchers treat Likert scales as interval anyway.  Also you ought to produce a correlation matrix of the items you are using to see to what extent they "hang together" or not.
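A correlation matrix of the items can be sketched as follows. This uses Pearson correlations computed by hand and entirely made-up responses; the item names and both function names are hypothetical. Some analysts would prefer a rank-based coefficient for ordinal items, so treat it as one possible approach rather than the recommended one.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores.

    Raises ZeroDivisionError if either list has no variation.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


def correlation_matrix(items):
    """Return a nested dict of pairwise correlations between items."""
    names = list(items)
    return {a: {b: pearson(items[a], items[b]) for b in names} for a in names}


# Hypothetical responses from six participants, coded SD=1 ... SA=5.
items = {
    "q1": [4, 3, 3, 5, 2, 4],
    "q2": [3, 1, 4, 5, 2, 4],
    "q3": [5, 2, 2, 4, 1, 5],
}

matrix = correlation_matrix(items)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Items that "hang together" show consistently positive correlations with one another; an item that correlates weakly or negatively with the rest is a candidate for exclusion or reverse coding before aggregating.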

See page and then work your way through.  There aren't any t-test runs, but at least you can see how the data from a survey of teenage pupils were treated to derive scores for "Attachment to status quo" and "Sexism (negative attitudes to women)"

I also have some examples replicated from Julie Pallant's "SPSS Survival Manual" in

(Mr) John Hall, [50 years in survey research]

1 - Likert data are unlikely to satisfy the normal distribution assumption, so it is doubtful that parametric testing can be employed.

2 - Julie Pallant has written some papers that tackle Likert-type data using the Rasch Model (RM).

3 - Furthermore, RM can also provide empirical support or justification regarding your question about collapsing (combining) the SD, D, N, A, and SA categories.

4 - Try this

Good Luck!

Ethical issues

Many researchers will aggregate Likert-scale items: “important”, “mostly important”, and “rather important” will become simply “important”, etc. I personally believe that this approach raises at least one ethical issue: it voids the respondent’s intentions and cognitive process. Over the past ten years, with students attending my methodology classes, I have repeatedly tested (albeit in a not wholly scientific manner) this penchant for post-survey grouping of scale items. I would have them answer a Likert-scale questionnaire, stipulating that time was not of the essence: they could take the whole class time if needed (of course, they never do). I would then question them on how they arrived at choosing an item and what went through their minds. To make a long story short, what inevitably comes out of the discussions is that students treat each item not as ordinal, and even less as interval, but as a discrete, nominal choice. (In itself, this observation raises issues with the statistical tests being used.) When I confide that common practice groups the negative with the negative and the positive with the positive, the students are rather shocked. The usual response is: “Then why bother! We are being asked to rightly appraise our position and you end up washing out our efforts. I took a lot of time making sure that the item I chose corresponded with what I really believe” (then follows some stronger vocabulary that I will not reproduce here).


Gilles Valiquette

Thank you all !!



© 2014   Created by SAGE Publications.
