Help with contrast coding


Viewing 2 posts - 1 through 2 (of 2 total)
    Simon Kiss

    Hi there: I’m using R and have a factor with the following levels:

    test<-data.frame(party_id=sample(c('Green', 'None', 'Conservative', 'NDP', 'Liberal'), size=100, replace=TRUE), var1=sample(c('Agree', 'Disagree'), size=100, replace=TRUE), stringsAsFactors=TRUE)

    What is the best way to compare the effect of being a Green Party member on the response to var1 with the effect of being a member of any of the other parties? Should I create a contrast matrix like this:

    contrasts(test$party_id)<-matrix(c(-0.25, 1, -0.25, -0.25, -0.25, rep(0,15)), ncol=4)

    As I see it, this compares the Green Party level (assigned a value of 1) with all other levels (each assigned a value of -0.25, summing to -1), and fills out the rest of the matrix with zeroes (null comparisons).

    Or is it better to recode the variable into a binary:

    test$party_id2<-car::recode(test$party_id, "'Green'='Green'; NA=NA; else='Other'", levels=c('Green', 'Other'))

    With my data, I’ve actually done both, but mysteriously I’m getting different coefficients for the effect of being a Green Party member. Can anyone shed any light? Yours, Simon Kiss

    test.mod<-glm(var1~party_id, data=test, family='binomial')

    test.mod2<-glm(var1~party_id2, data=test, family='binomial')

    Dave Collingridge


    Planned contrasts take the total variance explained by the model (sans residual variance) and partition it according to the comparisons the researcher specifies. In your planned contrasts you took the total model variance and separated it into two groups: (1) None + Conservative + NDP + Liberal, and (2) Green. The other approach was to combine the group 1 affiliations into a single group called “Other” and compare it against the “Green” group. The two approaches yield different results.
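One way to see how the two comparisons can differ is to compute both directly. The following is a minimal sketch on simulated data, not the poster's real data: the seed, the sample size of 500, and the helper names `w`, `green_vs_mean`, and `other_vs_green` are assumptions made for illustration. It contrasts (a) Green against the unweighted mean of the other four parties' log-odds with (b) Green against all other respondents pooled into one group.

```r
# Simulated data in the spirit of the question (seed and size are arbitrary).
set.seed(1)
parties <- c("Conservative", "Green", "Liberal", "NDP", "None")
test <- data.frame(
  party_id = factor(sample(parties, size = 500, replace = TRUE), levels = parties),
  var1     = factor(sample(c("Agree", "Disagree"), size = 500, replace = TRUE),
                    levels = c("Agree", "Disagree"))
)

# Observed log-odds of "Disagree" (the second level of var1) within each party.
p     <- tapply(test$var1 == "Disagree", test$party_id, mean)
logit <- qlogis(p)

# (a) Weights 1 / -0.25: Green minus the unweighted mean of the other parties.
w <- c(Conservative = -0.25, Green = 1, Liberal = -0.25, NDP = -0.25, None = -0.25)
green_vs_mean <- sum(w[names(logit)] * logit)

# (b) Binary recode: everyone who is not Green is pooled into "Other".
test$party_id2 <- factor(ifelse(test$party_id == "Green", "Green", "Other"),
                         levels = c("Green", "Other"))
mod2 <- glm(var1 ~ party_id2, data = test, family = "binomial")

# Green is the reference level, so this coefficient is Other minus Green.
other_vs_green <- unname(coef(mod2)["party_id2Other"])

print(c(green_vs_mean = green_vs_mean, other_vs_green = other_vs_green))
```

Two things are worth noting when reading the output. First, (b) has the opposite sign convention to (a), because Green is the reference level in the recoded factor. Second, (b) weights each party by its number of respondents while (a) weights the four parties equally, so the magnitudes only agree when the other parties are the same size or have the same log-odds. The raw coefficient R reports for a hand-built contrast column also depends on how the rest of the contrast matrix is filled in, so it may not equal (a) directly.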

    Here are my views on this issue. While I have heard of planned contrasts being used in ANOVA GLM, I have not heard of them being used in logistic regression GLM, which is what you are doing given your family = "binomial" argument. As you may know, ANOVA assumes a continuous outcome and relies on an identity link to relate the predictor(s) to the outcome. Logistic regression assumes a binary outcome and relies on a logit link to relate the predictor(s) to the outcome. They are different analyses, and so ANOVA planned contrasts may not be appropriate for logistic models.

    I would run a logistic model and use R’s automatic dummy coding to compare Green against each of the other parties. To do this, the Green party should be set up as the reference/baseline category by assigning it the lowest value, that is, by making it the first level of the factor. After making Green the first level, try the following R code.

    mymodel <- glm(var1 ~ party_id, data=test, family = "binomial")
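For completeness, here is a minimal sketch of one way to make Green the reference category before fitting, using base R's `relevel()`. The simulated data, the seed, and the sample size of 200 are assumptions for illustration; only the `glm()` call mirrors the code above.

```r
# Simulated data shaped like the question's example (seed and size are arbitrary).
set.seed(2)
parties <- c("Conservative", "Green", "Liberal", "NDP", "None")
test <- data.frame(
  party_id = factor(sample(parties, size = 200, replace = TRUE), levels = parties),
  var1     = factor(sample(c("Agree", "Disagree"), size = 200, replace = TRUE),
                    levels = c("Agree", "Disagree"))
)

# Move "Green" to the front so R's default treatment (dummy) coding uses it
# as the baseline; the remaining levels keep their existing order.
test$party_id <- relevel(test$party_id, ref = "Green")
levels(test$party_id)

# Each party_id coefficient is now that party's log-odds of "Disagree"
# minus the Green party's log-odds.
mymodel <- glm(var1 ~ party_id, data = test, family = "binomial")
summary(mymodel)$coefficients
```

With this coding the model reports four separate comparisons, one per non-Green party, rather than a single Green-versus-everyone-else effect.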


