We are kicking off a three-month focus on data analysis, starting with Analyzing Words, Pictures, and Numbers in July. This month we will have the opportunity to learn new ideas and practical skills from Mentors in Residence Stephen Gorard, Jean Breny, and Shannon McMorrow. Find the unfolding series through this link.
In this post, Dr. Ann Sloan Devlin, author of The Research Experience, discusses important steps in data analysis for quantitative studies. See her earlier post about the design process and two sample chapters from the book.
After researchers have been through the process of formulating a research question, designing a study, getting IRB or ethics board approval, and collecting data, they come to the point of data analysis. This is a juncture where it should be relatively easy to know what to do next, but that seldom seems to be the case. Data analysis should be straightforward if researchers have taken the time to clearly articulate their research hypothesis or hypotheses. If a research hypothesis is clearly stated at the time the study is designed, the hypothesis itself directs the researcher to the appropriate analysis (or at least category of analyses).
Consider the following hypotheses: A) There will be differences in participants’ responses, vs. B) There will be significant differences between the experimental and control groups measured by their responses on the Physician Qualities Scale. If the hypothesis is stated explicitly (as in B) and, in an experiment (or quasi-experiment), mentions both the independent (grouping) and dependent (outcome) variables, this will guide the researcher to think about the kinds of analyses that examine such differences between groups (e.g., t-test, ANOVA, MANOVA). In the case of correlation, where the researcher wants to understand the relationship of responses on two or more measures for the sample as a whole, the hypothesis should contain the names of the scales and the predicted direction (e.g., for the sample as a whole, there will be a positive correlation between scores on the Visual Engagement Scale and the Arousal Scale). In the case of proportion, which involves nominal data, the dimensions should be included (e.g., athletes and non-athletes [athlete status] will significantly differ in the proportion who own a car or not [car ownership]).
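To make the link between hypothesis and analysis concrete, here is a minimal sketch in Python of how an explicitly stated hypothesis points toward a category of analysis. It is an illustration only, not the Decision Tree from the book: the Hypothesis class, its field names, and the suggest_analysis function are hypothetical, and the chi-square suggestion for proportions is an assumption based on common practice with nominal data rather than something stated above.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """Hypothetical container for the parts of a stated hypothesis."""
    kind: str              # "group_difference", "correlation", or "proportion"
    n_groups: int = 2      # levels of the independent (grouping) variable
    n_outcomes: int = 1    # number of dependent (outcome) variables


def suggest_analysis(h: Hypothesis) -> str:
    """Return a category of analysis suggested by the hypothesis."""
    if h.kind == "group_difference":
        if h.n_outcomes > 1:
            return "MANOVA"
        return "t-test" if h.n_groups == 2 else "ANOVA"
    if h.kind == "correlation":
        return "correlation"
    if h.kind == "proportion":
        # Assumption: a chi-square test is a common choice for nominal data.
        return "chi-square test"
    # A vague hypothesis (like A above) names no variables, so nothing follows.
    raise ValueError("Hypothesis is too vague to suggest an analysis")


# Hypothesis B: experimental vs. control group, one outcome scale.
hypothesis_b = Hypothesis(kind="group_difference", n_groups=2, n_outcomes=1)
print(suggest_analysis(hypothesis_b))
```

Hypothesis A cannot even be entered into a structure like this without first deciding what the groups and outcome measures are, which is the point of stating hypotheses explicitly.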
Stating the hypothesis comes long before you are ready to conduct data analyses. Another aspect of data analysis that requires planning is the series of decisions a researcher makes about what data are replaced or excluded altogether. Prior planning for these decisions and reporting them in the manuscript relates to the concept of transparency. Others who read your work need to know the decisions you made about your data set. For example, prior to conducting analyses, the researcher needs to examine the data set for missing data. But before examining those data, the researcher should articulate the rules that will govern such issues as whether to replace missing data (yes or no); the extent of missing data from an individual participant that triggers removal from one measure (e.g., more than 10% of items on a single scale); the point at which a given participant has missed so many items across measures as to be excluded entirely; and whether those who fail a manipulation check (in the case of an experiment) are excluded from data analysis altogether. The field lacks consensus about such issues as whether to replace missing data or how much missing data would be too much to replace for a given participant, but researchers do agree about the need to be transparent in the decision rules they use. The manuscript should state how implementing those rules affected the data retained for analyses.
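One way to keep such decisions transparent is to write the rules down as code before looking at the data. The following Python sketch is an illustration under stated assumptions, not a recommended standard: the 10% per-scale threshold echoes the example above, while the 25% overall threshold, the rule names, the participant records, and the item values are all hypothetical.

```python
# Decision rules fixed BEFORE examining the data.
# 0.10 echoes the example in the text; 0.25 is an arbitrary, hypothetical value.
RULES = {
    "max_missing_per_scale": 0.10,
    "max_missing_overall": 0.25,
    "exclude_failed_manipulation_check": True,
}


def missing_fraction(items):
    """Share of items with no response (None marks a missing item)."""
    return sum(value is None for value in items) / len(items)


def apply_rules(participant, rules=RULES):
    """Decide whether a participant, or any of their scales, is excluded."""
    decision = {"id": participant["id"], "excluded": False,
                "reason": None, "dropped_scales": []}
    if rules["exclude_failed_manipulation_check"] and not participant["passed_check"]:
        decision.update(excluded=True, reason="failed manipulation check")
        return decision
    all_items = [v for items in participant["scales"].values() for v in items]
    if missing_fraction(all_items) > rules["max_missing_overall"]:
        decision.update(excluded=True, reason="too much missing data overall")
        return decision
    decision["dropped_scales"] = [
        name for name, items in participant["scales"].items()
        if missing_fraction(items) > rules["max_missing_per_scale"]
    ]
    return decision


# Made-up records for illustration only.
participants = [
    {"id": 1, "passed_check": True,
     "scales": {"physician_qualities": [4, 5, 3, 4, 5, 4, 3, 4, 5, 4],
                "arousal": [2, 3, None, 3, 2, 3, 2, 3, 2, 3]}},
    {"id": 2, "passed_check": True,
     "scales": {"physician_qualities": [4, None, None, 4, 5, 4, 3, 4, 5, 4],
                "arousal": [2, 3, 3, 3, 2, 3, 2, 3, 2, 3]}},
    {"id": 3, "passed_check": True,
     "scales": {"physician_qualities": [None] * 6 + [4] * 4,
                "arousal": [None] * 4 + [3] * 6}},
    {"id": 4, "passed_check": False,
     "scales": {"physician_qualities": [4] * 10, "arousal": [3] * 10}},
]

decisions = [apply_rules(p) for p in participants]
n_excluded = sum(d["excluded"] for d in decisions)
print(f"Excluded {n_excluded} of {len(participants)} participants")
for d in decisions:
    print(d)
```

The printed decisions give the counts a manuscript would need to report: how many participants were excluded entirely, for which reason, and which scales were dropped for the participants who were retained.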
In The Research Experience: Planning, Conducting, and Reporting Research (2nd ed.), Chapter 12 (Organizing Data and Analyzing Results) includes material on both the order in which hypotheses should be evaluated and the steps necessary to prepare a data set for data analysis. Appendix A presents a Decision Tree for Statistical Analysis that explains which statistical test is appropriate for which kind of research question. Other useful chapters related to these topics include Chapter 3, Research Design Approaches and Issues: An Overview, and Chapter 5, Measurement: Qualities of Measures.