The Importance Of Secondary Data

by Helen Kara

Dr. Helen Kara was the Mentor in Residence for April 2022 to focus on the topic: Be expansive: Research outside academia. Dr. Kara has been an independent researcher since 1999 and writes and teaches on research methods. She is the author of Qualitative Research for Quantitative Researchers, Research and Evaluation for Busy Students and Practitioners: A Time-Saving Guide, Creative Research Methods in the Social Sciences: A Practical Guide, and Research Ethics in the Real World: Euro-Western and Indigenous Perspectives. In 2015 Helen was the first fully independent researcher to be conferred as a Fellow of the Academy of Social Sciences. She is also a Visiting Fellow at the UK’s National Centre for Research Methods.


My argument is this: researchers should make as much use of secondary data as possible before we even think about gathering any primary data.

Most novice researchers are taught that new research requires primary data; that original research requires data gathered for the purpose by the researcher or the research team. Most research ethics committees focus most of their efforts on protecting participants. We need to change this. I believe we should be teaching novice researchers that new/original research requires existing data to be used in new ways, and primary data should be gathered only if absolutely necessary. I would like to see research ethics committees not only asking what researchers are doing to ensure the safety and wellbeing of participants, but also requiring a statement of the work that has been done using secondary data to try to answer the research question(s), and a clear rationale for the need to go and bother people for more information.

I believe working in this way would benefit researchers, participants, and research itself. For researchers, gathering primary data can be lots of fun and is also fraught with difficulty. Carefully planned recruitment methods may not work; response rates can be low; interviewees often say what they want to say rather than answering researchers’ questions directly. For participants, research fatigue is real. Research itself would receive more respect if we made better and fuller use of data, and shouted about that, rather than gathering data we never use (or worse, reclassifying stolen artefacts and human remains as ‘data’ and refusing to return them to their communities of origin because of their ‘scientific importance’ – but that’s another story).

Some people think of secondary data as quantitative: government statistics, health prevalence data, census findings, and so on. But there is lots of qualitative secondary data too, such as historical data, archival data, and web pages current and past. Mainstream and social media provide huge quantities of secondary data (though with social media there are a number of important ethical considerations which are beyond the scope of this post).

Of course, secondary data isn’t a panacea.

There is so much data available these days that it can be hard to find what you need, particularly as it will have been gathered by people with different priorities from yours. Also, it’s frustrating when you find what you need but you can’t access it because it’s behind a paywall or it has an obstructive gatekeeper. Comparison can be difficult when different researchers, organisations, and countries gather similar information in different ways. It can be hard to understand, or detect any mistakes in, data you didn’t gather yourself, particularly if it is in large, complicated datasets. Information about how or why data was gathered or analysed is not always available, which can leave you unsure of the quality of that data.

On the plus side, the internet allows quick, easy, free access to innumerable quantitative and qualitative datasets, containing humongous amounts of data. Much of this has been collected and presented by professional research teams with considerable expertise. There is scope for historical, longitudinal, and cross-cultural perspectives, way beyond anything you could possibly achieve through primary data gathering. Using secondary data can save researchers a great deal of the time they would otherwise spend gathering data, which means more time available for analysis and reporting. And, ethically, using secondary data reduces the burden on potential participants, and re-use of data honours the contribution of previous participants.

There are lots of resources available on using quantitative secondary data. I’m also happy to report that there is now an excellent resource on using qualitative secondary data: Qualitative Secondary Analysis, a collection of really good chapters by forward-thinking researchers edited by Kahryn Hughes and Anna Tarrant. The book includes some innovative methods, interesting theoretical approaches, and lots of guidance on the ethics of working with secondary data.

Some people think that working with secondary data has no ethical implications. This is so wrong it couldn’t be wronger. For a start, it is essential to ensure that informed consent for re-use has been obtained. If it hasn’t, either obtain such consent or don’t use the data. Then there are debates about how ethical it is to do research using secondary data about groups of people, or communities, without the involvement of representatives from those groups or communities. Also, working with secondary data can be stressful and upsetting for researchers – imagine if you were working with historical data about the Holocaust, or (as Kylie Smith does) archival data about racism in psychiatric practice in mid-20th century America. Reading about distressing topics day after day can be harmful to our emotional and mental health, and so to our physical health as well.

These are just a few of the ethical issues we need to consider in working with secondary data. Again, it is beyond the scope of this post to cover them all. So working with secondary data isn’t an easy option. Although it is different from working with primary data, it can be just as complex. I believe novice researchers should learn how to find and use secondary data, in ethical ways, before learning anything about primary data gathering and analysis.

