Social scientists have, overall, been slower to tap into the ever-increasing flow of “big data” than their peers in the physical and medical sciences. That lethargy is a tad ironic given that so much of the big data available, whether they be government administrative data or social media feeds like Twitter, don’t have to be imagined and created, but exist essentially ripe for the picking.
As Martha Sedgwick, SAGE’s head of project innovation, notes, “In the past two years over 90 percent of the world’s data has been created. The digital trails produced by us all as we go about our daily life (via smartphones, transportation, payment interactions) contain huge potential for social research. These vast data sets offer new ways to understand our world and look to solve societal problems and it looks like we are at the cusp of a major turning point in the social sciences as researchers work with these data to answer new research questions.”
Perhaps the earliest questions to address, however, are more meta: How will social science research and teaching evolve to meet the challenges and opportunities big data creates? How can we bring down barriers to make this new computational social science accessible for all social researchers? That was the subject of a panel discussion SAGE Publishing held in conjunction with the Campaign for Social Science as part of the recent ESRC Festival of Social Sciences 2016. The November 9 panel, titled Big Data, Social Media Research and Innovations in Research Methods, was chaired by Sedgwick and featured guests Sharon Witherspoon, head of policy, Academy of Social Sciences and Campaign for Social Science; Luke Sloan, senior lecturer at Cardiff University and deputy director of Cardiff Q-Step; and Mark Kennedy, director of the KPMG Centre for Business Analytics.
The full video appears below, followed by encapsulations of their remarks provided by each of the panel’s participants.
This promise of big data is not without its challenges, and the social sciences have been slower than other fields, like biology, astronomy and physics, in working with big data. Social researchers face a number of hurdles as they look to develop the capacity to collect and analyse these vast and varied datasets, potentially produced in real time. New tools are needed to collect and process these data, including volumes of unstructured text requiring new ways to bring together qualitative and quantitative research skills. New statistical and programming skills are needed, and are emerging both within the social sciences and through new interdisciplinary collaborations (universities like Cardiff and Imperial are fostering these collaborations through new interdisciplinary research labs that bring together academics from across the social sciences with computer scientists). Secondary data available through social media channels like Twitter raise questions of representation and bias as well as questions of privacy and informed consent that require us to develop new ethical frameworks.
‘Big data’ present exciting opportunities for social scientists to further our understanding of the world – and this understanding is a means to making it better. In the case of administrative data – data collected by government in the course of, for instance, administering benefits or tracking exam results – they can illuminate causes and linkages that are otherwise invisible. But to realize this vision the social science community needs to increase its number and data skills, to negotiate access in the face of government reluctance, and most important, to ensure that we have strong and thoughtful safeguards and ethical principles in place. This means we need to take seriously the need for ‘social consent’ and full transparency, as exemplified in the Administrative Data Research Network arrangements. If we take this seriously, it can allow us to examine big and meaty questions: about how social relations between people and environments affect individuals, or about how culture – patterns of meaning – and social institutions interact.
Twitter presents us with a rich vein of data on opinions, attitudes and behaviors that allow us to ask new questions about the social world, but it is not without its problems. All methods and approaches have their drawbacks, and for the traditional tools at the disposal of a social researcher, such as surveys, focus groups and interviews, these are well explored, documented and understood. Yet Twitter is new and we need to begin by asking the basic questions around representativeness, research design and what can (and cannot) be measured. In this sense, there is a body of work to be done around establishing what Twitter data is, what it means, how people use the platform and understanding the relationship between the individual and their online identity. Through providing case studies of how real-world events manifest online, through taking what we already know about networks and applying it to retweets and mentions, through maintaining an open and honest dialogue of what works and what doesn’t, we can start to make the strange familiar.
With smartphones, social media, and new markets for buying and selling the digital traces of our social and economic lifeways, we social scientists are at the threshold of a period of dramatic change, change that comes with both opportunity and challenge. With new technologies for monitoring and measuring all things social, we are only starting to assemble new high-res datasets of patterned human interaction, and these datasets hold the promise of explaining rare but significant sentinel events such as those we have seen in the news this year. Quite simply, we talk about these events using words like ‘jolt’, ‘upset’, ‘eruption’, ‘shock’ and ‘crash’ because we lack theories to anticipate them. In the coming years, innovators will gradually start to better explain these events, identify their antecedents, and even predict them. Both individually and collectively, the question for social scientists is, will we be among these innovators?