Social scientists are expanding the landscape of academic knowledge production by adopting online crowdsourcing techniques used by businesses to design, innovate, and produce. Researchers employ crowdsourcing for a number of tasks, such as taking pictures, writing text, recording stories, or digesting web-based data (tweets, posts, links, etc.). In an increasingly competitive academic climate, crowdsourcing offers researchers a cutting-edge tool for engaging with the public. Yet this socio-technical practice emerged as a business procedure rather than a research method, and thus contains many hidden assumptions about the world which concretely affect the knowledge produced. Chief among these is the reduction of research participants to a single, faceless crowd, a problematic move that calls for a critical assessment of crowdsourcing’s methodological assumptions.
In essence, crowdsourcing harnesses the time, energy, and talent of individuals – hereafter referred to as “crowd-taskers”. Crowdsourcing allows the involvement of a large number of participants and the processing of huge, unique datasets. As such, crowdsourcing is hyped as a key method of compiling and handling “big data”, one that can be applied to interpretation, coding, and evaluation procedures. Using crowdsourcing, researchers have written and illustrated books, coded masses of text data, and even created survey questions. In short, crowdsourcing opens up exciting new possibilities in knowledge production beyond the scope and scale of traditional research projects, and can be useful at all stages of the research process, from design to write-up.
However, the power of crowdsourcing raises several issues. Some, such as the working conditions of crowd-taskers, are already being discussed. But others have received less attention, including the question of how to transform this business practice into a sound research method, and in particular how crowdsourcing changes the way we interact with research participants and what this means for the knowledge produced.
According to the alluring sales pitches of crowdsourcing platform providers such as Amazon Mechanical Turk, CrowdFlower or Zooniverse, researchers can engage with a seemingly unlimited workforce of knowledgeable, creative, globally-dispersed crowd-taskers. Crowdsourcing has been presented as an almost magical process: all the researcher has to do is input a task and – voilà! – enriched data comes back. Amazon Mechanical Turk promises you will “start receiving results in minutes”, while Workhub offered ways to “use the internet to get a year’s work done in a day.”
Crowdsourcing rhetoric draws heavily on tropes of efficiency, cost reduction, and the potential of technology to support vast networks of creativity. To describe the crowdsourcing process, CrowdFlower’s website uses the image of a launching rocket that returns with results, invoking speed, technological innovation, and a journey into the unknown. This rocket metaphor not only captures the efficiency trope that infuses crowdsourcing, but also suggests a desirable opacity between researcher and contributors: they live on separate planets, too distant to have a clear picture of one another.
Indeed, the great appeal of crowdsourcing is its ability to draw on large numbers of individuals. Because managing huge numbers of people is complex, crowdsourcing reduces them to one faceless crowd. Instead of having to deal with each individual member, the researcher interacts with the crowd itself; this is the essence of what crowdsourcing allows. This complexity-reducing mechanism is seen as the great benefit of crowdsourcing for business, yet becomes inherently problematic when applied to research, as it contradicts the basic idea that we control who participates in our studies, either as part of our sample or as part of our team.
Researchers have tried to examine who the crowd-taskers are. Studies show the crowd consists entirely of neither amateurs nor digital experts, but is more homogeneous and better educated than often pictured. Researchers have also critiqued the often precarious working conditions of crowd-taskers. While these insights are valuable, in-depth knowledge about typical crowd-taskers does not resolve the issue of the faceless crowd when crowdsourcing is used as a method.
The “open call” nature of crowdsourcing means that who participates is mostly beyond the researcher’s control. The large number of participants precludes knowledge of each unique, situated individual, while at the same time the unknowable composition of the crowd, and even its potential homogeneity, challenges scientific rules of representativeness, thereby ruling out or greatly restricting the applicability of crowdsourcing for certain research questions.
With other kinds of research methods – just think about surveys vs. qualitative interviews – we have well-established ideas about the capabilities and composition of our research participants. In qualitative methods, we draw on constructivist ideas about the uniqueness and situatedness of each individual, whose experiences and views of those experiences are in focus. Quantitative methods, by contrast, build on positivist thought, which focuses on objective knowledge and random sampling and holds that society, like nature, follows absolute laws.
Crowdsourcing, however, is not tied to one set of assumptions about the crowd-taskers. Instead, the researcher’s implicit assumptions about the crowd drive the methodological design. This image of the crowd steers the definition of the task, the selection of a platform, the incentives offered, and so on. The underlying and implied image a researcher has of the crowd thus shapes the quality and validity of the knowledge produced.
Thus the great advantage of crowdsourcing, its reduction of many individuals to a single crowd, may be unproblematic for business, but it raises methodological and ethical questions for academia.
When using crowdsourcing, it is the researcher’s responsibility to reflect upon the image of the crowd in order to achieve alignment between methodological assumptions, the research question, and the design of the crowdsourcing process. In current discussions on methodology, pragmatism has been on the rise as a frame of evaluation for what constitutes “good” research and how to think about research participants. In pragmatism, the focus is on reaching the most suitable procedure to answer a research question by constantly questioning, criticising, and improving what one is doing and why, in order to reach the most appropriate (note: not the most true) knowledge on which to act. This is similar to what has been called “reflexivity.”
As researchers and academic knowledge producers, we should not forget the parameters of knowledge production. We need to think about and reflect on the methodological underpinnings of new digital methods. To begin, we should reflect on who and what the “crowd” is and what this means for our particular study. To do so, we can draw on a pragmatist methodology that requires us to be candid about what we do and why, in relation to our end goal. We should remember that crowdsourcing stems from business and the structure of many commonly used platforms will shape our data.
With crowdsourcing come great opportunities, but also great responsibility.