Collect Data on Social Media

By Janet Salmons, Ph.D.

Dr. Salmons is the author of Doing Qualitative Research Online and Gather Your Data Online. Use the code COMMUNIT24 for 25% off research books purchased from Sage through December 31, 2024.


Social Media and Empirical Research

From the moment social media platforms began to welcome user-generated content, researchers have looked for ways to study it. Some researchers look at these platforms as sites where they can gain insights into the perspectives of users, while others are interested in the features and functions of the platforms themselves. Some researchers want to interact with consenting participants in social media groups and forums, while others are interested in what public users choose to post unprompted by a researcher’s queries.

Researchers have been challenged to find ethical ways to use the text, images, and videos that users post on social media sites. They are also challenged by the fact that these commercial platforms alter their access policies. Platforms are bought and sold, changing their identities, functions, and attractiveness to users. Twitter is an example: as Twitter it was popular with researchers, but its new identity as X and more restrictive policies governing research mean that little new scholarship is being published.

These multidisciplinary collections of open access articles explore social media characteristics and modes of study.


Social Media Platforms as Research Sites

Langlois, G. (2015). What Are the Stakes in Doing Critical Research on Social Media Platforms? Social Media + Society, 1(1). https://doi.org/10.1177/2056305115591178

Abstract. One of the key challenges facing social media studies is the capacity to undertake independent, critical research. As the field of social media is massively dominated by corporate players with a vested interest in both privatizing social data and developing proprietary social analytical tools, it is now crucial to advocate for a politics of public social media research.

Pulido Rodriguez, C. M., Ovseiko, P., Font Palomar, M., Kumpulainen, K., & Ramis, M. (2021). Capturing Emerging Realities in Citizen Engagement in Science in Social Media: A Social Media Analytics Protocol for the Allinteract Study. International Journal of Qualitative Methods, 20. https://doi.org/10.1177/16094069211050163

Abstract. In the digital era, social media has become a space for the socialization and interaction of citizens, who are using social networks to express themselves and to discuss scientific advances with citizens from all over the world. Researchers are aware of this reality and are increasingly using social media as a source of data to explore citizens’ voices. In this context, the methods followed by researchers are mainly based on the content analysis using manual, automated or combined tools. The aim of this article is to share a protocol for Social Media Analytics that includes a Communicative Content Analysis (CCA). This protocol has been designed for the Horizon 2020 project Allinteract, and it includes the social impact in social media methodology. The novel contribution of this protocol is the detailed elaboration of methods and procedures to capture emerging realities in citizen engagement in science in social media using a Communicative Content Analysis (CCA) based on the contributions of Communicative Methodology (CM).

Did you know that the Sage journal Social Media + Society is entirely open access?

Willaert, T., Van Eecke, P., Beuls, K., & Steels, L. (2020). Building Social Media Observatories for Monitoring Online Opinion Dynamics. Social Media + Society, 6(2). https://doi.org/10.1177/2056305119898778

Abstract. Social media house a trove of relevant information for the study of online opinion dynamics. However, harvesting and analyzing the sheer overload of data that is produced by these media poses immense challenges to journalists, researchers, activists, policy makers, and concerned citizens. To mitigate this situation, this article discusses the creation of (social) media observatories: platforms that enable users to capture the complexities of social behavior, in particular the alignment and misalignment of opinions, through computational analyses of digital media data. The article positions the concept of “observatories” for social media monitoring among ongoing methodological developments in the computational social sciences and humanities and proceeds to discuss the technological innovations and design choices behind social media observatories currently under development for the study of opinions related to cultural and societal issues in European spaces. Notable attention is devoted to the construction of Penelope: an open, web-services-based infrastructure that allows different user groups to consult and contribute digital tools and observatories that suit their analytical needs. The potential and the limitations of this approach are discussed on the basis of a climate change opinion observatory that implements text analysis tools to study opinion dynamics concerning themes such as global warming. Throughout, the article explicitly acknowledges and addresses potential risks of the machine-guided and human-incentivized study of opinion dynamics. Concluding remarks are devoted to a synthesis of the ethical and epistemological implications of the exercise of positioning observatories in contemporary information spaces and to an examination of future pathways for the development of social media observatories.

Rada Mihalcea, PhD, Professor of Computer Science at the University of Michigan, discusses advantages and limitations of using social media to acquire data.

Facebook

Ben-David, A. (2020). Counter-archiving Facebook. European Journal of Communication, 35(3), 249-264. https://doi.org/10.1177/0267323120922069

Abstract. The article proposes archival thinking as an analytical framework for studying Facebook. Following recent debates on data colonialism, it argues that Facebook dialectically assumes a role of a new archon of public records, while being unarchivable by design. It then puts forward counter-archiving – a practice developed to resist the epistemic hegemony of colonial archives – as a method that allows the critical study of the social media platform, after it had shut down researcher’s access to public data through its application programming interface. After defining and justifying counter-archiving as a method for studying datafied platforms, two counter-archives are presented as proof of concept. The article concludes by discussing the shifting boundaries between the archivist, the activist and the scholar, as the imperative of research methods after datafication.

Dalyot, K., Rozenblum, Y., & Baram-Tsabari, A. (2022). Engagement patterns with female and male scientists on Facebook. Public Understanding of Science, 31(7), 867-884. https://doi.org/10.1177/09636625221092696

Abstract. Social networks are becoming powerful agents mediating between science and the public. Considering the public tendency to associate science with men makes investigating representations of female scientists in social media important. Here we set out to find whether the commenting patterns to text-based science communication are similar. To examine these, we collected and analyzed posts (165) and their comments (10,006) published between 2016 and 2018 on an Israeli popular science Facebook page. We examined post characteristics as well as the relevance and sentiment of comments. Several gendered differences in commenting patterns emerged. Posts published by female scientists received more irrelevant and fewer relevant comments. Female scientists received more hostile and positive comments. These findings are consistent with results of previous research, but also demonstrate a more nuanced understanding that when female scientists write using scientific jargon (usually an unwanted feature of popular science writing), they received less hostile comments and were given less advice.

Schlussel, H., & Frosh, P. (2023). The taste of video: Facebook videos as multi-sensory experiences. Convergence, 29(4), 980-996. https://doi.org/10.1177/13548565231179958

Abstract. Recipe videos are among the most viral genres of videos on social media. Yet, little research has been done on their aesthetic and formal attributes, especially on how they operate within the frameworks of the attention economy and embodied interaction specific to social media interfaces. This paper examines recipe videos published on Tasty, one of the most popular Facebook pages in the world. We analyze these videos through a three-dimensional model that integrates their semiotic characteristics (visual, auditory, and textual), their interactive and haptic qualities, and their invitation to perceptual engagement and sensorimotor response. We conclude that Facebook recipe videos are exemplary of a broader category of social media videos which we call hyper-sensory videos: these create heightened multisensory experiences that take precedence over informational use or narrative involvement. Hyper-sensory videos present a cultural response to broader questions regarding materiality, presence, and embodied relations within a highly mediated social reality.

Sobol, M., Blachnio, A., & Pasternak, K. (2023). Time in a Virtual World: Facebook Intrusion, Time Perspective, and Contents of Facebook Narratives. SAGE Open, 13(3). https://doi.org/10.1177/21582440231184860

Abstract. Online social networking sites are places where users spend a lot of their time and leave a huge number of narratives. In the virtual world of Facebook, moving in a space is limited to only a slight hand movement. The aim of this study was to examine the relationship between the level of Facebook intrusion, time perspective, and the contents of Facebook narratives. The participants were 83 Polish Facebook users, aged 18 to 62 years. We used the Facebook Intrusion Questionnaire, the Zimbardo Time Perspective Inventory, and the Linguistic Inquiry and Word Count program. The results suggest that Facebook intrusion is connected with the low future perspective of Facebook users. Moreover, the stronger the Facebook intrusion, the more swear words in Facebook narratives. There were significant positive relationships between a present fatalistic time perspective and words expressing aggression in Facebook narratives. The results were interpreted in the context of the theory of embodied cognition.

Instagram

Adriaansen, R.-J. (2020). Picturing Auschwitz. Multimodality and the attribution of historical significance on Instagram (Imaginando Auschwitz. La multimodalidad y la atribución de significado histórico en Instagram). Journal for the Study of Education and Development, 43(3), 652-681. https://doi.org/10.1080/02103702.2020.1771963

Abstract. As social media have become a prime means of communication among students, so too are they increasingly used to give meaning to the past. But while history education scholars tend to conceptualize historical meaning-making as acts of narrative emplotment, social media are multimodal by nature, and some platforms — like Instagram — prioritize images over written text. This article focuses on the question of how Instagram users attribute historical significance in posts that feature the Auschwitz-Birkenau Memorial and Museum, by analysing the various ways in which they combine images and captions to give meaning to the past. Subsequently, the potential of multimodal social media representations for history education is explored.

Caliandro, A., & Graham, J. (2020). Studying Instagram Beyond Selfies. Social Media + Society, 6(2). https://doi.org/10.1177/2056305120924779

Abstract. Of approximately 40 billion photos posted on Instagram, only 282 million are selfies—just 0.7%. Thus, for all its zeitgeisty appeal, the selfie is in fact a niche phenomenon in the larger context of Instagram genres. Noting this fact, we identified an opportunity to engage with scholars from all over the world with the challenge of proposing new theories and approaches capable of unlocking the full socio-anthropological potential of Instagram. As the articles collected in this special issue testify, Instagram has an enormous impact on people’s everyday lives on many levels—socially, culturally, economically, and politically—and so, indubitably, it deserves rigorous academic attention. Given the scale and complexity of these impacts, studying Instagram poses different kinds of challenges and many gaps are still to be filled. With this special issue we don’t claim to resolve all the issues noted above. More modestly, our goal is to kick-start a fruitful conversation among social media scholars interested in furthering Instagram studies.

Dutta, A., & Sharma, A. (2023). Netnography and instagram community: An empirical study. Business Information Review, 40(1), 33-37. https://doi.org/10.1177/02663821231157501

Abstract. Netnography is a methodical tool which is picking up the pace in the online communities. It originates from the field of ethnography and anthropology. The experiences shared by the online communities and interactions of the participants are highlighted by this revelatory method. This study will observe the effects of Instagram in evolving the trust of users, activities of netnography and their use among the Instagram community. The study aims to explore the opportunities of Instagram ethnography while considering the present literature and its inspection as both a research tool and an engaging platform. This research will also give the outlook of using analytical tools like Instagram analytics or Google analytics for amalgamation of the data.

Rathnayake, C., & Ntalla, I. (2020). “Visual Affluence” in Social Photography: Applicability of Image Segmentation as a Visually Oriented Approach to Study Instagram Hashtags. Social Media + Society, 6(2). https://doi.org/10.1177/2056305120924758

Abstract. The aim of the study is to examine the applicability of image segmentation—identification of objects/regions by partitioning images—to examine online social photography. We argue that the need for a meaning-independent reading of online social photography within social markers, such as hashtags, arises due to two characteristics of social photography: (1) internal incongruence resulting from user-driven construction and (2) variability of content in terms of visual attributes, such as color combinations, brightness, and details in backgrounds. We suggest visual affluence—plenitude of visual stimuli, such as objects and surfaces containing a variety of color regions, present in visual imagery—as a basis for classifying visual content and image segmentation as a technique to measure affluence. We demonstrate that images containing objects with complex texture and background patterns are more affluent, while images that include blurry backgrounds are less affluent than others. Moreover, images that contain letters and dark, single-color backgrounds are less affluent than images that include subtle shades. Mann–Whitney U test results for ten pairs of hashtags showed that eight pairs had significant differences in visual affluence. The proposed measure can be used to encourage a “visually oriented” turn in online social photography research that can benefit from hybrid methods that are able to extrapolate micro-level findings to macro-level effects.

Reddit

Hagen, S. (2023). No Space for Reddit Spacing: Mapping the Reflexive Relationship Between Groups on 4chan and Reddit. Social Media + Society, 9(4). https://doi.org/10.1177/20563051231216960

Abstract. 4chan and Reddit have often been lumped together as similar home turfs for geeky, masculine-coded, problematic communities thriving under laissez-faire governance. However, stressing these similarities may overlook not only how the platforms have drifted apart in political-economic terms but also how their similarity encourages assertions of difference between its users. In dialogue with theories on ritual opposition and platform imaginaries, I interrogate this dialectic by tracing the relationship between groups on 4chan and Reddit. How has this relationship developed over time and between subgroups? What do the fractured, fluctuating cross-site associations teach us about the collectivity of both sites? By quantitatively mapping cross-mentions in large archives of Reddit, 4chan/b/, and 4chan/pol/, I identify a lopsided rivalry: 4channers consistently employed antagonistic phrases and stereotypes of Reddit, but 4chan’s relevance throughout Reddit is waning. I moreover find that a platform imaginary of 4chan as neutral, diverse, and unfiltered contradicts with the incessant discursive hostility of its users. The text thereby demonstrates how the collectivization of online subcultures is shaped by reflexive cross-site relations that feature a complex interplay between discursive boundary work, contrasting platform vernaculars, and political resentment.
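For readers new to this kind of analysis, here is a minimal sketch of what counting cross-mentions between two platforms could look like. It is not the author's method: the sample posts, the search patterns, and the function name `cross_mention_rates` are all hypothetical, and a real study would work from large archives with far more careful matching.

```python
import re
from collections import Counter

# Hypothetical sample posts as (platform, text) pairs. Not real data.
POSTS = [
    ("4chan", "reddit users would never understand this"),
    ("4chan", "go back to Reddit"),
    ("reddit", "saw this on 4chan first"),
    ("reddit", "nice weather today"),
]

# Illustrative patterns: what counts as a mention of the *other* platform.
MENTION_PATTERNS = {
    "4chan": re.compile(r"\breddit\b", re.IGNORECASE),
    "reddit": re.compile(r"\b4chan\b", re.IGNORECASE),
}


def cross_mention_rates(posts):
    """Return, per platform, the share of posts that mention the other platform."""
    totals = Counter()
    mentions = Counter()
    for platform, text in posts:
        totals[platform] += 1
        if MENTION_PATTERNS[platform].search(text):
            mentions[platform] += 1
    return {platform: mentions[platform] / totals[platform] for platform in totals}


print(cross_mention_rates(POSTS))
```

Computing these shares separately for each month or year would give a simple picture of how one platform's attention to the other changes over time.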

Proferes, N., Jones, N., Gilbert, S., Fiesler, C., & Zimmer, M. (2021). Studying Reddit: A Systematic Overview of Disciplines, Approaches, Methods, and Ethics. Social Media + Society, 7(2). https://doi.org/10.1177/20563051211019004

Abstract. This article offers a systematic analysis of 727 manuscripts that used Reddit as a data source, published between 2010 and 2020. Our analysis reveals the increasing growth in use of Reddit as a data source, the range of disciplines this research is occurring in, how researchers are getting access to Reddit data, the characteristics of the datasets researchers are using, the subreddits and topics being studied, the kinds of analysis and methods researchers are engaging in, and the emerging ethical questions of research in this space. We discuss how researchers need to consider the impact of Reddit’s algorithms, affordances, and generalizability of the scientific knowledge produced using Reddit data, as well as the potential ethical dimensions of research that draws data from subreddits with potentially sensitive populations.

Richard, B., Sivo, S. A., Ford, R. C., Murphy, J., Boote, D. N., Witta, E., & Orlowski, M. (2021). A Guide to Conducting Online Focus Groups via Reddit. International Journal of Qualitative Methods, 20. https://doi.org/10.1177/16094069211012217

Abstract. Now more than ever there exists a need to conduct data collection online in a safe environment while ensuring that methodological rigor is not sacrificed. Widely available online platforms allow for text-based focus groups to be conducted quickly, easily, and efficiently, but protocols must be maintained to ensure they do not descend into casual observation of naturally occurring conversations. Various online platform options and their merits are discussed. Reddit is provided as a case study to illustrate the steps through which researchers can conduct an asynchronous online focus group. Key opportunities such as a similar quality of results, a lower cost, easier recruitment, and the ability to accommodate more sensitive topics are discussed, as well as challenges including a stigma against online focus groups, when they are most appropriate, and the potential for deviant behavior.

Rocha-Silva, T., Nogueira, C., & Rodrigues, L. (2023). Passive data collection on Reddit: a practical approach. Research Ethics, 0(0). https://doi.org/10.1177/17470161231210542

Abstract. Since its onset, scholars have characterized social media as a valuable source for data collection since it presents several benefits (e.g. exploring research questions with hard-to-reach populations). Nonetheless, methods of online data collection are riddled with ethical and methodological challenges that researchers must consider if they want to adopt good practices when collecting and analyzing online data. Drawing from our primary research project, where we collected passive online data on Reddit, we explore and detail the steps that researchers must consider before collecting online data: (1) planning online data collection; (2) ethical considerations; and (3) data collection. We also discuss two atypical questions that researchers should also consider: (1) how to handle deleted user-generated content; and (2) how to quote user-generated content. Moving on from the dichotomous discussion between what is public and private data, we present recommendations for good practices when collecting and analyzing qualitative online data.
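The question of deleted content has a practical side as well as an ethical one. Reddit displays the placeholders "[deleted]" and "[removed]" where content is no longer available, so a collected dataset can be screened for them. The sketch below shows one possible screening step. It is an assumption for illustration, not the authors' recommendation, and the record layout (dictionaries with `id` and `body` keys) and the function name `split_deleted` are hypothetical.

```python
# Placeholder strings Reddit shows in place of deleted or removed content.
PLACEHOLDERS = {"[deleted]", "[removed]"}


def split_deleted(comments):
    """Separate comments whose text is still available from those that are gone."""
    kept, dropped = [], []
    for comment in comments:
        body = comment.get("body", "").strip()
        if not body or body in PLACEHOLDERS:
            dropped.append(comment)
        else:
            kept.append(comment)
    return kept, dropped


# Hypothetical records, not real data.
sample = [
    {"id": "a1", "body": "This helped me a lot."},
    {"id": "a2", "body": "[deleted]"},
    {"id": "a3", "body": "[removed]"},
    {"id": "a4", "body": "Thanks for sharing."},
]

kept, dropped = split_deleted(sample)
print(f"kept {len(kept)} comments, set aside {len(dropped)}")
```

Keeping the set-aside records in a separate list, rather than discarding them silently, lets the researcher report how much content was excluded and why.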

TikTok

Bhandari, A., & Bimo, S. (2022). Why’s Everyone on TikTok Now? The Algorithmized Self and the Future of Self-Making on Social Media. Social Media + Society, 8(1). https://doi.org/10.1177/20563051221086241

Abstract. The video-sharing social media platform TikTok has experienced a rapid rise in use since its release in 2016. While its popularity is undeniable, at the first glance, it seems to offer features already available on previously existing and well-established platforms such as Instagram, YouTube, and Facebook. To understand processes of self-making on TikTok, we undertake two methods of data collection: a walkthrough of the app and its surrounding environment, and 14 semistructured participant interviews. A qualitative analysis of this data finds three distinct themes emerge: (1) awareness of the algorithm, (2) content without context, and (3) self-creation across platforms. These results show that TikTok departs from existing platforms in the model of self-making it engenders, which we term “the algorithmized self”—a complication of the pre-existing “networked self” framework.

Schellewald, A. (2023). Understanding the popularity and affordances of TikTok through user experiences. Media, Culture & Society, 45(8), 1568-1582. https://doi.org/10.1177/01634437221144562

Abstract. In this paper I discuss the affordances and popularity of the short-video app TikTok from an audience studies point of view. I do so by drawing on findings from ethnographic fieldwork with young adult TikTok users based in the United Kingdom that was conducted in 2020 and 2021. I trace how using the app, specifically scrolling through the TikTok For You Page, the app’s algorithmic content feed, became a fixed part of the everyday routines of young adults. I show how TikTok appealed to them as a convenient means of escape and relief that they were unable to find elsewhere during and beyond times of lockdown. Further, I highlight the complex nature of TikTok as an app and the active role that users play in imagining and appropriating the app’s affordances as meaningful parts of their everyday social life. Closing the paper, I reflect on future directions of TikTok scholarship by stressing the importance of situated audience studies.

Yeung, A., Ng, E., & Abi-Jaoude, E. (2022). TikTok and Attention-Deficit/Hyperactivity Disorder: A Cross-Sectional Study of Social Media Content Quality. The Canadian Journal of Psychiatry, 67(12), 899-906. https://doi.org/10.1177/07067437221082854

Abstract. Social media platforms are increasingly being used to disseminate mental health information online. User-generated content about attention-deficit/hyperactivity disorder (ADHD) is one of the most popular health topics on the video-sharing social media platform TikTok. We sought to investigate the quality of TikTok videos about ADHD. The top 100 most popular videos about ADHD uploaded by TikTok video creators were classified as misleading, useful, or personal experience. Descriptive and quantitative characteristics of the videos were obtained. The Patient Education Materials Assessment Tool for Audiovisual Materials (PEMAT-A/V) and Journal of American Medical Association (JAMA) benchmark criteria were used to assess the overall quality, understandability, and actionability of the videos.

Zhao, Y. (2024). TikTok and Researcher Positionality: Considering the Methodological and Ethical Implications of an Experimental Digital Ethnography. International Journal of Qualitative Methods, 23. https://doi.org/10.1177/16094069231221374

Abstract. In this article I examine the opportunities and challenges arising from an experimental digital ethnography I conducted as a digital content creator in response to social restrictions during COVID-19. To explore the perceptions and performances of masculinity among young Uzbek men in Uzbekistan, I created 50 TikTok videos between 2021 and 2022. These videos received more than 300,000 likes in total, not only significantly broadening the reach of my research recruitment but also serving as a substantial source of ethnographic data during the pandemic. Throughout the creation of these digital videos, I assumed a dual role as an agent in the research and an object of observation. This dual role underscores the agency of both researchers and the researched in navigating the digital platform, which allows for the challenging of conventional research gazes and relationships. This digital approach also unveils the complex spatial dynamics that underlie interactions in both online and offline realms, shedding light on how digital platforms can both enhance and constrain research efforts. Moreover, this article delves into the ethical implications of this experimental digital ethnography, which revolve around potential physical and mental risks to researchers, challenges related to the re-definition of research participation, and issues pertaining to obtaining informed consent. The findings provide insights and make contributions to problematising the conceptualisation of digital spaces, online communities/publics and digital ethnography. I conclude by offering insights for researchers who face restrictions in field access or are interested in studying youth culture on social media platforms, particularly in the role of a content creator, an area that has been relatively underexplored in previous research.

