Annotating Social Media Data From Vulnerable Populations: Evaluating Disagreement Between Domain Experts and Graduate Student Annotators
- Desmond Patton, Social Work, SAFElab, Columbia University, New York, New York, United States
- Philipp Blandfort, TUK, Kaiserslautern, Germany
- William Frey, Social Work, SAFElab, Columbia University, New York, New York, United States
- Michael Gaskell, Social Work, SAFElab, Columbia University, New York, New York, United States
- Svebor Karaman, Electrical Engineering/Computer Science, Columbia University, New York, New York, United States
Abstract
Researchers in computer science have spent considerable time developing methods to increase the accuracy and richness of annotations. However, there is a dearth of research that examines the positionality of the annotator, how they are trained, and what we can learn from disagreements between different groups of annotators. In this study, we use qualitative, statistical, and computational methods to compare annotations between Chicago-based domain experts and graduate students who annotated a total of 1,851 tweets with images. These tweets are part of a larger corpus associated with the Chicago Gang Intervention Study, which aims to develop a computational system that detects aggression and loss among gang-involved youth in Chicago. We found evidence to support the study of disagreement between annotators and underscore the need for domain expertise when reviewing Twitter data from vulnerable populations. Implications for annotation and content moderation are discussed.
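The abstract does not specify which agreement statistic the authors used; a common choice for quantifying disagreement between two annotator groups on categorical labels (such as aggression vs. loss vs. other) is Cohen's kappa, which corrects raw agreement for agreement expected by chance. A minimal sketch, with hypothetical label sequences:

```python
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa for two annotators' label sequences of equal length.

    po = observed agreement rate; pe = agreement expected by chance,
    computed from each annotator's label marginals.
    """
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (po - pe) / (1 - pe)

# Hypothetical labels from a domain expert and a graduate student annotator
# (the label set mirrors the study's aggression/loss categories):
expert  = ["aggression", "loss", "other", "aggression", "other"]
student = ["aggression", "loss", "other", "other", "other"]
print(cohen_kappa(expert, student))
```

Kappa near 1 indicates strong agreement beyond chance; values near 0 suggest the annotators' labels are no better aligned than random labeling with the same marginals, which is the kind of systematic divergence between expert and non-expert annotators the study examines.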