NaDiRa short study: Racism in plenary debates

Deep Learning for the detection of racism in plenary debates - a feasibility study

National Monitoring of Discrimination and Racism (NaDiRa)

Duration: October 2020 to December 2020
Status: Completed project

Project team:

  • Andreas Blätte
  • Laura Dinnebier
  • Simon Gelhar
  • Emilia Blank
  • Silvia Mommertz

- - -

Project description:

The study deals with the question of what the methodological, technical and conceptual prerequisites are for recording racist utterances and the discursive foundations of racism with text analysis procedures. The focus is on the automated recognition of racist remarks in plenary debates. The study thus explores whether automated text analytic procedures can contribute to the further development of racism research.

Results:

An important conceptual interim result of the project is a category system. It condenses core features of a definition of racism, together with racist topoi from the literature, into three main categories that we used to identify potentially racist sentences: 1) racialisation/essentialisation/naturalisation; 2) inferiority; 3) deviance and threat. Based on this category system, we generated a training dataset of over 20,000 annotated text passages, of which over 1,500 were classified as racist.
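The class balance implied by these figures can be sketched as follows. This is a minimal illustration, not the project's code: the counts are the rounded lower bounds quoted above (the exact numbers are not given), the labels `racist` and `not_racist` are hypothetical, and inverse-frequency weighting is one common way to handle imbalance, assumed here rather than reported by the study.

```python
def inverse_frequency_weights(class_counts):
    """Return one weight per class, inversely proportional to its frequency.

    weight(c) = total / (number_of_classes * count(c)), so a perfectly
    balanced dataset yields a weight of 1.0 for every class.
    """
    if any(count <= 0 for count in class_counts.values()):
        raise ValueError("every class needs at least one example")
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: total / (n_classes * count)
        for label, count in class_counts.items()
    }


# Assumed round figures: 20,000 passages in total, 1,500 of them racist.
counts = {"racist": 1_500, "not_racist": 18_500}

share_racist = counts["racist"] / sum(counts.values())
weights = inverse_frequency_weights(counts)

print(f"share of racist passages: {share_racist:.1%}")
print(f"class weights: {weights}")
```

Under these assumed counts, roughly one passage in thirteen is labelled racist, so a classifier trained without weighting or resampling could reach high accuracy while rarely predicting the minority class.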

Initial assessments of the manual coding indicate that it is mainly MPs from right-wing extremist or so-called right-wing populist groups who make racist statements. Furthermore, a steady increase in racist statements can be observed in German state parliaments, especially after 2015.

Surprising insights:

Our finding that it is mainly right-wing extremist and right-wing populist parliamentarians who make racist statements contradicts the assumption that racist discourses are becoming increasingly normalised. However, this finding has to be read in light of how the sample was drawn. To train the classification algorithm that is meant to recognise racist statements automatically, we needed a large amount of training data containing both racist and non-racist statements. For this reason, the sampling is based on word embeddings. As a result, the sample contains mainly explicit racist statements and hardly any subtler forms.
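Why embedding-based sampling favours explicit statements can be illustrated with a small sketch. It is not the project's actual procedure, which is not described here: the function names, the two-dimensional toy vectors, the placeholder seed term and the ranking by cosine similarity are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(tokens, embeddings):
    """Average the vectors of all tokens found in the embedding table."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def sample_candidates(sentences, seed_terms, embeddings, top_k):
    """Return the top_k sentences closest to the seed terms in embedding space."""
    seed_vec = mean_vector(seed_terms, embeddings)
    if seed_vec is None:
        raise ValueError("no seed term has an embedding")
    scored = []
    for sentence in sentences:
        vec = mean_vector(sentence.lower().split(), embeddings)
        if vec is not None:  # skip sentences without any known token
            scored.append((cosine(seed_vec, vec), sentence))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:top_k]]


# Hypothetical toy embeddings; a real model would have hundreds of dimensions.
EMBEDDINGS = {
    "threat": [0.9, 0.1],
    "danger": [0.8, 0.2],
    "border": [0.7, 0.3],
    "budget": [0.1, 0.9],
    "road": [0.0, 1.0],
}

sentences = [
    "the budget covers the road",
    "danger at the border",
    "road works",
]

selected = sample_candidates(sentences, ["threat"], EMBEDDINGS, top_k=1)
print(selected)
```

In this toy setup only the sentence whose vocabulary lies close to the seed term is selected. A statement that makes the same point with unrelated vocabulary would rank low and rarely reach the annotators, which is the kind of selection effect described above.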

Significance for practice:

The project provides the technical and methodological expertise to (re)evaluate racism over long periods of time. In addition, we make our research data (program code, manual annotations) available to other researchers via the networks and dissemination channels tested in the PolMine project (the open science repository Zenodo, GitHub), within the framework of existing licensing agreements. For example, we have packaged the workflow for exporting the annotation data in a suitable format as an R package (annotask) and are making it available to other researchers promptly on the PolMine GitHub. The annotated training dataset is also being made available.

Short studies in preparation for the racism monitor:

To prepare a comprehensive racism monitor, DeZIM called on researchers from the DeZIM research community in 2020 to develop innovative study ideas. The studies were meant to extend existing research projects, pursue new and innovative approaches, or build an infrastructure for researching racism. By 2021, more than 120 researchers at the six locations of the DeZIM research community had conducted a total of 34 short studies. These are divided into six thematic priorities:

  • Health system
  • Education system and labour market
  • Institutional racism
  • Dealing with experiences of racism
  • Participation and the media
  • Racist ideologies and attitudes

Funding: Federal Ministry for Education, Family Affairs, Senior Citizens, Women and Youth (Third-party funding)