Breakthrough Research Shows We Can Reduce Gender Bias in Natural Language AI

09 June 2022
Gender Bias

Machine learning algorithms learn from the training data they receive; however, human biases exist in language data. Bias is a prejudice for or against a person or group that is considered unfair, and it can alter how an algorithm functions, resulting in the same mistakes or assumptions being made over and over. Leading-edge research from the SFI ADAPT Centre for AI-Driven Digital Content Technology aims to mitigate gender bias in technology in ground-breaking new ways. Led by ADAPT Research Engineer Nishtha Jain in collaboration with Microsoft and Imperial College London, the research will be presented at the prestigious European Language Resources Association’s 13th Language Resources and Evaluation Conference (LREC), to be held from June 21-23, 2022 in Marseille, France.

In recent years, studying and mitigating gender and other biases in natural language have become important areas of research from both algorithmic and data perspectives. However, previous work in this area has proved costly and tedious, requiring large amounts of gender-balanced training data.

Speaking about the significance of the research, ADAPT Research Engineer Nishtha Jain said: “From finding a romantic partner to getting that dream job, Artificial Intelligence plays a bigger role in helping shape our lives than ever before. This is why we, as researchers, need to ensure technology is more inclusive from an ethical and socio-political standpoint. Our research is a step in the direction of making AI technology more inclusive regardless of one’s gender.”

ADAPT’s new work leverages pre-trained deep-learning language models to reduce bias in a language-generation context. Because it builds on existing pre-trained models rather than requiring large gender-balanced corpora, the approach is more affordable and less time-consuming than previous methods. It is designed to work across multiple languages with only minimal changes in the form of language-specific heuristics. To demonstrate this, the approach was successfully tested on a high-resource language, Spanish, and a very low-resource language, Serbian, with positive results.
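
To give a flavour of what “pre-trained language model plus lightweight heuristics” can look like in practice, the sketch below uses a masked language model to propose in-context gendered word forms and a small word-pair table to produce a gender-alternative rewrite. This is an illustrative sketch only, not the paper’s implementation: the model choice (multilingual BERT via the Hugging Face transformers library), the Spanish example sentence, and the tiny masculine/feminine lookup table are all assumptions.

```python
# Illustrative sketch: use a pre-trained masked LM to suggest gendered
# nouns in context, then apply a small language-specific heuristic to
# generate a gender-swapped variant of the sentence.
# All names below (model, sentence, word pairs) are assumptions for
# demonstration, not details taken from the research paper.
from transformers import pipeline

# Multilingual BERT covers both Spanish and Serbian; the actual model
# used in the research is not specified here.
fill = pipeline("fill-mask", model="bert-base-multilingual-cased")

# Hypothetical Spanish example: mask a gender-marked noun and ask the
# LM for plausible in-context candidates.
sentence = "La [MASK] presentó los resultados en la conferencia."
candidates = fill(sentence, top_k=20)

# Tiny illustrative heuristic resource: feminine -> masculine noun pairs.
# Porting to another language would mean swapping in a table like this.
gender_pairs = {
    "doctora": "doctor",
    "profesora": "profesor",
    "investigadora": "investigador",
    "directora": "director",
}

for cand in candidates:
    word = cand["token_str"].strip()
    if word in gender_pairs:
        original = sentence.replace("[MASK]", word)
        # Heuristic rewrite: swap the noun and its article together so
        # the variant stays grammatical ("La doctora" -> "El doctor").
        swapped = original.replace(f"La {word}", f"El {gender_pairs[word]}")
        print(f"{original}\n  -> {swapped}  (LM score {cand['score']:.3f})")
```

Because the pre-trained model supplies the in-context candidates, the only language-specific resource such a pipeline needs is a small table of gendered word pairs and article rules, which is what would keep an approach of this kind cheap to port to new languages.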

The research paper is titled “Leveraging Pre-trained Language Models for Gender Debiasing”. Co-authors on the paper include Maja Popović (DCU), Declan Groves (Microsoft) and Lucia Specia (Imperial College London).