Irish researchers are working with colleagues across the EU to tackle language inequality for minority languages online.
Siri cannot speak Irish. Neither can the autocorrect on your phone. Whether it’s Irish, Croatian, Lithuanian, Basque or Maltese, technologically aided communication is much more difficult in some languages than in others, such as English or German. Maybe this was trivial ten years ago, but in our increasingly digital world, the situation is believed to pose grave risks to the future of the Irish language and other European languages.
There are currently more than 21 European languages in danger of ‘digital extinction’, which occurs when a language becomes less relevant in daily digital life and subsequently less spoken offline, an already shrinking space. In 2018, the European Parliament passed the Language Equality in the Digital Age resolution. This led to the establishment of the European Language Equality Project (ELE), a 52-partner research programme led by ADAPT DCU and coordinated by ADAPT deputy director Professor Andy Way, which aims to help Europe achieve full digital language equality by 2030.
“It is ambitious,” says ADAPT DCU Research Fellow Dr Teresa Lynn, an expert in computational linguistics and author of the ELE’s Irish Language Report, which was published earlier this month. “If you’re in the field, you know that means there’s a lot of work to be done in the next eight years.”
“Right now we’re identifying the gaps for over 80 EU languages in order to get a clear picture of how things stand, before setting out a strategic agenda and roadmap. Everything in this space happens for English first, because the investment usually comes from the big tech companies; new technologies are always driven by market demands.”
Relying on the market in this context, however, was never going to work out for comparatively smaller languages. In the Irish language context, where all Irish speakers also speak English, Gaeilgeoirí are forced out of using Irish in many contexts due to the technical difficulties that come with it.
Dr Lynn says: “People will shift to English and say, ‘autocorrect or predictive text drives me mad’ just because it doesn’t recognise Irish, or ‘I can’t speak to Siri or Alexa in Irish, I’ll have to speak in English instead’ and, slowly, you get this language shift happening – unwillingly, maybe unknowingly. So, as much as people might want to use Irish in their daily lives, the technology is almost forcing them into English. That’s just a simple example of what’s going to happen across Europe, if something doesn’t change in terms of the technologies made available.”
Web campaign tools like SurveyMonkey, according to Dr Lynn, have options like a “smart assistant component, where it will analyse your campaign, read what you put into the emails you’re sending and come back and say, ‘these are the key words for a successful campaign, these are the important things, or you should shorten the phrase’, and so on”.
That’s fine if your campaign is in English, because it’s able to understand English and make these recommendations.
“This type of language AI doesn’t exist for Irish,” she continues. “You might think that’s no big deal now, but in the future it is likely to be. And we should ask ourselves: why can’t those who want to do campaigns in Irish avail of the same advantages that those who are doing it in English avail of?”
Getting algorithms to learn how to deal with a morphologically rich and unique language like Irish, so that they can perform translations, auto-generate subtitles, give good search results or automate help systems, requires a lot of work. One of the reasons this technology doesn’t exist for Irish is that many of these kinds of tools are built using machine learning, where the AI systems are trained in how to perform these linguistic tasks. For example, machine translation systems first need to have seen huge amounts of professionally translated text in order to make new translation predictions.
In addition, for other tools, the training text or speech data might need to be given extra linguistic information which would require annotators trained in linguistics. There are very few people qualified to do this for the Irish language. What’s worrying is not necessarily the situation Irish speakers find themselves in right now. What’s worrying is how much digital language inequality has grown so far and the potential it has to develop further.
According to the ELE’s Technology Deep Dive paper published in February, it is predicted that by 2025, 50% of knowledge workers will use an AI-based virtual assistant, a technology available for major European languages but not currently for minority ones like Irish. Beyond the Irish context, the ELE points out how this would exacerbate an economic divide in which some countries gain an advantage while others, where no major European language is commonly spoken, lag behind.
“This is why the European Commission are saying okay, we have to do something to address this because technology investment can’t just be market-driven alone,” Dr Lynn points out. “But it’s not as simple as people sitting back and waiting for this inequality to disappear now that the ELE exists,” she says. “There needs to be a mindset change of Irish people realising that this really is an issue, of younger Irish speakers taking an interest and saying I want to study computer science because I want to build these systems for Irish.”
The ELE knows the urgency of this situation, and understands that the power to change things also lies in the hands of people – those with the tech expertise who can build these systems in different languages and those who want their language supported.
“Take Ireland for example, you’ve got large multinational tech companies enjoying the advantages that come with having Irish EU headquarters, but really, are they taking enough interest in supporting the local Irish language?” says Dr Lynn.
Since the markets work off demand, she asks, “is there also enough demand coming from Irish people saying, why is our language not being considered?”
Hear Dr Lynn speak on The Good Information Project’s Open Newsroom webinar panel on this and other challenges to language equity in the EU.
Read Dr Lynn’s full Irish Language Report here.