Please note the below is a shortened version of the full job specification. For more details please refer to the full Job Description document, which can be downloaded by clicking on the ‘Download full job spec’ button above.
Project Description:
The eSTÓR data repository was initially set up as the National Relay Station (NRS) in 2019 to be the first platform to share bilingual language data nationally. It was created by researchers at the ADAPT Centre through the EU-funded European Language Resource Infrastructure (ELRI) project. On receipt of Irish government funding in 2021, the NRS was redesigned and rebranded as the eSTÓR website (www.estor.ie) in 2022. As eSTÓR, it continues to be used by people working with the Irish language to upload language data to one central location with the ultimate aim of improving translation technology for public administration nationally and across Europe.
The role of the student in this project:
The successful candidate will be primarily involved in cleaning and developing specialised datasets of idioms for use in NLP applications. They will process and validate collections of idioms taken from electronic dictionaries and other lexical resources, usually in XML format.
Subject to quality and licensing checks, subsets of these datasets will be added to the eSTÓR collection for use in downstream NLP applications, including machine translation.
The successful candidate will have a unique opportunity to build new datasets and upload them to the first-ever national digital repository for Irish language data. The eSTÓR project serves as a platform for public servants and supports the development of machine translation (MT) technology at EU level. With additional specialised resources, NLP applications, including MT, can be improved and expanded for use by all members of the public. The candidate will therefore gain insight into:
Research skills including developing and following methodologies, reporting on results, and communicating with a larger research team