Any researcher who has used digital archives to search for books or other resources will recognise the huge benefits that new technologies have brought. Many collections of historical manuscripts, however, are available in image format and require extensive manual processing before they can be searched. New research by the ADAPT Centre, carried out as part of the Beyond 2022 project, will enable full-text retrieval of historical handwritten document images through a novel semantic search system. The research was recently published on Springer.com.
The researchers' semantic full-text search system for images of historical handwritten manuscripts is a significant improvement on keyword search: it improves precision and recall by ‘understanding’ the user’s intent and the contextual meaning of concepts in documents and queries.
Speaking about the development, Professor Conlan said: “Current systems use keyword extraction from images; however, it is a challenge to recognise text with high accuracy. We wanted to go beyond the current technology to increase search performance and provide contextual references for the historian. Our novel semantic full-text search system for images of historical handwritten manuscripts is based on named entity, keyword, and knowledge graphs to help processing, storage and automatic indexing of the manuscript, which will allow users to quickly access and retrieve manuscripts that are of interest to them.”
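To give a sense of the difference between keyword search and a search informed by named entities and a knowledge graph, here is a minimal Python sketch. It is an illustration only and not the ADAPT Centre's implementation: the tiny knowledge graph, the spelling variants, the sample manuscript texts and all function names are assumptions made up for this example, and it presumes the manuscript images have already been transcribed to text.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph: each entity ID lists the surface forms
# (aliases) under which it might appear in a transcription.
KNOWLEDGE_GRAPH = {
    "place:dublin": {"dublin", "dyvelin", "dublinia"},
    "place:cork": {"cork", "corke"},
}

# Reverse lookup from alias to entity ID.
ALIAS_TO_ENTITY = {
    alias: entity
    for entity, aliases in KNOWLEDGE_GRAPH.items()
    for alias in aliases
}

PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, lowercase."""
    tokens = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [token for token in tokens if token]


def build_indexes(documents):
    """Build a keyword index and an entity index over transcribed texts."""
    keyword_index = defaultdict(set)
    entity_index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            keyword_index[token].add(doc_id)
            entity = ALIAS_TO_ENTITY.get(token)
            if entity is not None:
                entity_index[entity].add(doc_id)
    return keyword_index, entity_index


def keyword_search(query, keyword_index):
    """Return documents containing any query token exactly as typed."""
    results = set()
    for token in tokenize(query):
        results |= keyword_index.get(token, set())
    return results


def semantic_search(query, keyword_index, entity_index):
    """Keyword search plus documents that mention the same entity."""
    results = keyword_search(query, keyword_index)
    for token in tokenize(query):
        entity = ALIAS_TO_ENTITY.get(token)
        if entity is not None:
            results |= entity_index.get(entity, set())
    return results


# Made-up sample transcriptions, for illustration only.
DOCUMENTS = {
    "ms1": "Grant of lands near Dyvelin to the abbey.",
    "ms2": "Account of customs collected at Corke.",
    "ms3": "Letter sent from Dublin to the treasurer.",
}

keyword_index, entity_index = build_indexes(DOCUMENTS)
print(sorted(keyword_search("Dublin", keyword_index)))
print(sorted(semantic_search("Dublin", keyword_index, entity_index)))
```

In this toy setup, a keyword query for "Dublin" finds only the manuscript that uses that exact spelling, while the entity-aware query also returns the manuscript that refers to the same place under a variant spelling. That is the kind of gain in recall the researchers describe, shown here on a much simpler scale.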
Dr Peter Crooks, Founding Director of the Beyond 2022 project and lecturer in the Department of History at Trinity College Dublin, said: “This type of technology has the potential to transform how researchers engage with archival collections, and we are really excited by the potential for semantic search when we are recreating the National Archives of Ireland that were lost in 1922.”
One of the primary outcomes of the Beyond 2022 project will be a fully immersive, three-dimensional virtual reality model of the digitally reconstructed Public Record Office of Ireland. This model will be used as an interactive tool for engagement and research, whereby visitors will be able to browse virtual shelves and link to substitute or salvaged records held by archives and libraries around the world.