A Knowledge-Based CLIR Model for Specific Domain Collections
Mario Monteleone; Johanna Monti; Maria Pia di Buono
2015
Abstract
The effectiveness of Cross-language Information Retrieval (CLIR) applications depends on the quality of translation: inaccurate or incorrect translations can seriously hinder the retrieval of relevant information. In domain-specific texts, a very frequent source of mistranslation is multiword units (MWUs), particularly terminological compound words. Processing and translating these compounds is not straightforward, since their morpho-syntactic and linguistic behaviour is complex and varies by type, and their translations are practically unpredictable. Our contribution outlines the knowledge-based resources (dictionary, ontology and rules) developed with NooJ and used to build a knowledge-based CLIR system.
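
To illustrate why multiword units matter for query translation, the following is a minimal sketch of dictionary-based translation with longest-match MWU lookup. It is not the authors' system and does not use NooJ; the function name `translate_query`, the `max_len` parameter and the toy Italian-to-English entries are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy resources (illustrative only, not from the paper).
MWU_DICT: Dict[Tuple[str, ...], str] = {
    ("macchina", "da", "scrivere"): "typewriter",
    ("carta", "di", "credito"): "credit card",
}
WORD_DICT: Dict[str, str] = {
    "macchina": "machine",
    "da": "from",
    "scrivere": "write",
    "antica": "old",
}


def translate_query(
    tokens: List[str],
    mwu_dict: Dict[Tuple[str, ...], str],
    word_dict: Dict[str, str],
    max_len: int = 4,
) -> List[str]:
    """Translate a tokenized query, preferring the longest MWU match."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, down to two tokens.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            key = tuple(tokens[i:i + n])
            if key in mwu_dict:
                out.append(mwu_dict[key])
                i += n
                break
        else:
            # No MWU starts here: fall back to single-word lookup,
            # keeping the source token if it is not in the dictionary.
            out.append(word_dict.get(tokens[i], tokens[i]))
            i += 1
    return out


query = "macchina da scrivere antica".split()
with_mwu = translate_query(query, MWU_DICT, WORD_DICT)
word_by_word = translate_query(query, {}, WORD_DICT)
```

With the MWU dictionary, the compound is translated as a single term; without it, the word-by-word output loses the intended meaning, which is the kind of mistranslation the abstract describes.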