Because clinical documents convey important information and the healthcare system produces large quantities of raw text, extracting and managing meaningful data from real text occurrences has become a key challenge for NLP research. In this paper we approach a corpus of 5000 medical diagnoses with linguistic and computational tools that can access the semantic dimension of the words and sentences it contains. Our morphosemantic method is grounded in a list of neoclassical formative elements pertaining to the medical domain, which has been used for the automatic creation and population of medical lexical resources. The outcomes of this work are automatically built electronic dictionaries and thesauri, and an annotated corpus for NLP in the medical domain.
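To make the idea of a morphosemantic analysis concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny, hypothetical lexicon of neoclassical formative elements (the `ROOTS` and `SUFFIXES` tables and the `analyze` function are illustrative names, and the English entries are an assumption about the language of the data). It shows how a term could be split into formatives and mapped to a coarse semantic description.

```python
# Hypothetical mini-lexicon of neoclassical formatives (illustrative only;
# the list used in the paper is not reproduced here).
ROOTS = {
    "gastr": "stomach",
    "cardi": "heart",
    "hepat": "liver",
}
SUFFIXES = {
    "itis": "inflammation",
    "ectomy": "surgical removal",
    "algia": "pain",
    "pathy": "disease",
}


def analyze(word):
    """Split a term into root + suffix formatives and return their glosses.

    Returns None when the term is not covered by the mini-lexicon.
    """
    w = word.lower()
    # Try longer suffixes first so the most specific match wins.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if not w.endswith(suffix):
            continue
        # Drop the suffix and any linking vowel "o" (as in cardi-o-pathy).
        stem = w[: -len(suffix)].rstrip("o")
        if stem in ROOTS:
            return {
                "formatives": [stem, suffix],
                "body_part": ROOTS[stem],
                "meaning": SUFFIXES[suffix],
            }
    return None


if __name__ == "__main__":
    for term in ["gastritis", "cardiopathy", "hepatectomy", "fever"]:
        print(term, "->", analyze(term))
```

A lookup of this kind could, under the same assumptions, be used to tag each recognized term with a body part and a condition type when populating a dictionary entry; terms outside the lexicon are simply left unanalyzed.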