Data Mining Modular Software System
ELIA, Annibale; VIETRI, Simonetta; POSTIGLIONE, Alberto; MONTELEONE, Mario; MARANO, Federica
2010-01-01
Abstract
This paper illustrates how efficient text mining can be achieved through syntactic ontology building combined with morphosyntactic and terminology-based software for automatic textual analysis. It also makes a factual contribution to the debate on the Semantic Web (SW), from a perspective based on the possibility of building ontologies from the syntactic structures in which they occur. To address these topics, we adopt the theoretical and practical analytical framework of Lexicon-Grammar. Four main themes are analyzed and discussed: 1. the formalization of semantic predicate properties, based on co-occurrence and selection restriction rules; 2. the structure and application of generic and terminological electronic dictionaries; 3. the building and application of finite-state automata and transducers; 4. the incorporation of all the preceding lingware elements into two specific textual analysis software packages, used to achieve text mining, morphosyntactic parsing, and terminological information retrieval.
Keywords: data mining, semantic web, lexicon-grammar