Data Mining Modular Software System

Elia, Annibale; Vietri, Simonetta; Postiglione, Alberto; Monteleone, Mario; Marano, Federica

This paper shows how efficient text mining can be achieved through syntactic ontology building, combined with software for morphosyntactic and terminology-based automatic text analysis. It also makes a concrete contribution to the debate on the Semantic Web (SW), from a perspective in which ontologies can be elaborated starting from the syntactic structures in which they occur. To address these topics, we adopt the theoretical and practical analytical framework of Lexicon-Grammar. Four main themes are analyzed and discussed: 1. the formalization of semantic predicate properties, based on co-occurrence and selection restriction rules; 2. the structure and application of generic and terminological electronic dictionaries; 3. the building and application of finite-state automata and transducers; 4. the incorporation of all these lingware elements into two specific text analysis software packages, used to achieve text mining, morphosyntactic parsing, and terminological information retrieval.

Keywords: data mining, semantic web, lexicon-grammar
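
As a rough illustration of how themes 2 and 3 could work together, the following minimal sketch compiles a toy terminological dictionary into a token-level finite-state transducer and applies it to a sentence with longest-match lookup. It is not the software described in the paper: the dictionary entries, the tag strings (such as `N+Econ`), and the names `TERMS`, `State`, `build_transducer`, and `annotate` are all hypothetical and chosen only for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy terminological dictionary: multiword term -> tag.
# Entries and tag strings are invented for illustration only.
TERMS: Dict[str, str] = {
    "credit card": "N+Econ",
    "credit card fraud": "N+Econ+Law",
    "search engine": "N+Comp",
}


class State:
    """One state of the transducer; transitions are labelled by tokens."""

    def __init__(self) -> None:
        self.next: Dict[str, "State"] = {}
        self.output: Optional[str] = None  # tag emitted if the state is final


def build_transducer(terms: Dict[str, str]) -> State:
    """Compile the dictionary into a deterministic, trie-shaped transducer."""
    start = State()
    for term, tag in terms.items():
        state = start
        for token in term.split():
            state = state.next.setdefault(token, State())
        state.output = tag
    return start


def annotate(text: str, start: State) -> List[Tuple[str, str]]:
    """Return (term, tag) pairs found in the text, preferring the longest match."""
    tokens = text.lower().split()
    found: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        state = start
        best: Optional[Tuple[int, str]] = None  # (end index, tag)
        j = i
        while j < len(tokens) and tokens[j] in state.next:
            state = state.next[tokens[j]]
            j += 1
            if state.output is not None:
                best = (j, state.output)
        if best is not None:
            end, tag = best
            found.append((" ".join(tokens[i:end]), tag))
            i = end  # resume after the recognized term
        else:
            i += 1  # no term starts here; move to the next token
    return found
```

Calling `annotate("The bank reported credit card fraud via a search engine", build_transducer(TERMS))` recognizes the longer entry `credit card fraud` rather than its prefix `credit card`, which is the usual motivation for longest-match lookup with compound terminology. Whitespace tokenization and lowercasing are simplifying assumptions; a real system would rely on morphological analysis.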