Data Mining Modular Software System

ELIA, Annibale; VIETRI, Simonetta; POSTIGLIONE, Alberto; MONTELEONE, Mario; MARANO, Federica
2010

Abstract

This paper illustrates how efficient text mining can be achieved by building syntactic ontologies and combining them with morphosyntactic and terminology-based automatic text analysis software. It also offers a concrete contribution to the debate on the Semantic Web (SW), from a perspective based on the possibility of deriving ontologies from the syntactic structures in which they occur. To address these topics, we adopt the theoretical and practical analytical framework of Lexicon-Grammar. Four main themes are analyzed and discussed: 1. the formalization of semantic predicate properties, based on co-occurrence and selection restriction rules; 2. the structure and application of generic and terminological electronic dictionaries; 3. the building and application of finite-state automata and transducers; 4. the incorporation of all the preceding lingware elements into two specific text analysis software tools, used to perform text mining, morphosyntactic parsing, and terminological information retrieval.

Keywords: data mining, semantic web, lexicon-grammar

Use this identifier to cite or link to this document: http://hdl.handle.net/11386/2601031