
CATALOGA: a Software for Semantic-Based Terminological Data Mining

ELIA, Annibale; POSTIGLIONE, Alberto; MONTELEONE, Mario
2011-01-01

Abstract

This paper presents Cataloga, a software package based on the Lexicon-Grammar theoretical and practical analytical framework, which embeds a lingware module built on compressed terminological electronic dictionaries. We show how Cataloga can be used to achieve efficient data mining and information retrieval by means of a lexical ontology associated with terminology-based automatic textual analysis. We also show why accurate data compression is necessary to build efficient textual analysis software. To this end, we discuss the creation and functioning of software for semantic-based terminological data mining, in which Italian simple-word and compound-word electronic dictionaries play a crucial role. Lexicon-Grammar is one of the most productive and consistent methods for natural language formalisation and automatic textual analysis; it was established by the French linguist Maurice Gross during the 1960s, and subsequently developed for and applied to Italian by Annibale Elia, Emilio D’Agostino and Maurizio Martinelli. In essence, Lexicon-Grammar establishes morphosyntactic and statistical sets of analytic rules to read and parse large textual corpora. The analytical procedure described here is intended to be applicable to any type of digitised text, and to provide relevant support for building and implementing Semantic Web (SW) interactive platforms.
http://www.computer.org/portal/web/csdl/abs/proceedings/ccp/2011/4528/00/pccp201100toc.htm
http://www.computer.org/csdl/proceedings/ccp/2011/4528/00/4528a153-abs.html
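As a rough illustration of the kind of dictionary-driven analysis the abstract refers to, the following Python sketch looks up compound-word terms in a text and tags them with a domain label. It is not Cataloga's implementation: the dictionary entries, the domain labels, and the names `TERMS`, `tokenize`, `build_trie` and `find_terms` are all hypothetical, and a token-level trie (which stores shared prefixes only once) is used here merely as a simple stand-in for the compressed electronic dictionaries mentioned above.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mini-dictionary of Italian compound-word terms.
# Entries and domain labels are invented for illustration only.
TERMS: Dict[str, str] = {
    "carta di credito": "Economics",
    "carta di identità": "Law",
    "tasso di interesse": "Economics",
}

# Sentinel key marking the end of a term; it can never equal a text token.
_END = object()


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_trie(terms: Dict[str, str]) -> dict:
    """Store terms token by token so that shared prefixes are kept once."""
    root: dict = {}
    for term, domain in terms.items():
        node = root
        for token in tokenize(term):
            node = node.setdefault(token, {})
        node[_END] = domain
    return root


def find_terms(text: str, trie: dict) -> List[Tuple[str, str]]:
    """Return (term, domain) pairs, taking the longest match at each position."""
    tokens = tokenize(text)
    found: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        node = trie
        best = None  # (end index, domain) of the longest match starting at i
        j = i
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if _END in node:
                best = (j, node[_END])
        if best is None:
            i += 1  # no term starts here; move to the next token
        else:
            end, domain = best
            found.append((" ".join(tokens[i:end]), domain))
            i = end  # skip past the matched term
    return found


trie = build_trie(TERMS)
matches = find_terms("Il tasso di interesse sulla carta di credito è alto", trie)
```

In this sketch the two terms beginning with "carta di" share a single path in the trie, which hints at why a compact dictionary representation matters once the lexicon grows to many thousands of compound entries.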
Year: 2011
ISBN: 978-1-4577-1458-0
ISBN: 978-0-7695-4528-8

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/3036449