Automatic Text Classification Through Point of Cultural Interest Digital Identifiers

Catone, Maria Carmela; Falco, Mariacristina; Maisto, Alessandro; Pelosi, Serena; Siano, Alfonso
2020-01-01

Abstract

This work addresses the problem of automatically classifying and representing unstructured texts in the Cultural Heritage domain. The methodology relies on machine-readable dictionaries of terminological simple words and multiword expressions. We discuss the design and population of a domain ontology that interacts closely with the electronic dictionaries and a network of local grammars. A Max-Ent classifier, based on the ontology schema, is intended to assign to each analyzed text an object identifier related to the semantic dimension of the text. In this process, the unstructured texts are processed with the semantically annotated dictionaries in order to uncover the underlying structure that facilitates classification. The final purpose is the automatic attribution of POIds to texts on the basis of the semantic features extracted from the texts through NLP strategies.
2020
ISBN: 978-3-030-33508-3
ISBN: 978-3-030-33509-0
Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4747535