A hybrid method for the extraction and classification of product features from user-generated contents

Pelosi, Serena; Maisto, Alessandro; Guarasci, Raffaele; Stingo, Michele
2017

Abstract

This paper addresses the automatic management of knowledge about experience goods and their features, starting from real texts generated online by internet users. We describe an experiment conducted on a dataset of product reviews, on which we tested a set of rule-based and statistical solutions. The main goals are review classification, the extraction of relevant product features, and their systematization into product-driven ontologies. Feature extraction is performed through a rule-based strategy grounded on SentIta, an Italian collection of subjective lexical resources. Features and reviews are classified by means of a distributional semantic algorithm. Finally, we address the organization of the extracted knowledge by integrating the subjective information produced by internet users into a product-driven ontology. The NLP tool used in this work is LG-Starship, a hybrid framework for processing Italian texts based on Lexicon-Grammar theory.
Files in this record:
File: 6_HybridMethodExtractionClassificationProductFeatures.pdf (not available)
Description: PDF with title page and table of contents
Type: Publisher's version (published version with the publisher's layout)
License: NOT PUBLIC - private/restricted access
Size: 1.52 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11386/4697590