Probabilistic Approaches for Sentiment Analysis: Latent Dirichlet Allocation for Ontology Building and Sentiment Extraction

Colace, Francesco; De Santo, Massimo; Greco, Luca; Moscato, Vincenzo; Picariello, Antonio

doi:10.1007/978-3-319-30319-2_4

People’s opinions have always driven human choices and behaviors, even before the diffusion of Information and Communication Technologies. Thanks to the World Wide Web and the widespread adoption of online collaborative tools such as blogs, focus groups, review web sites, forums, and social networks, millions of messages appear on the web, which is becoming a rich source of opinionated data. Sentiment analysis refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information from documents, comments and posts. The aim of this work is to show how a probabilistic approach based on Latent Dirichlet Allocation (LDA), used as a Sentiment Grabber, can serve as an effective Sentiment Analyzer. Through this approach, a graph, the Mixed Graph of Terms, can be automatically extracted from a set of documents belonging to the same knowledge domain. This graph can be transformed into a Sentiment Oriented Terminological Ontology through a methodology that introduces an annotated lexicon such as WordNet. The chapter shows how the obtained ontology can be discriminative for sentiment classification. The proposed method has been tested in different contexts: standard datasets and comments extracted from social networks. The experimental evaluation indicates that the proposed approach is effective.