TV shows popularity prediction of genre-independent TV series through machine learning-based approaches

Maria Elena Cammarano;Alfonso Guarino;Delfina Malandrino;Rocco Zaccagnino
2024-01-01

Abstract

Social media use has grown exponentially in recent years, to the point that it now reflects human social attitudes and serves as the main channel for holding discussions and sharing opinions. For this reason, the vast amount of information it generates is often used to predict the outcomes of real-world events in fields such as business, politics, and health, as well as in the entertainment industry. In this paper, we focus on how data from Twitter can be used to predict the ratings of a large set of TV shows regardless of their genre. Given a show, the idea is to exploit features that capture its prerelease hype on Twitter to predict its rating. We propose a novel machine learning-based approach to the genre-independent TV show popularity prediction problem. We compared the performance of several well-known predictive methods and found that LSTM and Random Forest can predict ratings in the USA entertainment market with a low mean squared error of 0.058. Furthermore, we tested our model on data from "never seen" shows, obtaining interesting results in terms of error rates. Finally, we compared its performance against relevant solutions available in the literature and discuss the challenges arising from the analysis of shows in different languages.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4860803