A new predicting model of an entrepreneurial behaviour: interpretable machine learning on GEM data for multi-year and multi country analysis

Perano, Mirko; Del Regno, Claudio
2026

Abstract

Identifying who is likely to engage in entrepreneurship - both in Entrepreneurial Intention (EI) and in realized Entrepreneurial Outcome Behaviour (EOB) - is essential for designing training, incubation, and policy interventions. Using eight waves of Global Entrepreneurship Monitor (GEM) data (2014-2021) from 43 countries (n > 1.2 million), this paper develops a compact and generalisable predictive framework for EI and EOB based on interpretable tree-based machine learning: Classification and Regression Trees (CART) and Stochastic Gradient Boosting (TreeNet). The aim is to test whether a small, theory-driven subset of variables can predict entrepreneurial behaviour across countries and years. To this end, a construct-level framework embeds intention models (TPB/EEM) within the micro-foundations of management (RBV: human and social capital; dynamic capabilities: perceiving-seizing-transforming) and the institutional/economic context. The resulting five-variable EI model achieves a sensitivity of 75%, against 12% for a baseline logistic regression; the six-variable EOB model achieves a sensitivity of 82%, against 15% for the baseline. Performance remains stable across pre-pandemic and COVID-era data. Finally, the model highlights actionable levers - strengthening self-efficacy, reducing institutional frictions that slow the intention-to-action transition, and reinforcing ties with role models - that support scalable, low-cost interventions. The approach illustrates the ability of parsimonious, transparent, and self-optimising algorithms to uncover and maintain predictive structures in very large, class-imbalanced behavioural data, while remaining consistent with the theory underlying the chosen constructs and algorithms.
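The abstract reports sensitivity rather than accuracy, which matters when the outcome of interest is rare. As a minimal sketch of why, the Python snippet below computes both metrics on made-up labels. The data, the two hypothetical prediction vectors, and the function names are illustrative assumptions and are not taken from the paper or from GEM.

```python
def sensitivity(y_true, y_pred):
    """Share of actual positives that are flagged: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive cases in y_true")
    return tp / (tp + fn)


def accuracy(y_true, y_pred):
    """Share of all cases classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Made-up sample: 10 positives (1) among 100 respondents.
y_true = [1] * 10 + [0] * 90

# Hypothetical model A: always predicts the majority class.
pred_majority = [0] * 100

# Hypothetical model B: finds 8 of the 10 positives, with 20 false alarms.
pred_balanced = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 70

print(accuracy(y_true, pred_majority), sensitivity(y_true, pred_majority))
print(accuracy(y_true, pred_balanced), sensitivity(y_true, pred_balanced))
```

In this toy case, model A has the higher accuracy but identifies none of the positives, while model B trades some accuracy for much higher sensitivity. This is the general pattern that makes sensitivity a more informative metric on class-imbalanced data.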

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4945286