A new procedure for variable selection in presence of rare events

Francesco Giordano; Marcella Niglio; Marialuisa Restaino
2020-01-01

Abstract

This paper proposes a new method for selecting the most relevant covariates for predicting bank defaults. Because bank failure is a rare event, we estimate the probability of default of financial institutions using Generalized Extreme Value regression and implement a variable selection procedure suited to a binary dependent variable with far fewer ones than zeros. The proposed procedure has three advantages. First, it does not use a penalized function, so no regularization parameters need to be estimated. Second, it is very easy to implement and computationally efficient. Third, it accounts for the dependence structure and works well when the data are strongly correlated. We validate the variable selection procedure in a simulation study. We then apply it to a dataset of Italian banks and evaluate how well it identifies the covariates that most influence the failure probability. Both the simulation study and the empirical analysis show that our proposal outperforms other variable selection approaches, such as the forward stepwise method.
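
The abstract does not state the link function used in the Generalized Extreme Value regression. As a rough illustration only, the sketch below assumes the formulation commonly used in the GEV regression literature, where the default probability is the GEV distribution function evaluated at the linear predictor: pi = exp(-(1 + tau * eta)^(-1/tau)), with the Gumbel form exp(-exp(-eta)) as tau approaches 0. The function name `gev_response` and the parameter names `eta` and `tau` are hypothetical and do not come from the paper.

```python
import numpy as np

def gev_response(eta, tau):
    """Assumed GEV response curve: P(Y = 1) = exp(-(1 + tau*eta)^(-1/tau)).

    eta : linear predictor values (x' beta); tau : shape parameter.
    Illustrative sketch only, not the authors' implementation.
    """
    eta = np.atleast_1d(np.asarray(eta, dtype=float))
    if abs(tau) < 1e-8:
        # Gumbel limit of the GEV distribution function as tau -> 0
        return np.exp(-np.exp(-eta))
    base = 1.0 + tau * eta
    # Outside the support (base <= 0) the distribution function equals
    # 0 when tau > 0 and 1 when tau < 0.
    out = np.full_like(eta, 0.0 if tau > 0 else 1.0)
    inside = base > 0
    out[inside] = np.exp(-base[inside] ** (-1.0 / tau))
    return out

# Probabilities for a few linear-predictor values with tau = 0.5
probs = gev_response([-1.0, 0.0, 1.0], tau=0.5)
```

Unlike the logit link, this curve is asymmetric: at `eta = 0` it gives exp(-1), about 0.37, rather than 0.5, which is one reason such links are considered for rare binary outcomes.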
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4737596