Dependable Person Recognition by Means of Local Descriptors of Dynamic Facial Features

Castiglione A.; Grazioli G.; Nappi M.
2019-01-01

Abstract

In this work, a complementary approach that adds a dynamic component to face biometrics is proposed. The dynamic appearance and the time-dependent local features that characterize an individual's face during speech utterance are considered in both their spatial and temporal components. The aim is to capture, represent and compare facial patterns related to speech utterance, improving the dependability of biometric systems through a descriptor that is intrinsically difficult to forge. The proposed approach applies the concept of dynamic texture to person identification: dynamic facial patterns are modeled by means of the Volume Local Binary Pattern (VLBP) descriptor, which effectively combines local features and movement. To improve the efficiency of this technique, only the occurrences of the Local Binary Patterns from Three Orthogonal Planes (LBP-TOP) are considered. A deep feed-forward network has been trained and optimized on video samples from the XM2VTS database in which subjects utter a given sentence. The results of the recognition task performed on test video sequences confirm that the proposed approach achieves state-of-the-art performance in terms of accuracy and robustness of identification.
Year: 2019
ISBN: 978-981-15-1303-9
ISBN: 978-981-15-1304-6
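
As a rough illustration of the pipeline described in the abstract, the sketch below extracts a simplified LBP-TOP descriptor from a grayscale video volume and feeds it to a feed-forward classifier. It is a minimal sketch, not the authors' implementation: the LBP parameters (P = 8, R = 1), the use of a single central slice per orientation, and the scikit-learn MLPClassifier standing in for the deep network are all assumptions made here for brevity.

import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.neural_network import MLPClassifier

P, R = 8, 1          # LBP neighbours and radius (illustrative values)
N_BINS = P + 2       # the "uniform" LBP variant yields P + 2 distinct codes

def lbp_histogram(plane):
    # Normalized histogram of LBP codes over one 2-D plane.
    codes = local_binary_pattern(plane, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=N_BINS, range=(0, N_BINS))
    return hist / max(hist.sum(), 1)

def lbp_top(volume):
    # Simplified LBP-TOP: concatenate LBP histograms from the XY (appearance),
    # XT and YT (motion) planes of a (T, H, W) volume. For brevity only the
    # central slice of each orientation is used; the full descriptor pools
    # histograms over all slices.
    t, h, w = volume.shape
    xy = lbp_histogram(volume[t // 2, :, :])
    xt = lbp_histogram(volume[:, h // 2, :])
    yt = lbp_histogram(volume[:, :, w // 2])
    return np.concatenate([xy, xt, yt])

def train_identifier(volumes, labels):
    # Feed-forward classifier standing in for the deep network trained on
    # XM2VTS utterance clips; 'volumes' is a list of (T, H, W) arrays and
    # 'labels' the corresponding identity labels.
    features = np.stack([lbp_top(v) for v in volumes])
    clf = MLPClassifier(hidden_layer_sizes=(256, 128), max_iter=500)
    clf.fit(features, labels)
    return clf

At test time, an unseen utterance clip would be identified with clf.predict(lbp_top(clip)[None, :]); the hidden-layer sizes shown here are placeholders rather than the configuration reported in the paper.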

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4732562
