FASHE: A FrActal based Strategy for Head pose Estimation

Bisogni, Carmen; Nappi, Michele; Pero, Chiara
2021

Abstract

Head pose estimation (HPE) is central to many research fields and has a wide range of applications. In particular, HPE performed on a single RGB frame is particularly suitable for best-frame-selection problems. This explains the growing interest shown by a large number of contributions, most of which exploit deep learning architectures and require extensive training sessions to achieve accuracy and robustness in estimating head rotations about three axes. However, alternatives to machine learning approaches could be capable of similar, if not better, performance. In this regard, we present FASHE, an approach based on partitioned iterated function systems (PIFS) that represents self-similarities within a face image through a contractive affine function. This function transforms domain blocks, extracted only once from a single frontal reference image, into a good approximation of the range blocks into which the target image has been partitioned. Pose estimation is achieved by finding the closest match, in terms of Hamming distance, between the fractal code of the target image and a reference array. The experimental results exceed the state of the art on both the Biwi and Pointing'04 datasets and approach those of the best-performing methods on the challenging AFLW2000 database. In addition, the application to the GOTCHA Video Dataset demonstrates that FASHE operates successfully in the wild.
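The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the block sizes, the least-squares intensity fit, the use of the best-matching domain index as the "fractal code", and all function names (`blocks`, `downsample`, `fractal_code`, `hamming`, `estimate_pose`) are assumptions made for illustration only. The sketch assumes grayscale images stored as NumPy arrays, and a hypothetical reference array that pairs precomputed codes with known (pitch, yaw, roll) angles.

```python
import numpy as np

def blocks(img, size):
    """Split a 2-D image into non-overlapping size x size blocks (row-major order)."""
    h, w = img.shape
    return [img[r:r + size, c:c + size]
            for r in range(0, h - size + 1, size)
            for c in range(0, w - size + 1, size)]

def downsample(block):
    """Halve a block by averaging each 2x2 neighbourhood (spatial contraction)."""
    h, w = block.shape
    return block.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def fractal_code(target, domains, range_size=4):
    """Assumed simplification: for every range block of the target, store the
    index of the domain block whose affine intensity map s*d + o fits it best."""
    code = []
    for r in blocks(target, range_size):
        rv = r.ravel()
        best_idx, best_err = 0, np.inf
        for i, d in enumerate(domains):
            dv = d.ravel()
            var = dv.var()
            # Least-squares contrast s, clipped so the map stays contractive.
            s = 0.0 if var == 0 else np.mean((dv - dv.mean()) * (rv - rv.mean())) / var
            s = float(np.clip(s, -0.9, 0.9))
            o = rv.mean() - s * dv.mean()  # brightness offset
            err = float(np.sum((s * dv + o - rv) ** 2))
            if err < best_err:
                best_idx, best_err = i, err
        code.append(best_idx)
    return np.array(code)

def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return int(np.count_nonzero(np.asarray(a) != np.asarray(b)))

def estimate_pose(code, reference_codes, reference_poses):
    """Return the pose attached to the reference code closest in Hamming distance."""
    distances = [hamming(code, ref) for ref in reference_codes]
    return reference_poses[int(np.argmin(distances))]
```

In this sketch the domain blocks would be computed once from the frontal reference image, for example `domains = [downsample(b) for b in blocks(reference, 8)]`, and reused for every target image, which mirrors the "extracted only once" property described in the abstract.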
Files in this record:
File: TIP_fractal_pose_estimation_final.pdf
Access: open access
Type: Post-print (version following peer review and accepted for publication)
License: Publisher's copyright
Size: 5.11 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4758945
Citations
  • PMC 3
  • Scopus 25
  • ISI 22