An efficient and effective method for people detection from top-view depth cameras

Vincenzo Carletti; Luca Del Pizzo; Gennaro Percannella; Mario Vento
2017

Abstract

Detecting people in video is an enabling technology for many computer vision applications, in security and safety as well as in business intelligence. A depth sensor mounted in a top-view position is often adopted to achieve high detection accuracy, since it copes effectively with occlusions and difficult lighting conditions. In this paper, we propose a new method for people detection from depth maps produced by sensors mounted in a zenithal position. The method is designed to provide an optimal trade-off between detection accuracy and computational complexity. The proposed approach adopts a dynamic background modeling strategy to find the objects of interest in the scene; a lightweight algorithm then filters out the noise from the foreground image and determines the positions of the persons in the scene. An experimental analysis carried out on a large public dataset demonstrates that the method is fast and accurate. The method has been compared with two approaches from the literature for people detection from a depth camera mounted in a zenithal position: an unsupervised method that is fast but not highly accurate, and a supervised one that is very accurate but less computationally efficient. The proposed method achieves accuracy comparable to that of the supervised approach while using very few computational resources, reducing processing times by an order of magnitude.
ISBN: 978-1-5386-2939-0
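
The abstract names two stages, dynamic background modeling followed by a lightweight noise filter that locates people, but gives no implementation details. The sketch below only illustrates that general kind of pipeline. Every choice in it is an assumption and not the authors' algorithm: the running-average background update, the `min_height` and `min_area` thresholds, the connected-component grouping, and all function names are hypothetical.

```python
from collections import deque


def foreground_mask(background, depth, min_height=300):
    """Flag pixels at least min_height depth units closer to the sensor
    than the background model (assumed threshold, e.g. millimetres)."""
    return [[(b - d) >= min_height for b, d in zip(brow, drow)]
            for brow, drow in zip(background, depth)]


def update_background(background, depth, mask, alpha=0.05):
    """Running-average update (an assumed strategy); foreground pixels
    are left unchanged so people do not blend into the background."""
    return [[b if fg else (1 - alpha) * b + alpha * d
             for b, d, fg in zip(brow, drow, frow)]
            for brow, drow, frow in zip(background, depth, mask)]


def detect_blobs(mask, min_area=4):
    """Group foreground pixels into 4-connected blobs, discard blobs
    smaller than min_area as noise, return (row, col) centroids."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one blob
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:
                cy = sum(p[0] for p in pixels) / len(pixels)
                cx = sum(p[1] for p in pixels) / len(pixels)
                centroids.append((cy, cx))
    return centroids


# Toy 6x6 depth map: flat floor at 3000, one 2x2 "person", one noisy pixel.
background = [[3000.0] * 6 for _ in range(6)]
frame = [row[:] for row in background]
for y in (1, 2):
    for x in (1, 2):
        frame[y][x] = 1300.0
frame[4][4] = 1000.0  # isolated spike, expected to be filtered out

mask = foreground_mask(background, frame)
people = detect_blobs(mask)
background = update_background(background, frame, mask)
print(people)  # [(1.5, 1.5)]
```

In this toy frame the single-pixel spike is discarded by the area filter and only the 2x2 region is reported. A real top-view detector would need thresholds tuned to the sensor height and resolution.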

Use this identifier to cite or link to this document: http://hdl.handle.net/11386/4702278