Audio surveillance of roads: a system for detecting anomalous sounds

Foggia, Pasquale; Saggese, Alessia; Strisciuglio, Nicola; Vento, Mario
2016

Abstract

In recent decades, several video-analysis systems have been proposed for automatically detecting road accidents so that emergency teams can intervene quickly. In some situations, however, visual information is insufficient or unreliable, and microphones combined with audio event detectors can significantly improve the overall reliability of surveillance systems. In this paper, we propose a novel method for detecting road accidents by analyzing audio streams to identify hazardous situations such as tire skidding and car crashes. Our method is based on a two-layer representation of the audio stream: at the low level, the system extracts a set of features that capture the discriminant properties of the events of interest; at the high level, a bag-of-words representation is used to detect both short and sustained events. We discuss the deployment architecture for using the system in real environments and report an experimental analysis carried out on a data set made publicly available for benchmarking purposes. The results confirm the effectiveness of the proposed approach.
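
The two-layer idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the low-level features used here (short-time energy and zero-crossing rate), the fixed two-word codebook, and the function names are all assumptions chosen only to keep the example small. The low level turns each frame into a feature vector; the high level maps each vector to its nearest codeword and summarizes a time window as a normalized histogram of codewords.

```python
import math
from typing import List, Sequence, Tuple

Feature = Tuple[float, float]  # (energy, zero-crossing rate); an illustrative choice


def frame_features(signal: Sequence[float], frame_len: int) -> List[Feature]:
    """Low level: one feature pair per non-overlapping frame (frame_len >= 2)."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Count sign changes between consecutive samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def nearest_word(feat: Feature, codebook: Sequence[Feature]) -> int:
    """Index of the codeword closest to feat in Euclidean distance."""
    return min(range(len(codebook)), key=lambda k: math.dist(feat, codebook[k]))


def bag_of_words(feats: Sequence[Feature], codebook: Sequence[Feature]) -> List[float]:
    """High level: normalized histogram of codeword occurrences in a window."""
    counts = [0] * len(codebook)
    for f in feats:
        counts[nearest_word(f, codebook)] += 1
    total = sum(counts) or 1  # avoid division by zero on an empty window
    return [c / total for c in counts]


# Hypothetical codebook: word 0 is "quiet", word 1 is "loud and oscillating".
codebook = [(0.0, 0.0), (1.0, 1.0)]
quiet = [0.0, 0.0, 0.0, 0.0]
loud = [1.0, -1.0, 1.0, -1.0]
signal = quiet + quiet + quiet + loud

histogram = bag_of_words(frame_features(signal, 4), codebook)
print(histogram)
```

In a complete system the codebook would typically be learned from training data and the histogram passed to a classifier; both steps are omitted here. Because the histogram discards frame order within a window, a brief event and a sustained one differ mainly in how much of the window's mass falls on the corresponding codewords.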

Files in this record:
itsbow.pdf (open access)
Type: Post-print (version following peer review and accepted for publication)
License: DRM not defined
Size: 4.27 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11386/4651577

Citations
  • PubMed Central: not available
  • Scopus: 120
  • Web of Science (ISI): 92