Transformer-based mass detection in digital mammograms

Tortorella, Francesco;
2023

Abstract

Over the last decade, Convolutional Neural Networks (CNNs) have been the de facto approach for automated detection in medical images. Recently, Vision Transformers have emerged in computer vision as an alternative to CNNs. In particular, the Shifted Window (Swin) Transformer is a general-purpose backbone that learns attention-based hierarchical features and achieves state-of-the-art performance in a variety of vision tasks. In this work, for the first time, we design and evaluate transformer-based models for mass detection in digital mammograms, using the Swin Transformer as a backbone multiscale feature extractor. Experiments on OMI-DB, the largest publicly available mammography image database, yield a True Positive Rate (TPR) of 75.7% at 0.1 False Positives per Image (FPpI) for the best transformer model, a 2.5% TPR improvement over its convolutional counterpart and a 7.4% TPR improvement over the previous state of the art. We also combine transformer- and convolution-based detectors with weighted box fusion, which yields a further 2.4% TPR improvement and reaches 78.1% TPR at 0.1 FPpI.
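The headline metric, TPR at a fixed FPpI, can be illustrated with a short sketch. This is not the authors' evaluation code: the function name `tpr_at_fppi`, the input layout, and the assumption that each detection has already been labelled as a true or false positive (with each true positive matching a distinct mass) are all hypothetical choices made for illustration, and the matching criterion itself is not given in the abstract.

```python
from typing import List, Tuple


def tpr_at_fppi(
    detections: List[Tuple[float, bool]],
    num_lesions: int,
    num_images: int,
    target_fppi: float = 0.1,
) -> float:
    """Return the highest TPR reachable without exceeding target_fppi.

    detections: (confidence, is_true_positive) pairs pooled over all images.
    Assumption: every true-positive detection matches a distinct lesion,
    so counting true positives equals counting detected lesions.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    best_tpr = 0.0
    i = 0
    while i < len(ranked):
        # Lowering the threshold to this score admits every detection
        # sharing it, so tied scores are processed together.
        score = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / num_images > target_fppi:
            break  # this threshold produces too many false positives
        best_tpr = tp / num_lesions
    return best_tpr


# Toy data (invented): 10 images, 4 masses, 6 scored detections.
toy_detections = [
    (0.9, True), (0.8, True), (0.7, False),
    (0.6, True), (0.5, False), (0.4, True),
]
print(tpr_at_fppi(toy_detections, num_lesions=4, num_images=10))  # 0.75
```

In the toy data, one false positive over 10 images is exactly 0.1 FPpI, so the threshold can drop to 0.6 and three of the four masses are detected.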

Files in this item:
File: Transformer_based_mass_detection_in_digital_mammograms.pdf
Access: open access
Type: Post-print (version following peer review and accepted for publication)
License: Publisher's copyright
Size: 7.75 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4817592
Warning: the data displayed have not been validated by the university.

Citations
  • Scopus 12