CDT-CAD: Context-Aware Deformable Transformers for End-to-End Chest Abnormality Detection on X-Ray Images
Castiglione, Aniello; Nappi, Michele
2024-01-01
Abstract
Deep learning methods have achieved great success in the medical image analysis domain. However, most of them suffer from slow convergence and high computational cost, which hinders their wider adoption in practical scenarios. Moreover, it has been shown that exploring and embedding context knowledge in deep networks can significantly improve accuracy. Motivated by these observations, we present CDT-CAD, i.e., context-aware deformable transformers for end-to-end chest abnormality detection on X-ray images. CDT-CAD first constructs an iterative context-aware feature extractor, which not only enlarges receptive fields to encode multi-scale context information via dilated context encoding blocks, but also captures unique and scalable feature variation patterns in the wavelet frequency domain via frequency pooling blocks. Afterwards, a deformable transformer detector is built on the extracted context features to accurately classify disease categories and locate abnormal regions; only a small set of key points is sampled, leading the detector to focus on an informative feature subspace and accelerating convergence. In comparative experiments on the VinBig Chest and Chest Det-10 datasets, CDT-CAD demonstrates its effectiveness in recognizing chest abnormalities, outperforming existing methods by 1.4% in AP50 and 6.0% in AR on the VinBig dataset, and by 0.9% and 2.1%, respectively, on the Chest Det-10 dataset.
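To illustrate the sparse key-point sampling described in the abstract, below is a minimal, hedged sketch of deformable-attention-style sampling in PyTorch: each query gathers features from only a few sampled locations around a reference point instead of attending over the whole feature map, which is what lets the detector focus on an informative feature subspace and converge faster. The module name, tensor shapes, offset scaling, and number of sampling points are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch of deformable-style sparse sampling (illustrative, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseDeformableSampling(nn.Module):
    """Each query attends to K sampled key points instead of the full feature map."""
    def __init__(self, dim: int, num_points: int = 4):
        super().__init__()
        self.num_points = num_points
        # Each query predicts K (x, y) offsets and K attention weights.
        self.offsets = nn.Linear(dim, num_points * 2)
        self.weights = nn.Linear(dim, num_points)

    def forward(self, queries, ref_points, feature_map):
        """
        queries:     (B, Q, C)  query embeddings
        ref_points:  (B, Q, 2)  reference locations in [0, 1], (x, y) order
        feature_map: (B, C, H, W)
        """
        B, Q, C = queries.shape
        K = self.num_points
        # Small predicted offsets are added to the reference points.
        offsets = self.offsets(queries).view(B, Q, K, 2).tanh() * 0.1
        locs = (ref_points.unsqueeze(2) + offsets).clamp(0, 1)            # (B, Q, K, 2)
        grid = locs * 2.0 - 1.0                                           # map to [-1, 1] for grid_sample
        sampled = F.grid_sample(feature_map, grid, align_corners=False)   # (B, C, Q, K)
        attn = self.weights(queries).softmax(dim=-1)                      # (B, Q, K)
        # Weighted sum over the K sampled key points per query.
        return torch.einsum('bcqk,bqk->bqc', sampled, attn)

if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 32)       # dummy backbone feature map
    q = torch.randn(2, 100, 64)              # 100 object queries
    refs = torch.rand(2, 100, 2)             # normalized reference points
    out = SparseDeformableSampling(64)(q, refs, feats)
    print(out.shape)                          # torch.Size([2, 100, 64])
```

In a full deformable transformer detector, such sampled features would feed multi-scale deformable attention layers; the sketch only shows the single-scale sampling step that keeps the per-query cost proportional to K rather than to the feature-map size.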