StaSiS-Net: A stacked and siamese disparity estimation network for depth reconstruction in modern 3D laparoscopy
Bardozzo F.; Tagliaferri R.
2022-01-01
Abstract
Developing accurate, real-time algorithms for non-invasive three-dimensional representation and reconstruction of internal patient structures is one of the main research fields in computer-assisted surgery and endoscopy. Mono and stereo endoscopic images of soft tissues are converted into a three-dimensional representation by estimating depth maps. However, automatic, detailed, accurate and robust depth map estimation is a challenging problem that, in the stereo setting, strictly depends on a robust estimate of the disparity map. Traditional algorithms are often inefficient or inaccurate. In this work, novel self-supervised stacked and Siamese encoder/decoder neural networks are proposed to compute accurate disparity maps for 3D laparoscopy depth estimation. These networks run in real time on standard GPU-equipped desktop computers, and their outputs may be used for depth map estimation given a known camera calibration. We compare performance on three different public datasets and on a new, challenging simulated dataset; our solutions outperform state-of-the-art mono and stereo depth estimation methods. Extensive robustness and sensitivity analyses on more than 30,000 frames have been performed. This work leads to important improvements in mono and stereo real-time depth map estimation of soft tissues and organs, with a very low average mean absolute disparity reconstruction error with respect to ground truth.
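
The abstract notes that disparity maps can be turned into depth maps once the camera calibration is known, and that accuracy is reported as a mean absolute disparity error. The sketch below illustrates both ideas using the standard rectified-stereo relation Z = f * B / d. It is not the authors' implementation: the function names, the focal length and baseline values, and the convention of marking invalid pixels with None are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def disparity_to_depth(
    disparity: Sequence[Sequence[float]],
    focal_length_px: float,
    baseline_mm: float,
    min_disparity: float = 1e-6,
) -> List[List[Optional[float]]]:
    """Convert a disparity map (pixels) to a depth map (mm).

    Uses Z = f * B / d, which holds for a rectified stereo pair.
    Pixels with disparity at or below min_disparity have no finite
    depth and are returned as None (an illustrative convention).
    """
    depth: List[List[Optional[float]]] = []
    for row in disparity:
        depth.append([
            focal_length_px * baseline_mm / d if d > min_disparity else None
            for d in row
        ])
    return depth


def mean_absolute_error(
    predicted: Sequence[Sequence[float]],
    ground_truth: Sequence[Sequence[float]],
) -> float:
    """Mean absolute per-pixel difference between two disparity maps."""
    total = 0.0
    count = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        for pred, true in zip(pred_row, true_row):
            total += abs(pred - true)
            count += 1
    if count == 0:
        raise ValueError("disparity maps must contain at least one pixel")
    return total / count


if __name__ == "__main__":
    # Hypothetical calibration values, chosen only to make the arithmetic easy.
    example_disparity = [[10.0, 20.0], [0.0, 5.0]]
    print(disparity_to_depth(example_disparity, focal_length_px=500.0, baseline_mm=4.0))
    print(mean_absolute_error([[1.0, 2.0]], [[1.5, 2.5]]))
```

Because depth is inversely proportional to disparity, a fixed disparity error produces a larger depth error at small disparities (distant tissue) than at large ones, which is one reason a low disparity error matters for the reconstruction.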