Skip to main content
Top
Published in: International Journal of Computer Assisted Radiology and Surgery 6/2019

01-06-2019 | Original Article

Face detection in the operating room: comparison of state-of-the-art methods and a self-supervised approach

Authors: Thibaut Issenhuth, Vinkle Srivastav, Afshin Gangi, Nicolas Padoy

Published in: International Journal of Computer Assisted Radiology and Surgery | Issue 6/2019

Login to get access

Abstract

Purpose

Face detection is a needed component for the automatic analysis and assistance of human activities during surgical procedures. Efficient face detection algorithms can indeed help to detect and identify the persons present in the room and also be used to automatically anonymize the data. However, current algorithms trained on natural images do not generalize well to the operating room (OR) images. In this work, we provide a comparison of state-of-the-art face detectors on OR data and also present an approach to train a face detector for the OR by exploiting non-annotated OR images.

Methods

We propose a comparison of six state-of-the-art face detectors on clinical data using multi-view OR faces, a dataset of OR images capturing real surgical activities. We then propose to use self-supervision, a domain adaptation method, for the task of face detection in the OR. The approach makes use of non-annotated images to fine-tune a state-of-the-art detector for the OR without using any human supervision.

Results

The results show that the best model, namely the tiny face detector, yields an average precision of 0.556 at intersection over union of 0.5. Our self-supervised model using non-annotated clinical data outperforms this result by 9.2%.

Conclusion

We present the first comparison of state-of-the-art face detectors on OR images and show that results can be significantly improved by using self-supervision on non-annotated data.
Literature
1.
go back to reference Chen K, Gabriel P, Alasfour A, Gong C, Doyle WK, Devinsky O, Friedman D, Dugan P, Melloni L, Thesen T, Gonda D, Sattar S, Wang S, Gilja V (2018) Patient-specific pose estimation in clinical environments. IEEE J Transl Eng Health Med 6:1–11 Chen K, Gabriel P, Alasfour A, Gong C, Doyle WK, Devinsky O, Friedman D, Dugan P, Melloni L, Thesen T, Gonda D, Sattar S, Wang S, Gilja V (2018) Patient-specific pose estimation in clinical environments. IEEE J Transl Eng Health Med 6:1–11
2.
go back to reference Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: CVPR, pp I–I Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: CVPR, pp I–I
3.
go back to reference Najibi M, Samangouei P, Chellappa R, Davis LS (2017) SSH: single stage headless face detector. In: ICCV, pp 4885–4894 Najibi M, Samangouei P, Chellappa R, Davis LS (2017) SSH: single stage headless face detector. In: ICCV, pp 4885–4894
4.
go back to reference Zhang S, Zhu X, Lei Z, Shi H, Wang X, Li SZ (2017) S\(^3\)FD: single shot scale-invariant face detector. In: International conference on computer vision (ICCV) at Venice, Italy Zhang S, Zhu X, Lei Z, Shi H, Wang X, Li SZ (2017) S\(^3\)FD: single shot scale-invariant face detector. In: International conference on computer vision (ICCV) at Venice, Italy
5.
go back to reference Jiang H, Learned-Miller E (2017) Face detection with the faster R-CNN. In: 12th IEEE international conference on automatic face & gesture recognition (FG 2017) Jiang H, Learned-Miller E (2017) Face detection with the faster R-CNN. In: 12th IEEE international conference on automatic face & gesture recognition (FG 2017)
6.
go back to reference Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS, pp 91–99 Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS, pp 91–99
7.
go back to reference Yang S, Luo P, Loy CC, Tang X (2016) Wider face: a face detection benchmark. In: CVPR Yang S, Luo P, Loy CC, Tang X (2016) Wider face: a face detection benchmark. In: CVPR
8.
go back to reference Cao Z, Simon T, Wei S-E, Sheikh Y (2017) Realtime multi-person 2D pose estimation using part affinity fields. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 1302–1310 Cao Z, Simon T, Wei S-E, Sheikh Y (2017) Realtime multi-person 2D pose estimation using part affinity fields. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 1302–1310
9.
go back to reference Insafutdinov E, Pishchulin L, Andres B, Andriluka M, Schiele B (2016) Deepercut: a deeper, stronger, and faster multi-person pose estimation model. In: ECCV, pp 34–50 Insafutdinov E, Pishchulin L, Andres B, Andriluka M, Schiele B (2016) Deepercut: a deeper, stronger, and faster multi-person pose estimation model. In: ECCV, pp 34–50
10.
go back to reference Fang H-S, Xie S, Tai Y-W, Lu C (2017) RMPE: regional multi-person pose estimation. In: ICCV Fang H-S, Xie S, Tai Y-W, Lu C (2017) RMPE: regional multi-person pose estimation. In: ICCV
11.
go back to reference Xiao B, Wu H, Wei Y (2018) Simple baselines for human pose estimation and tracking. In: ECCV Xiao B, Wu H, Wei Y (2018) Simple baselines for human pose estimation and tracking. In: ECCV
12.
go back to reference Chen Y, Wang Z, Peng Y, Zhang Z, Yu G, Sun J (2018) Cascaded pyramid network for multi-person pose estimation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 7103–7112 Chen Y, Wang Z, Peng Y, Zhang Z, Yu G, Sun J (2018) Cascaded pyramid network for multi-person pose estimation. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 7103–7112
13.
go back to reference Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft COCO: common objects in context. In: ECCV, pp 740–755 Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft COCO: common objects in context. In: ECCV, pp 740–755
14.
go back to reference Andriluka M, Pishchulin L, Gehler P, Schiele B (2014) 2D human pose estimation: new benchmark and state of the art analysis. In: CVPR, June 2014 Andriluka M, Pishchulin L, Gehler P, Schiele B (2014) 2D human pose estimation: new benchmark and state of the art analysis. In: CVPR, June 2014
15.
go back to reference Twinanda AP, Shehata S, Mutter D, Marescaux J, de Mathelin M, Padoy N (2016) Multi-stream deep architecture for surgical phase recognition on multi-view RGBD videos. In: MICCAI workshop on modeling and monitoring of computer assisted interventions (M2CAI) Twinanda AP, Shehata S, Mutter D, Marescaux J, de Mathelin M, Padoy N (2016) Multi-stream deep architecture for surgical phase recognition on multi-view RGBD videos. In: MICCAI workshop on modeling and monitoring of computer assisted interventions (M2CAI)
16.
go back to reference Maier-Hein L, Vedula SS, Speidel S, Navab N, Kikinis R, Park A, Eisenmann M, Feussner H, Forestier G, Giannarou S, Hashizume M, Katic D, Kenngott H, Kranzfelder M, Malpani A, März K, Neumuth T, Padoy N, Pugh C, Schoch N, Stoyanov D, Taylor R, Wagner M, Hager GD, Jannin P (2017) Surgical data science for next-generation interventions. Nat Biomed Eng 1(9):691CrossRefPubMed Maier-Hein L, Vedula SS, Speidel S, Navab N, Kikinis R, Park A, Eisenmann M, Feussner H, Forestier G, Giannarou S, Hashizume M, Katic D, Kenngott H, Kranzfelder M, Malpani A, März K, Neumuth T, Padoy N, Pugh C, Schoch N, Stoyanov D, Taylor R, Wagner M, Hager GD, Jannin P (2017) Surgical data science for next-generation interventions. Nat Biomed Eng 1(9):691CrossRefPubMed
17.
go back to reference Yeung S, Downing NL, Fei-Fei L, Milstein A (2018) Bedside computer vision-moving artificial intelligence from driver assistance to patient safety. NEJM 378(14):1271CrossRefPubMed Yeung S, Downing NL, Fei-Fei L, Milstein A (2018) Bedside computer vision-moving artificial intelligence from driver assistance to patient safety. NEJM 378(14):1271CrossRefPubMed
18.
go back to reference Kadkhodamohammadi A, Gangi A, de Mathelin M, Padoy N (2017) Articulated clinician detection using 3D pictorial structures on RGB-D data. Med Image Anal 35:215–224CrossRefPubMed Kadkhodamohammadi A, Gangi A, de Mathelin M, Padoy N (2017) Articulated clinician detection using 3D pictorial structures on RGB-D data. Med Image Anal 35:215–224CrossRefPubMed
19.
go back to reference Kadkhodamohammadi A, Gangi A, de Mathelin M, Padoy N (2017) A multi-view RGB-D approach for human pose estimation in operating rooms. In: WACV, pp 363–372 Kadkhodamohammadi A, Gangi A, de Mathelin M, Padoy N (2017) A multi-view RGB-D approach for human pose estimation in operating rooms. In: WACV, pp 363–372
20.
go back to reference Belagiannis V, Wang X, Shitrit HB, Hashimoto K, Stauder R, Aoki Y, Kranzfelder M, Schneider A, Fua P, Ilic S, Feussner H, Navab N (2016) Parsing human skeletons in an operating room. Mach Vis Appl 27(7):1035–1046CrossRef Belagiannis V, Wang X, Shitrit HB, Hashimoto K, Stauder R, Aoki Y, Kranzfelder M, Schneider A, Fua P, Ilic S, Feussner H, Navab N (2016) Parsing human skeletons in an operating room. Mach Vis Appl 27(7):1035–1046CrossRef
21.
go back to reference Nieto-Rodríguez A, Mucientes M, Brea VM (2015) System for medical mask detection in the operating room through facial attributes. In: Iberian conference on pattern recognition and image analysis. Springer, pp 138–145 Nieto-Rodríguez A, Mucientes M, Brea VM (2015) System for medical mask detection in the operating room through facial attributes. In: Iberian conference on pattern recognition and image analysis. Springer, pp 138–145
22.
go back to reference Flouty E, Zisimopoulos O, Stoyanov D (2018) Faceoff: anonymizing videos in the operating rooms. In: OR 2.0 context-aware operating theaters, computer assisted robotic endoscopy, clinical image-based procedures, and skin image analysis. Springer, pp 30–38 Flouty E, Zisimopoulos O, Stoyanov D (2018) Faceoff: anonymizing videos in the operating rooms. In: OR 2.0 context-aware operating theaters, computer assisted robotic endoscopy, clinical image-based procedures, and skin image analysis. Springer, pp 30–38
23.
go back to reference Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). Ann Stat 28(2):337–407CrossRef Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). Ann Stat 28(2):337–407CrossRef
24.
go back to reference Srivastav V, Issenhuth T, Kadkhodamohammadi A, de Mathelin M, Gangi A, Padoy N (2018) MVOR: a multi-view rgb-d operating room dataset for 2D and 3D human pose estimation. In: MICCAI-LABELS-2018 Srivastav V, Issenhuth T, Kadkhodamohammadi A, de Mathelin M, Gangi A, Padoy N (2018) MVOR: a multi-view rgb-d operating room dataset for 2D and 3D human pose estimation. In: MICCAI-LABELS-2018
26.
go back to reference Radosavovic I, Dollár P, Girshick RB, Gkioxari G, He K (2018) Data distillation: towards omni-supervised learning. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 4119–4128 Radosavovic I, Dollár P, Girshick RB, Gkioxari G, He K (2018) Data distillation: towards omni-supervised learning. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 4119–4128
27.
go back to reference Hu P, Ramanan D (2017) Finding tiny faces. In: CVPR Hu P, Ramanan D (2017) Finding tiny faces. In: CVPR
28.
go back to reference Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC (2016) SSD: single shot multibox detector. In: ECCV. Springer, pp 21–37 Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC (2016) SSD: single shot multibox detector. In: ECCV. Springer, pp 21–37
Metadata
Title
Face detection in the operating room: comparison of state-of-the-art methods and a self-supervised approach
Authors
Thibaut Issenhuth
Vinkle Srivastav
Afshin Gangi
Nicolas Padoy
Publication date
01-06-2019
Publisher
Springer International Publishing
Published in
International Journal of Computer Assisted Radiology and Surgery / Issue 6/2019
Print ISSN: 1861-6410
Electronic ISSN: 1861-6429
DOI
https://doi.org/10.1007/s11548-019-01944-y

Other articles of this Issue 6/2019

International Journal of Computer Assisted Radiology and Surgery 6/2019 Go to the issue