Abstract
Personal authentication systems based on biometrics are in strong demand, mainly because of growing concern about privacy and security in many applications. Although the choice of biometric trait is problem dependent, the human ear has been found to have enough discriminating characteristics to serve as a strong biometric measure. Locating an ear in a face image is a difficult task. Numerous existing approaches achieve significant performance, but the majority of studies assume a constrained environment. Ear biometrics is considerably harder in the unconstrained environment, where pose, scale, occlusion, illumination, background clutter, etc., vary to a great extent. To address the problem of ear detection in the wild, we propose two high-performance ear detection models, CED-Net-1 and CED-Net-2, which are based on deep convolutional neural networks and primarily use contextual information to detect ears in the unconstrained environment. For comparison, we have also implemented two state-of-the-art deep learning models, viz. FRCNN (faster region convolutional neural network) and SSD (single shot multibox detector), for the ear detection task. To test the models' generalization, they are evaluated on six benchmark datasets, viz. IITD, IITK, USTB-DB3, UND-E, UND-J2 and UBEAR, each of which contains different challenging images. The models are compared using IOU (intersection over union), accuracy, precision, recall and F1-score. The proposed models CED-Net-1 and CED-Net-2 outperform FRCNN and SSD at higher values of IOU. An accuracy of 99% is achieved at IOU 0.5 on the majority of the databases. This performance indicates that the models are effective and resilient to environmental conditions.
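The evaluation measures named above can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes one ground-truth ear box per image, at most one predicted box per image, boxes given as (x_min, y_min, x_max, y_max), and accuracy taken as the fraction of images whose prediction reaches the IOU threshold. The function names `iou` and `detection_scores` and these counting conventions are hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_scores(
    preds: List[Optional[Box]], truths: List[Box], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 at a given IOU threshold.

    Assumed convention (hypothetical, not taken from the paper):
      - prediction with IOU >= threshold  -> true positive
      - prediction with IOU <  threshold  -> false positive
      - no prediction (None)              -> false negative
    """
    tp = fp = fn = 0
    for pred, truth in zip(preds, truths):
        if pred is None:
            fn += 1
        elif iou(pred, truth) >= threshold:
            tp += 1
        else:
            fp += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    accuracy = tp / len(truths) if truths else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    truths = [(0.0, 0.0, 2.0, 2.0)] * 4
    preds = [(0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0), None, (0.0, 0.0, 2.0, 2.0)]
    print(detection_scores(preds, truths, threshold=0.5))
```

Raising `threshold` demands tighter localization, which is why a comparison "at higher values of IOU" is a stricter test of a detector than the commonly used 0.5 setting.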
Ethics declarations
Conflicts of Interest:
The authors declare no conflict of interest.
About this article
Cite this article
Kamboj, A., Rani, R., Nigam, A. et al. CED-Net: context-aware ear detection network for unconstrained images. Pattern Anal Applic 24, 779–800 (2021). https://doi.org/10.1007/s10044-020-00914-4
DOI: https://doi.org/10.1007/s10044-020-00914-4