Abstract
There is an increasing demand for automatic detection of anatomical landmarks, which provide rich structural information to facilitate subsequent medical image analysis. Current methods for this task often leverage deep neural networks, but a major challenge in fine-tuning such models for medical applications is the insufficient number of labeled samples. To address this, we propose to regularize the knowledge transfer across source and target tasks through cross-task representation learning. The proposed method is demonstrated for extracting facial anatomical landmarks, which facilitate the diagnosis of fetal alcohol syndrome. The source and target tasks in this work are face recognition and landmark detection, respectively. The main idea of the proposed method is to retain the feature representations of the source model on the target task data and to leverage them as an additional source of supervisory signals for regularizing the target model learning, thereby improving its performance when training samples are limited. Concretely, we present two approaches for the proposed representation learning, constraining either the final or the intermediate features of the target model. Experimental results on a clinical face image dataset demonstrate that the proposed approach works well with few labeled data and outperforms the other compared approaches.
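The regularization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the loss functions, so the squared-error terms, the `weight` parameter, and the function names `mse` and `cross_task_loss` are all assumptions made for illustration. The sketch adds a feature-matching penalty, computed against the frozen source model's features on the same target-task image, to the landmark loss.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cross_task_loss(
    pred_landmarks: Sequence[float],
    true_landmarks: Sequence[float],
    target_features: Sequence[float],
    source_features: Sequence[float],
    weight: float = 0.1,  # hypothetical trade-off value, not from the paper
) -> float:
    """Landmark loss plus a penalty tying target features to source features.

    source_features are assumed to come from the frozen source model
    (face recognition) evaluated on the same target-task image, so they
    act as fixed supervisory targets for the target model's features.
    """
    task_term = mse(pred_landmarks, true_landmarks)    # supervised landmark term
    reg_term = mse(target_features, source_features)   # representation constraint
    return task_term + weight * reg_term


# Toy values: flattened landmark coordinates and a 2-D feature vector.
loss = cross_task_loss(
    pred_landmarks=[0.5, 0.5],
    true_landmarks=[0.5, 0.7],
    target_features=[1.0, 2.0],
    source_features=[1.0, 1.0],
    weight=0.1,
)
print(round(loss, 4))
```

Under these assumptions, the same function covers both variants mentioned in the abstract: the feature vectors passed in may be taken from the final layer or from an intermediate layer of the two models.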
Acknowledgements
This work was done in conjunction with the Collaborative Initiative on Fetal Alcohol Spectrum Disorders (CIFASD), which is funded by grants from the National Institute on Alcohol Abuse and Alcoholism (NIAAA). This work was supported by NIH grant U01AA014809 and EPSRC grant EP/M013774/1.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Fu, Z., Jiao, J., Suttie, M., Noble, J.A. (2020). Cross-Task Representation Learning for Anatomical Landmark Detection. In: Liu, M., Yan, P., Lian, C., Cao, X. (eds) Machine Learning in Medical Imaging. MLMI 2020. Lecture Notes in Computer Science, vol 12436. Springer, Cham. https://doi.org/10.1007/978-3-030-59861-7_59
DOI: https://doi.org/10.1007/978-3-030-59861-7_59
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59860-0
Online ISBN: 978-3-030-59861-7
eBook Packages: Computer Science, Computer Science (R0)