Abstract
Most image-set-based face recognition methods use a generatively learned model for each person, learned independently of the other persons in the gallery set. In contrast, this paper introduces a novel method that searches for discriminative convex models that best fit an individual’s face images while lying as far as possible from the images of other persons in the gallery. We learn discriminative convex models for both affine and convex hulls of image sets. During testing, distances from the query set images to these models are computed efficiently using simple matrix multiplications, and the query set is assigned to the person in the gallery whose image set is closest to the query images. The proposed method significantly outperforms other methods that use generatively learned convex models in terms of both accuracy and testing time, and it achieves state-of-the-art results on six of the eight tested datasets. The accuracy improvement is especially significant on the challenging PaSC, COX, IJB-C and ESOGU video datasets.
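The test-time step described above (distances from query images to per-person models via matrix multiplications, followed by nearest-model assignment) can be illustrated with a minimal sketch. The abstract does not specify how the discriminative models are learned, so this sketch assumes a plain, generatively fitted affine hull for each person; it shows the distance computation and assignment only, not the paper's discriminative training. The function names, the rank tolerance, and the use of the mean distance to aggregate over query images are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_affine_hull(X, tol=1e-10):
    """Fit an affine hull to one person's images (assumed generative fit).

    X has shape (d, n): each column is one feature vector.
    Returns the set mean mu (d, 1) and an orthonormal basis U (d, r)
    spanning the directions of the centered samples.
    """
    mu = X.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(X - mu, full_matrices=False)
    # Keep only directions with non-negligible singular values.
    keep = s > tol * max(s.max(), 1.0)
    return mu, U[:, keep]


def distances_to_hull(Q, mu, U):
    """Distance from each column of Q (d, m) to the hull {mu + U a}."""
    R = Q - mu
    # Remove the component lying inside the hull: two matrix products.
    R = R - U @ (U.T @ R)
    return np.linalg.norm(R, axis=0)


def classify(Q, models):
    """Assign the query set to the gallery person with the closest model.

    models maps a person label to a (mu, U) pair. Aggregating the
    per-image distances with the mean is an assumption of this sketch.
    """
    scores = {
        label: float(distances_to_hull(Q, mu, U).mean())
        for label, (mu, U) in models.items()
    }
    return min(scores, key=scores.get), scores
```

With a gallery of such models, `classify(Q, models)` returns the predicted label together with the per-person scores. Because each model reduces to a stored mean and basis, the cost per query image is a pair of matrix-vector products, which is the kind of efficiency the abstract refers to.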
Notes
Source code is available online at http://mlcv.ogu.edu.tr/softwarepoly.html.
Source code is available online at http://mlcv.ogu.edu.tr/softwaredcm.html.
Acknowledgements
This work was supported by the Scientific and Technological Research Council of Turkey (TUBİTAK) under grant number EEEAG-118E294.
Additional information
Communicated by Ming-Hsuan Yang.
Cite this article
Cevikalp, H., Dordinejad, G.G. Video Based Face Recognition by Using Discriminatively Learned Convex Models. Int J Comput Vis 128, 3000–3014 (2020). https://doi.org/10.1007/s11263-020-01356-5
DOI: https://doi.org/10.1007/s11263-020-01356-5