Noise-robust dictionary learning with slack block-diagonal structure for face recognition
Introduction
Face recognition is the most popular topic in the field of biometrics thanks to its intuitiveness and unique advantages. Over the past few years, research on sparse and low-rank representation has attracted a great deal of attention because of its promising performance [1], [2], [3], [4], [5], [6]. Among these works, Du et al. [1] for the first time optimized sparsity and feature statistics simultaneously, formulating a hybrid sparsity and statistics based detector for high-dimensional hyperspectral image data. For face image recognition, the most classical sparse representation based classification (SRC) [2] algorithm aims to find a sparse representation (only a few non-zero elements) of a query sample over an over-complete dictionary. Based on the observation that the collaborative mechanism tends to play a more important role in representing a sample than sparsity, Zhang et al. [7] proposed a collaborative representation based classifier (CRC), which uses the l2 norm to replace the l1 norm in SRC. Besides, linear regression based classification (LRC) [8] uses class-specific training samples to represent a test sample and then assigns it to the class that leads to the minimum residual. In addition, Wang et al. [9] proposed a locality-constrained linear coding (LLC) algorithm, which states that data locality can always promote the sparsity of representation. However, sparsity-induced algorithms may be unable to capture the global structures of data because they are designed to learn the sparsest representation sample-wise, while ignoring the relevance between samples. Fortunately, the theory of low-rank representation has been studied to solve this problem [5], [6], [10]. Low-rank based algorithms jointly explore the underlying structures and correlations between all samples in order to generate a representation that preserves the global structure of the data as much as possible.
Zhang et al. [11] proposed to use a low-rank matrix factorization technique to simultaneously implement dimensionality reduction and data clustering for hyperspectral images. Robust principal component analysis (RPCA) [12] is the most classical matrix recovery method, introducing a low-rank constraint on the clean components. Wei et al. [13] introduced a structural incoherence constraint into RPCA and presented a method called low rank matrix recovery with structural incoherence (LRSI). Based on LRSI, Yin et al. [14] presented a new method that can correct corrupted test images with a low-rank projection matrix. Liu et al. [5] proposed a low-rank representation based matrix recovery algorithm (LRR) that seeks a low-rank representation in terms of a given dictionary so that the noise can be separated from the original samples. LRR assumes the data lies in multiple subspaces rather than only one, which makes it more robust in handling corrupted samples compared with RPCA. Latent LRR (LatLRR) [6] is a feature extraction extension of LRR which assumes that the observed samples can be represented by some underlying hidden samples.
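As a concrete illustration of low-rank matrix recovery, the RPCA problem [12] can be solved with an inexact augmented-Lagrangian scheme that alternates singular value thresholding and elementwise soft-thresholding. The sketch below is a standard textbook implementation, not the code of any of the cited papers; the parameter defaults follow common practice.

```python
import numpy as np

def rpca(M, lam=None, tol=1e-7, max_iter=500):
    """RPCA sketch: decompose M into a low-rank part L and a
    sparse part S via the inexact augmented-Lagrangian method."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))   # standard default weight
    norm_M = np.linalg.norm(M)           # Frobenius norm for the stop test
    mu = 1.25 / np.linalg.norm(M, 2)     # initial penalty (spectral norm)
    rho = 1.5                            # penalty growth factor
    Y = np.zeros(M.shape)                # Lagrange multiplier
    S = np.zeros(M.shape)
    for _ in range(max_iter):
        # L-step: singular value thresholding of (M - S + Y/mu)
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # S-step: elementwise soft-thresholding
        T = M - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        # Dual update and penalty growth
        R = M - L - S
        Y = Y + mu * R
        mu = min(mu * rho, 1e7)
        if np.linalg.norm(R) / norm_M < tol:
            break
    return L, S
```

On a synthetic low-rank matrix with a small fraction of gross corruptions, this scheme typically recovers the clean component to high accuracy, which is the property the matrix-recovery methods above rely on.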
It is worth noting that all the above algorithms use the original samples as the dictionary. Although their application to real-world face recognition tasks has achieved impressive results, collected face images often contain various contaminations in practical scenarios, e.g., gross occlusion, illumination variation and real disguise, which disturb the process of data reconstruction. If we directly use the raw training samples as the dictionary, the classification performance may be degraded because the class structure of the subspace is destroyed by the noise. However, selecting only clean samples as the dictionary and strictly neglecting corrupted samples will also yield poor results, since corrupted samples may still carry useful discriminative information. Therefore, it is preferable to learn a compact, clean and discriminative dictionary from the original contaminated samples. Depending on the available label information, existing dictionary learning algorithms can be divided into two types: supervised and unsupervised.
KSVD [15] and the method of optimal directions (MOD) [16] are the classical unsupervised dictionary learning algorithms, in which the noise is assumed to be drawn from a Gaussian distribution. Moreover, Chen et al. [17] proposed a mixed-noise (Laplacian and Gaussian distribution) based dictionary learning algorithm. To address the small-sample-size problem, Xu et al. [18] proposed a sample-diversity and representation-effectiveness based dictionary learning (SDRERDL) algorithm that exploits data augmentation by mirroring the original samples. More recently, Zhou et al. [19] proposed a double-dictionary learning algorithm in which two different dictionaries are learned to separate the original data into different subspaces.
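As an illustration of the classical unsupervised setting, the MOD [16] dictionary update has a simple closed form under the Gaussian noise assumption. The sketch below is ours, not the original MOD code; the small ridge term `eps` is our addition for numerical stability.

```python
import numpy as np

def mod_update(X, A, eps=1e-10):
    """One MOD dictionary update: given data X (d x n) and current
    codes A (k x n), solve min_D ||X - D A||_F^2 in closed form,
    D = X A^T (A A^T)^{-1}, then renormalize atoms to unit l2 norm."""
    k = A.shape[0]
    D = X @ A.T @ np.linalg.inv(A @ A.T + eps * np.eye(k))
    # Normalize each dictionary atom (column) to unit length
    D /= np.maximum(np.linalg.norm(D, axis=0, keepdims=True), eps)
    return D
```

In a full MOD loop this update alternates with a sparse coding step (e.g., OMP) until the reconstruction error stops decreasing.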
For supervised dictionary learning algorithms, the label information of the original training samples is embedded during the learning procedure to capture more discriminative structure. Yang et al. [20] used the Fisher discrimination criterion to learn a Fisher discrimination dictionary (FDDL), in which the representations have both minimum within-class scatter and maximum inter-class scatter; FDDL is a classical supervised dictionary learning algorithm. Based on the KSVD model, Zhang et al. [21] proposed a discriminative KSVD (D-KSVD) algorithm that improves the discriminative ability of the learned dictionary by incorporating a classification error term into the objective function. Furthermore, Jiang et al. [22] proposed a label-consistent K-SVD (LC-KSVD) algorithm in which within-class samples are encouraged to have similar representations by introducing a sparse binary label matrix. Based on the assumption that samples from different classes may share certain common features, Wang et al. [23] proposed a classification-oriented dictionary learning model (COPAR) that exploits the particularity and commonality of information across all classes. Besides, DLSI [24], JDL [25] and CSDL [26] were also proposed to explore a shared dictionary. However, [23], [24], [25], [26] overlook an important requirement, namely that the shared dictionary should be low-rank. Inspired by this, Vu et al. [27] proposed a low-rank shared dictionary learning (LRSDL) framework that imposes a low-rank constraint on the shared dictionary to encourage its subspace to be of low dimensionality and its corresponding representations to be similar. In addition to adding a low-rank constraint to the dictionary, there also exist a number of low-rank representation based dictionary learning algorithms [28], [29], [30]. Thanks to the self-expressiveness property, the obtained data representations should be block-diagonal for discriminability [31].
By virtue of the LRR model, structured LRR (SLRR) [28] constructed an ideal '0-1' block-diagonal matrix to force the learned low-rank representation matrix to be block-diagonal. Gao et al. [30] proposed a robust and discriminative low-rank representation based dictionary learning (RDLRR) algorithm by exploiting the low-rank property of both the representations and the contiguous errors.
However, the ideal '0-1' block-diagonal structure used in SLRR [28] and RDLRR [30] is unrealistic in practice because the within-class representation coefficients over a dictionary are not identical. If we force all the within-class coefficients to be equal to '1', we may lose some useful structural information which is beneficial for classification. Recently, Zhang et al. [31] proposed a discriminative block-diagonal low-rank representation (BDLRR) algorithm that restrains the energy of off-block-diagonal elements to strengthen the contribution of block-diagonal elements. In BDLRR, the learned representations are block-diagonal but do not follow a strict '0-1' structure. However, BDLRR takes no account of the correlations among within-class representations [32]. Moreover, the above low-rank representation based dictionary learning algorithms characterize the noise with a single distribution assumption, which is not robust to mixed noise. To address these problems, in this paper, we first propose a slack block-diagonal (SBD) structure by adding a row-sparse slack term to the ideal '0-1' structure matrix. As a result, the new target matrix of the representations is dynamic, yet close to the block-diagonal structure. From [28], [29], [30], [31] we know that by imposing a low-rank constraint on the coding coefficients, we can capture the whole structure of the data, and the learned dictionary becomes more robust to noise. Thus, by integrating the mixed-noise learning model [17] with the low-rank nature of the representations, we develop a novel noise-robust dictionary learning algorithm with slack block-diagonal structure (SBD2L). The main contributions of this paper can be summarized as follows:
- (1)
A low-rank representation based noise-robust dictionary learning model is proposed. In addition to learning low-rank representations for the dictionary, we use the l1 and l2 norms to describe noise drawn from Laplacian and Gaussian distributions, respectively. This model is more robust to complicated noise contaminations than one based on a single distribution.
- (2)
Based on the above mixed-noise learning model, we propose a slack block-diagonal (SBD) structure for the representations. The slack block-diagonal structure is more flexible than a strict binary block-diagonal structure. As a result, it avoids the loss of structural information and contributes to learning a more discriminative dictionary.
- (3)
We develop an effective optimization algorithm based on the alternating direction method of multipliers (ADMM) to solve the optimization problem of the proposed model. Its convergence is validated experimentally.
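The strict target that the SBD structure relaxes can be made concrete. Below is a minimal sketch (the function name is ours, and we assume dictionary atoms inherit the class labels of the training samples) of the ideal '0-1' block-diagonal matrix used by SLRR/RDLRR; SBD2L instead targets this matrix plus a learned row-sparse slack term.

```python
import numpy as np

def ideal_block_diagonal(atom_labels, sample_labels):
    """Ideal '0-1' target matrix Q: Q[i, j] = 1 iff dictionary
    atom i and training sample j belong to the same class, so that
    sorting samples by class makes Q block-diagonal. SBD2L relaxes
    Q to Q + E with a row-sparse slack matrix E learned jointly."""
    atoms = np.asarray(atom_labels)[:, None]     # (k, 1)
    samples = np.asarray(sample_labels)[None, :]  # (1, n)
    return (atoms == samples).astype(float)       # (k, n)
```

For example, two atoms per class and class-sorted samples yield a visibly block-diagonal pattern of ones, which is exactly the structure whose off-block entries BDLRR-style methods penalize.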
The remainder of the paper is organized as follows: Section 2 introduces the low-rank representation based noise-robust dictionary learning algorithm for removing mixed noise. Section 3 presents our SBD2L algorithm developed by introducing a slack block-diagonal structure. In Section 4, we discuss the classification method. In Section 5, we validate our SBD2L approach in extensive experiments and compare it with state-of-the-art methods on four benchmark face databases. Finally, Section 6 concludes the paper.
Noise-robust dictionary learning by embedding low-rank characteristics
Let X = [x1, x2, ..., xn] ∈ R^(d×n) denote n training samples with a dimensionality of d drawn from c classes. Each column of X, i.e., xi, denotes a sample vector. Xi is the matrix of samples of the ith class and X = [X1, X2, ..., Xc]. Dictionary learning (DL) methods aim to learn a compact and discriminative dictionary D from the original samples X, which exhibits robustness to various types of noise in face images. Di is the sub-dictionary of the ith class.
Noise-robust dictionary learning with slack block-diagonal structure
In this section, we first present our novel slack block-diagonal (SBD) structure for noise-robust dictionary learning (SBD2L). Then, we introduce the optimization method for the proposed SBD2L. Finally, we give a brief complexity analysis of the optimization method.
Classification
By solving Algorithm 1, the mixed noise or corruptions in the training samples can be eliminated during the structured dictionary learning process to obtain an optimized dictionary D and a discriminative representation A. For efficient classification, it is feasible to use the representation A and the labels of the training samples to learn a linear classifier W [28]: min_W ||H - WA||_F^2 + λ||W||_F^2, where H is the binary label matrix of the training samples, whose columns indicate the class membership of each sample.
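The ridge-regression classifier described above has a closed-form solution. The following is an illustrative sketch of this SLRR-style scheme [28]; the function names and the regularization value are our assumptions, not the authors' code.

```python
import numpy as np

def learn_classifier(A, labels, lam=1e-2):
    """Learn W minimizing ||H - W A||_F^2 + lam ||W||_F^2 in closed
    form: W = H A^T (A A^T + lam I)^{-1}.
    A: (k, n) representations of the n training samples,
    labels: (n,) class labels."""
    classes = np.unique(labels)
    # Binary label matrix H: H[c, j] = 1 iff sample j is in class c
    H = (labels[None, :] == classes[:, None]).astype(float)
    k = A.shape[0]
    W = H @ A.T @ np.linalg.inv(A @ A.T + lam * np.eye(k))
    return W, classes

def classify(W, classes, a):
    """Assign a test representation a to the class with the
    largest classifier response W a."""
    return classes[int(np.argmax(W @ a))]
```

At test time, the query sample is first coded over the learned dictionary to obtain its representation a, which is then passed to the classifier.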
Experiments
In this section, we conduct experiments on four benchmark face databases, i.e., AR [43], Extended Yale B [44], CMU PIE [45] and Labeled Faces in the Wild (LFW) [46], to demonstrate the effectiveness of the proposed SBD2L algorithm. We compare SBD2L with several state-of-the-art dictionary learning algorithms, LCKSVD (1 and 2) [22], FDDL [20], LCLE [47] and SDRERDL [18], and with several representation based classification algorithms: SRC [2], LRC [8] and LLC [9]. For LCKSVD, SDRERDL,
Conclusion
This paper proposed a robust dictionary learning algorithm based on a mixed-noise model that utilizes a slack block-diagonal structure (SBD2L). Specifically, the mixed-noise model is introduced into the dictionary learning procedure under the assumption that face images are simultaneously subject to Laplacian and Gaussian noise. The core innovation of the SBD2L algorithm is the use of a slack block-diagonal structure for the representations, which alleviates the loss of structural information caused by a strict '0-1' target.
Declaration of Competing Interest
The authors declare that they have no conflict of interest regarding this work.
We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Zhe Chen received the B.S. degree in the department of computer science and technology from Hefei University, Hefei, China, in 2014, and the M.S. degree from the Jiangnan University, Wuxi, in 2018. Currently, he is a PhD candidate in School of IoT Engineering, Jiangnan University, Wuxi, China. His research interests include face recognition, dictionary learning and sparse low-rank representation.
References (50)
- et al., A sparse regularized nuclear norm based matrix regression for face recognition with contiguous occlusion, Pattern Recognit. Lett. (2019)
- et al., Hyperspectral image unsupervised classification by robust manifold matrix factorization, Inf. Sci. (2019)
- et al., Sample diversity, representation effectiveness and robust dictionary learning for face recognition, Inf. Sci. (2017)
- et al., Dictionary learning with structured noise, Neurocomputing (2018)
- et al., A classification-oriented dictionary learning model: explicitly learning the particularity and commonality across categories, Pattern Recognit. (2014)
- et al., Multi-spectral low-rank structured dictionary learning for face recognition, Pattern Recognit. (2016)
- et al., Learning robust and discriminative low-rank representations for face recognition with occlusion, Pattern Recognit. (2017)
- et al., Inter-class sparsity based discriminative least square regression, Neural Netw. (2018)
- et al., Joint discriminative dimensionality reduction and dictionary learning for face recognition, Pattern Recognit. (2013)
- et al., Dictionary learning based impulse noise removal via l1-l1 minimization, Signal Process. (2013)
- Beyond the sparsity-based target detector: a hybrid sparsity and statistics-based detector for hyperspectral images, IEEE Trans. Image Process.
- Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell.
- Joint sparse representation and multitask learning for hyperspectral target detection, IEEE Trans. Geosci. Remote Sens.
- Robust recovery of subspace structures by low-rank representation, IEEE Trans. Pattern Anal. Mach. Intell.
- Latent low-rank representation for subspace segmentation and feature extraction, Proceedings of the International Conference on Computer Vision
- Sparse representation or collaborative representation: which helps face recognition?, Proceedings of the International Conference on Computer Vision
- Linear regression for face recognition, IEEE Trans. Pattern Anal. Mach. Intell.
- Locality-constrained linear coding for image classification, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
- Robust low-rank recovery with a distance-measure structure for face recognition, Proceedings of the Pacific Rim International Conference on Artificial Intelligence
- Robust principal component analysis?, J. ACM
- Robust face recognition with structurally incoherent low-rank matrix decomposition, IEEE Trans. Image Process.
- Face recognition based on structural incoherence and low rank projection, Proceedings of the International Conference on Intelligent Data Engineering and Automated Learning
- K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation, IEEE Trans. Signal Process.
- Frame based signal compression using method of optimal directions (MOD), Proceedings of the 1999 IEEE International Symposium on Circuits and Systems (ISCAS)
- Robust dictionary learning by error source decomposition, Proceedings of the IEEE International Conference on Computer Vision
Xiao-Jun Wu received the B.Sc. degree in mathematics from Nanjing Normal University, Nanjing, China, in 1991, and the M.S. and Ph.D. degrees in pattern recognition and intelligent systems from the Nanjing University of Science and Technology, Nanjing, in 1996 and 2002, respectively. He is currently a Professor of Artificial Intelligent and Pattern Recognition with Jiangnan University, Wuxi, China. His current research interests include pattern recognition, computer vision, fuzzy systems, neural networks, and intelligent systems.
He-Feng Yin received his B.S. degree in School of Computer Science and Technology from Xuchang University, Xuchang, China, in 2011. Currently, he is a Ph.D. candidate in School of IoT Engineering, Jiangnan University, Wuxi, China. His research interests include representation-based classification methods, dictionary learning and low-rank representation.
Josef Kittler (M'74-LM'12) received the B.A., Ph.D., and D.Sc. degrees from the University of Cambridge, in 1971, 1974, and 1991, respectively. He is currently a Professor of Machine Intelligence with the Centre for Vision, Speech and Signal Processing, Department of Electronic Engineering, University of Surrey, Guildford, U.K. He conducts research on biometrics, video and image database retrieval, medical image analysis, and cognitive vision. He has authored a textbook entitled Pattern Recognition: A Statistical Approach (Englewood Cliffs, NJ, USA: Prentice-Hall, 1982) and over 600 scientific papers. He serves on the Editorial Board of several scientific journals in pattern recognition and computer vision.