Abstract
High-dimensional observations usually contain several kinds of inherent relational structure, and exploiting this structure is crucial for multi-output regression. This paper therefore proposes a new multi-output regression method that simultaneously takes three kinds of relational structure into account, \(i.e.\), the relationships between output and output, between feature and output, and between sample and sample. Specifically, the method captures the correlations among output variables with a low-rank constraint, discovers the correlations between features and outputs by imposing an \(\ell_{2,1}\)-norm regularization on the coefficient matrix to conduct feature selection, and uncovers the correlations among samples by applying the \(\ell_{2,1}\)-norm to the loss function to conduct sample selection. Furthermore, an effective iterative optimization algorithm is proposed to solve the resulting convex but non-smooth objective function. Finally, experimental results on several real datasets show that the proposed method outperforms all comparison algorithms in terms of aCC and aRMSE.
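To make the three structures concrete, the following is a minimal NumPy sketch of one plausible reading of such an objective, \(\min_W \|Y - XW\|_{2,1} + \lambda \|W\|_{2,1}\) subject to a rank constraint on \(W\), solved with a standard iteratively reweighted least-squares (IRLS) scheme for \(\ell_{2,1}\) terms. The function names, hyperparameters, and the hard truncated-SVD projection standing in for the low-rank constraint are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def l21_row_weights(M, eps=1e-8):
    """IRLS weights for the l2,1-norm: D_ii = 1 / (2 * ||m_i||_2)."""
    return 1.0 / (2.0 * np.maximum(np.linalg.norm(M, axis=1), eps))

def fit_low_rank_l21(X, Y, lam=0.1, rank=2, n_iter=30):
    """Sketch of  min_W ||Y - XW||_{2,1} + lam * ||W||_{2,1}  s.t. rank(W) <= rank.

    - l2,1 loss    -> per-sample row weights (sample selection)
    - l2,1 penalty -> per-feature row weights on W (feature selection)
    - low rank enforced here by a hard truncated-SVD projection
      (a simplification of a low-rank constraint on W)
    """
    W = np.linalg.lstsq(X, Y, rcond=None)[0]   # ordinary least-squares init
    for _ in range(n_iter):
        d_s = l21_row_weights(Y - X @ W)       # per-sample weights
        d_w = l21_row_weights(W)               # per-feature weights
        # reweighted ridge-style closed form:
        #   W = (X^T D_s X + lam * D_w)^{-1} X^T D_s Y
        A = X.T @ (d_s[:, None] * X) + lam * np.diag(d_w)
        W = np.linalg.solve(A, X.T @ (d_s[:, None] * Y))
        # project W onto the set of rank-`rank` matrices via truncated SVD
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        W = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return W
```

The IRLS weights follow the joint \(\ell_{2,1}\)-minimization scheme popularized by Nie et al.: each row's weight is inversely proportional to its current \(\ell_2\) norm, so rows with small residuals (samples) or small coefficients (features) are progressively suppressed.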
Acknowledgement
This work was supported in part by the China “1000-Plan” National Distinguished Professorship; the National Natural Science Foundation of China (Grants No: 61263035, 61573270, and 61672177); the China 973 Program (Grant No: 2013CB329404); the China Key Research Program (Grant No: 2016YFB1000905); the Guangxi Natural Science Foundation (Grant No: 2015GXNSFCB139011); the Innovation Project of Guangxi Graduate Education (Grants No: YCSZ2016046 and YCSZ2016045); the Guangxi Higher Institutions Program of Introducing 100 High-Level Overseas Talents; the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing; and the Guangxi Bagui Scholar Teams for Innovation and Research Project.
© 2016 Springer International Publishing AG
Cite this paper
Zhang, S., Yang, L., Li, Y., Luo, Y., Zhu, X. (2016). Low-Rank Feature Reduction and Sample Selection for Multi-output Regression. In: Li, J., Li, X., Wang, S., Li, J., Sheng, Q. (eds) Advanced Data Mining and Applications. ADMA 2016. Lecture Notes in Computer Science(), vol 10086. Springer, Cham. https://doi.org/10.1007/978-3-319-49586-6_9
Print ISBN: 978-3-319-49585-9
Online ISBN: 978-3-319-49586-6