Abstract
This paper presents a hybrid machine learning method of classifying residential requests in natural language to responsible departments that provide timely responses back to residents under the vision of digital government services in smart cities. Residential requests in natural language descriptions cover almost every aspect of a city’s daily operation. Hence the responsible departments are fine-grained to even the level of local communities. There are no specific general categories or labels for each request sample. This causes two issues for supervised classification solutions, namely (1) the request sample data is unbalanced and (2) lack of specific labels for training. To solve these issues, we investigate a hybrid machine learning method that generates meta-class labels by means of unsupervised clustering algorithms; applies two-word embedding methods with three classifiers (including two hierarchical classifiers and one residual convolutional neural network); and selects the best performing classifier as the classification result. We demonstrate our approach performing better classification tasks compared two benchmarking machine learning models, Naive Bayes classifier and a Multiple Layer Perceptron (MLP). In addition, the hierarchical classification method provides insights into the source of classification errors.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Nigam, K., McCallum, A.K., Thrun, S., Mitchell, T.: Text classification from labeled and unlabeled documents using em. Mach. Learn. 39(2–3), 103–134 (2000)
Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In: Proceedings of the Eleventh Annual Conference on Computational Learning Theory, pp. 92–100. ACM (1998)
Raina, R., Battle, A., Lee, H., Packer, B., Ng, A.Y.: Self-taught learning: transfer learning from unlabeled data. In: Proceedings of the 24th International Conference on Machine Learning, pp. 759–766. ACM (2007)
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
Han, H., Wang, W.-Y., Mao, B.-H.: Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In: Huang, D.-S., Zhang, X.-P., Huang, G.-B. (eds.) ICIC 2005. LNCS, vol. 3644, pp. 878–887. Springer, Heidelberg (2005). https://doi.org/10.1007/11538059_91
Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)
Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55(1), 119–139 (1997)
Fan, W., Stolfo, S.J., Zhang, J., Chan, P.K.: Adacost: misclassification cost-sensitive boosting. ICML 99, 97–105 (1999)
Hao, P.Y., Chiang, J.H., Tu, Y.K.: Hierarchically svm classification based on support vector clustering method and its application to document categorization. Expert Syst. Appl. 33(3), 627–635 (2007)
Morin, F., Bengio, Y.: Hierarchical probabilistic neural network language model. In: AISTATS 2005 - Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (2005)
Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control, Signals Syst. 2(4), 303–314 (1989). https://doi.org/10.1007/BF02551274
Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classification. In: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, NIPS 2015, pp. 649–657. MIT Press, Cambridge, MA, USA (2015). http://dl.acm.org/citation.cfm?id=2969239.2969312
Zhang, Y., Wallace, B.C.: A sensitivity analysis of (and practitioners’ guide to) convolutional neural networks for sentence classification. CoRR abs/1510.03820 (2015). http://arxiv.org/abs/1510.03820
Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint (2014). arXiv:1408.5882
Conneau, A., Schwenk, H., Barrault, L., Lecun, Y.: Very deep convolutional networks for text classification. arXiv preprint (2016). arXiv:1606.01781
Prabowo, R., Thelwall, M.: Sentiment analysis: a combined approach. J. Inform. 3(2), 143–157 (2009)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint (2013). arXiv:1301.3781
Che, W., Li, Z., Liu, T.: Ltp: a chinese language technology platform. In: Proceedings of the 23rd International Conference on Computational Linguistics: Demonstrations, COLING 2010, pp. 13–16. Association for Computational Linguistics, Stroudsburg, PA, USA (2010). http://dl.acm.org/citation.cfm?id=1944284.1944288
Harris, Z.S.: Distributional structure. Word 10(2–3), 146–162 (1954)
Sparck Jones, K.: A statistical interpretation of term specificity and its application in retrieval. J. Documentation 28(1), 11–21 (1972)
Silla, C.N., Freitas, A.A.: A survey of hierarchical classification across different application domains. Data Min. Knowl. Disc. 22(1), 31–72 (2011). https://doi.org/10.1007/s10618-010-0175-9
Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R., Wu, A.Y.: An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 7, 881–892 (2002)
Bilmes, J.A., et al.: A gentle tutorial of the em algorithm and its application to parameter estimation for gaussian mixture and hidden markov models. Int. Comput. Sci. Inst. 4(510), 126 (1998)
Figueiredo, M.A.T., Jain, A.K.: Unsupervised learning of finite mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 3, 381–396 (2002)
Ankerst, M., Breunig, M.M., Kriegel, H.P., Sander, J.: Optics: ordering points to identify the clustering structure. In: ACM SIGMOD Record, vol. 28, pp. 49–60. ACM (1999)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90
He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 630–645. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_38
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chen, T., Sun, J., Lin, H., Liu, Y. (2019). Hybrid Machine Learning Models of Classifying Residential Requests for Smart Dispatching. In: Ning, H. (eds) Cyberspace Data and Intelligence, and Cyber-Living, Syndrome, and Health. CyberDI CyberLife 2019 2019. Communications in Computer and Information Science, vol 1137. Springer, Singapore. https://doi.org/10.1007/978-981-15-1922-2_26
Download citation
DOI: https://doi.org/10.1007/978-981-15-1922-2_26
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1921-5
Online ISBN: 978-981-15-1922-2
eBook Packages: Computer ScienceComputer Science (R0)