Hybrid Machine Learning Models of Classifying Residential Requests for Smart Dispatching

  • Conference paper
Cyberspace Data and Intelligence, and Cyber-Living, Syndrome, and Health (CyberDI 2019, CyberLife 2019)

Abstract

This paper presents a hybrid machine learning method for classifying residential requests written in natural language and routing them to the responsible departments, which respond to residents in a timely manner as part of digital government services in smart cities. Residential requests described in natural language cover almost every aspect of a city’s daily operation, so the responsible departments are fine-grained, down to the level of local communities. There are no general categories or labels for the request samples. This causes two issues for supervised classification: (1) the request sample data are unbalanced, and (2) specific labels for training are lacking. To address these issues, we investigate a hybrid machine learning method that generates meta-class labels by means of unsupervised clustering algorithms; applies two word-embedding methods with three classifiers (two hierarchical classifiers and one residual convolutional neural network); and selects the best-performing classifier to produce the classification result. We demonstrate that our approach classifies requests better than two benchmark machine learning models, a Naive Bayes classifier and a Multi-Layer Perceptron (MLP). In addition, the hierarchical classification method provides insight into the source of classification errors.
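The idea of deriving meta-class labels by clustering and then classifying hierarchically can be illustrated with a minimal sketch. It is not the authors' implementation: the toy requests, the department names, the bag-of-words vectors (used here in place of the word embeddings), the choice of k-means over department centroids, and the nearest-centroid classifiers at both levels are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (request text, fine-grained department label).
REQUESTS = [
    ("water pipe leak", "water"), ("water pipe burst", "water"),
    ("water sewer blocked", "drainage"), ("water sewer overflow", "drainage"),
    ("light street dark", "lighting"), ("light street broken", "lighting"),
    ("light signal broken", "traffic"), ("light signal timing", "traffic"),
]

def vectorize(text, vocab):
    """Bag-of-words counts over a fixed vocabulary (stand-in for embeddings)."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))

def kmeans(points, k, iters=10):
    """Plain k-means with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = mean(members)
    return centroids, [nearest(p, centroids) for p in points]

def fit(requests, k):
    """Cluster department centroids into k meta-classes (assumed design)."""
    vocab = sorted({w for text, _ in requests for w in text.lower().split()})
    by_dept = defaultdict(list)
    for text, dept in requests:
        by_dept[dept].append(vectorize(text, vocab))
    depts = list(by_dept)
    dept_centroids = [mean(by_dept[d]) for d in depts]
    meta_centroids, meta_of = kmeans(dept_centroids, k)
    return vocab, depts, dept_centroids, meta_centroids, meta_of

def predict(model, text):
    """Two-level prediction: meta-class first, then department within it."""
    vocab, depts, dept_centroids, meta_centroids, meta_of = model
    vec = vectorize(text, vocab)
    meta = nearest(vec, meta_centroids)
    candidates = [i for i, m in enumerate(meta_of) if m == meta]
    best = min(candidates, key=lambda i: sq_dist(vec, dept_centroids[i]))
    return meta, depts[best]

model = fit(REQUESTS, k=2)
print(predict(model, "water pipe leak in the kitchen"))
```

Because `predict` returns both the meta-class and the department, a wrong prediction can be attributed either to the first level (wrong meta-class) or to the second (right meta-class, wrong department), which is the kind of error-source insight the abstract attributes to hierarchical classification.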

Notes

  1. https://github.com/OneClickDeepLearning/classificationOfResidentialRequests.

Author information

Correspondence to Yan Liu.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chen, T., Sun, J., Lin, H., Liu, Y. (2019). Hybrid Machine Learning Models of Classifying Residential Requests for Smart Dispatching. In: Ning, H. (eds) Cyberspace Data and Intelligence, and Cyber-Living, Syndrome, and Health. CyberDI 2019, CyberLife 2019. Communications in Computer and Information Science, vol 1137. Springer, Singapore. https://doi.org/10.1007/978-981-15-1922-2_26

  • DOI: https://doi.org/10.1007/978-981-15-1922-2_26

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1921-5

  • Online ISBN: 978-981-15-1922-2

  • eBook Packages: Computer Science, Computer Science (R0)
