A Study of Deterioration in Classification Models in Real-Time Big Data Environment

  • Conference paper
Emerging Trends in Intelligent Computing and Informatics (IRICT 2019)

Abstract

Big Data (BD) is driving the current computing revolution. Industries and organizations use the insights derived from it for Business Intelligence (BI). BD and Artificial Intelligence are among the fundamental pillars of Industrial Revolution (IR) 4.0, which demands real-time BD analytics for prediction and classification. Because of the complex characteristics of BD (the 5 V’s), BD analytics is considered a difficult task even in offline mode. In real-time (online) mode it becomes more challenging still and requires Online Classification Models (OCM). In real-time mode, the input streams (input data) and target classes (output classes) are dependent and non-identically distributed, which causes OCM performance to deteriorate. It is therefore necessary to identify and mitigate the causes of this deterioration and to improve OCM performance in a Real-Time Big Data Environment (RTBDE). This study investigates some fundamental causes of deterioration in OCM and discusses possible mitigation approaches. It also presents experimental results that show the deterioration of OCM in an RTBDE. Future work will propose a method to mitigate this deterioration.
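The deterioration described in the abstract can be illustrated with a small synthetic sketch. The code below is not the authors' experiment: the one-dimensional stream, the two threshold concepts (0.5 and 0.8), the toy threshold learner, and all function names are assumptions chosen only to show how a model trained on one distribution loses accuracy once the input-to-class relationship changes. Retraining on recent data is included as one generic mitigation, not as the method the study proposes.

```python
import random


def make_stream(n, threshold, rng):
    """Return n (x, y) pairs; y is 1 when x exceeds the concept's threshold."""
    return [(x, int(x > threshold)) for x in (rng.random() for _ in range(n))]


def fit_cut(samples):
    """Toy 1-D learner (assumption): cut halfway between the largest
    negative example and the smallest positive example."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    return (max(neg) + min(pos)) / 2


def accuracy(cut, samples):
    """Fraction of samples whose predicted class (x > cut) matches y."""
    return sum(int(x > cut) == y for x, y in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical concepts: the class boundary moves from 0.5 to 0.8.
train = make_stream(1000, 0.5, rng)
before_drift = make_stream(1000, 0.5, rng)
after_drift = make_stream(1000, 0.8, rng)

static_cut = fit_cut(train)
acc_before = accuracy(static_cut, before_drift)
acc_after = accuracy(static_cut, after_drift)

# Generic mitigation (assumption): refit on a recent labelled window.
recent_window = make_stream(1000, 0.8, rng)
retrained_cut = fit_cut(recent_window)
acc_retrained = accuracy(retrained_cut, after_drift)

print(f"static model, before drift: {acc_before:.3f}")
print(f"static model, after drift:  {acc_after:.3f}")
print(f"retrained model, after drift: {acc_retrained:.3f}")
```

In this sketch the static model misclassifies every sample that falls between the old and new boundaries, so its accuracy after the drift falls well below its accuracy before it, while the retrained model recovers. Real streams are noisier and labels often arrive late, which is part of what makes the problem hard in practice.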

Acknowledgment

This research study is part of a funded project under a matching grant scheme supported by Universiti Teknologi PETRONAS (UTP), Malaysia, and Hamdard University, Pakistan. This is the second phase, which focuses on problem formulation, preliminary simulations, and identification of possible problem mitigation approaches.

Author information

Corresponding author

Correspondence to Syed Muslim Jameel.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Uddin, V., Rizvi, S.S.H., Hashmani, M.A., Jameel, S.M., Ansari, T. (2020). A Study of Deterioration in Classification Models in Real-Time Big Data Environment. In: Saeed, F., Mohammed, F., Gazem, N. (eds) Emerging Trends in Intelligent Computing and Informatics. IRICT 2019. Advances in Intelligent Systems and Computing, vol 1073. Springer, Cham. https://doi.org/10.1007/978-3-030-33582-3_8
