Abstract
The aim of this study is to predict the death of polytraumatized patients from epidemiological, clinical and health treatment variables by means of data-mining methods. The main problems to be addressed were high dimensionality and imbalanced data. Since the techniques usually used to deal with these drawbacks, namely feature selection methods and sampling strategies respectively, did not provide satisfactory results, the study sought to identify the data-mining algorithms that behave best in this kind of scenario. The study was carried out with data from 497 patients diagnosed with severe trauma who were hospitalized in the Intensive Care Unit (ICU) of the University Hospital of Salamanca. The results show that multiclassifiers behave better than simple classifiers in contexts of high dimensionality and imbalanced datasets, without the need to resort to undersampling and oversampling strategies, which can lead to the loss of valuable data and to overfitting problems, respectively.
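The trade-off that the abstract attributes to sampling strategies can be illustrated with a minimal sketch. It is not the authors' procedure: the helper names `random_undersample` and `random_oversample` are hypothetical, and the class split (60 deaths among 497 records) is an assumption made only for illustration, since the abstract does not report the actual proportion.

```python
import random
from collections import Counter


def group_by_class(records, label_of):
    """Group records into lists keyed by their class label."""
    groups = {}
    for rec in records:
        groups.setdefault(label_of(rec), []).append(rec)
    return groups


def random_undersample(records, label_of, rng):
    """Keep a random subset of each class, sized to the smallest class."""
    groups = group_by_class(records, label_of)
    n_min = min(len(rows) for rows in groups.values())
    balanced = []
    for rows in groups.values():
        balanced.extend(rng.sample(rows, n_min))  # without replacement
    return balanced


def random_oversample(records, label_of, rng):
    """Pad each class with random duplicates up to the largest class size."""
    groups = group_by_class(records, label_of)
    n_max = max(len(rows) for rows in groups.values())
    balanced = []
    for rows in groups.values():
        balanced.extend(rows)
        balanced.extend(rng.choices(rows, k=n_max - len(rows)))  # with replacement
    return balanced


def label_of(record):
    return record[1]


# Hypothetical toy data: 497 records, an ASSUMED 60 of them labelled "died".
rng = random.Random(0)
records = [(i, "died" if i < 60 else "survived") for i in range(497)]

under = random_undersample(records, label_of, rng)
over = random_oversample(records, label_of, rng)

# Undersampling throws away majority-class records.
discarded = len(records) - len(under)
# Oversampling adds exact copies of minority-class records.
duplicates = len(over) - len(set(over))

print("undersampled:", Counter(map(label_of, under)), "discarded:", discarded)
print("oversampled: ", Counter(map(label_of, over)), "duplicates:", duplicates)
```

Under these assumed counts, balancing by undersampling discards most of the majority class, while balancing by oversampling fills most of the minority class with repeated records, which is one way a classifier can end up fitting a handful of cases too closely.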
Copyright information
© 2014 Springer International Publishing Switzerland
Cite this paper
Moreno García, M.N., González Robledo, J., Martín González, F., Sánchez Hernández, F., Sánchez Barba, M. (2014). Machine Learning Methods for Mortality Prediction of Polytraumatized Patients in Intensive Care Units – Dealing with Imbalanced and High-Dimensional Data. In: Corchado, E., Lozano, J.A., Quintián, H., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2014. IDEAL 2014. Lecture Notes in Computer Science, vol 8669. Springer, Cham. https://doi.org/10.1007/978-3-319-10840-7_38
DOI: https://doi.org/10.1007/978-3-319-10840-7_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10839-1
Online ISBN: 978-3-319-10840-7
eBook Packages: Computer Science, Computer Science (R0)