Article

Development of the Non-Iterative Supervised Learning Predictor Based on the Ito Decomposition and SGTM Neural-Like Structure for Managing Medical Insurance Costs †

Roman Tkachenko, Ivan Izonin, Pavlo Vitynskyi, Nataliia Lotoshynska and Olena Pavlyuk
1 Department of Publishing Information Technologies, Lviv Polytechnic National University, 79000 Lviv, Ukraine
2 Department of Automated Control Systems, Lviv Polytechnic National University, 79000 Lviv, Ukraine
* Author to whom correspondence should be addressed.
This paper is an extended version of the paper: Vitynskyi, P.; Tkachenko, R.; Izonin, I.; Kutucu, H. Hybridization of the SGTM Neural-like Structure through Inputs Polynomial Extension. In Proceedings of the 2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP), Lviv, Ukraine, 21–25 August 2018; Lviv Polytechnic Publishing House: Lviv, Ukraine, 2018; pp. 386–391.
Submission received: 23 September 2018 / Revised: 24 October 2018 / Accepted: 29 October 2018 / Published: 31 October 2018
(This article belongs to the Special Issue Data Stream Mining and Processing)

Abstract

This paper describes a new non-iterative linear supervised learning predictor based on the Ito decomposition and the neural-like structure of the successive geometric transformations model (SGTM). The Ito decomposition (Kolmogorov–Gabor polynomial) is used to extend the inputs of the SGTM neural-like structure, which provides high approximation power for a variety of tasks. The coefficients of this polynomial are found using the fast, non-iterative training algorithm of the linear SGTM neural-like structure. The developed method provides high speed and improved generalization properties. Simulation on the medical insurance cost prediction task showed a significant increase in accuracy compared with existing methods (the common SGTM neural-like structure, multilayer perceptron, support vector machine, adaptive boosting, and linear regression). Given the above, the developed method can be used to process large amounts of data in a variety of fields (medicine, materials science, economics, etc.), improving both the accuracy and the speed of processing.

1. Introduction

Health insurance is one of the main directions in the development of modern healthcare systems [1,2]. Predicting individual health insurance costs is one of the most important tasks in this direction. Commonly used regression methods [3] do not provide satisfactory results for this task. In the big data era, the problem is compounded by the need for such methods to operate both accurately and quickly [4,5].
The availability of large amounts of data makes it possible to apply artificial intelligence to this task. Computational intelligence allows hidden dependencies in the data set to be taken into account [6], which in most cases increases the accuracy of individual health insurance cost prediction.
Existing neural network tools [7,8] demonstrate sufficient accuracy. However, they do not always provide satisfactory training speed. Using a multilayer perceptron [9] to process large amounts of data requires large volumes of memory, and this tool does not always provide satisfactory generalization properties [10]. The main drawback of RBF networks for this task is that they provide only a local approximation of the nonlinear response surface [10]. Moreover, this method suffers from the "curse of dimensionality", which imposes a number of restrictions on its use for processing large amounts of data [11]. In [12,13], the backpropagation algorithm is used to implement the training procedure. The large number of epochs this algorithm requires, together with large amounts of input data, causes long time delays during its use.
Deep learning methods involve long training times, lengthy debugging procedures, and the need to interpret the output signals of each hidden layer. They are designed primarily for image processing tasks [14].
The training procedures of classical machine learning algorithms are fast [15]; however, these methods are inferior in the accuracy of their prediction results [16].
That is why it is necessary to develop new, or improve existing, methods and tools for individual insurance cost prediction that provide high accuracy together with sufficient training speed.

2. Data Description

2.1. Data Analysis

To solve the regression task, the medical insurance cost prediction dataset (Dataset: https://www.kaggle.com/mirichoi0218/insurance. Dataset License: Open Database) was selected from Kaggle [17]. It contains 1338 observations of the personal medical insurance cost. Each vector includes six input attributes and one output (Table 1). The task is to predict individual costs for health insurance.
We will consider all independent variables in more detail:
  • Insurance contractor age (Age). The minimum age of the insurance contractor is 18 years, the maximum is 64 years, and the average age over the entire sample is 39.2 years. The sample includes 574 young insurance contractors (18–35 years), 548 middle-aged insurance contractors (35–55 years), and 216 older insurance contractors (over 55).
  • Body mass index (BMI). This is the ratio of a person's weight to the square of their height (kg/m²). The minimum BMI is 15.96, the maximum is 53.12, and the average is 30.66, which is above the normal range.
  • The number of dependents (Children). This is the number of children covered by medical insurance. This indicator ranges from 0 to 5, and the average is 1.095.
  • Smoking (Smoker). The dataset contains 274 smokers and 1064 non-smokers.
  • Beneficiary’s residential area in the United States (Area). This column covers four regions of the United States, with the following number of observations in each: northeast, 324; northwest, 325; southeast, 364; and southwest, 325.
The individual insurance cost (IIC) is the output variable.

2.2. Data Preparation

We perform a series of transformations on the input data to convert it from text into numerical form (binary coding). In particular, five new columns are added, as follows: the Insurance contractor gender and Smoking columns each turn into two, namely male (M) and female (F), and smoker and non-smoker, respectively. The column Beneficiary’s residential area in the United States is transformed into four separate columns, one for each of the four U.S. regions: Area 1 is Southwest, Area 2 is Southeast, Area 3 is Northwest, and Area 4 is Northeast. Thus, a new data sample was obtained, in which the vector of each of the 1338 observations contains 11 numeric input attributes. They are given in Table 2. A minimal sketch of this encoding is shown below.
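This transformation is ordinary one-hot encoding. A minimal sketch in Python, assuming the column names used in the Kaggle file (age, sex, bmi, children, smoker, region, charges) and the file name insurance.csv:

```python
import pandas as pd

# Load the Kaggle medical insurance dataset (file name assumed).
df = pd.read_csv("insurance.csv")

# One-hot encode the categorical columns, mirroring the transformation above:
# sex -> F/M, smoker -> smoker/non-smoker, region -> Area 1-4.
encoded = pd.get_dummies(df, columns=["sex", "smoker", "region"])

X = encoded.drop(columns=["charges"])  # 11 numeric input attributes
y = encoded["charges"]                 # individual insurance costs (IIC)
print(X.shape)  # expected: (1338, 11)
```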
Figure 1 shows the scatter plot of the dataset from Table 2, created using Orange Software, version 3.13.0 [18].

3. Predictor Based on the Ito Decomposition and Neural-Like Structure of the Successive Geometric Transformations Model (SGTM)

This paper proposes a new method focused on high-speed realization and universal application for regression and classification tasks.

3.1. Linear Neural-Like Structure of the Successive Geometric Transformations Model

The authors of [19] described the topology and training algorithm of a new non-iterative neural-like structure for solving various tasks. It is based on the successive geometric transformations model (SGTM) and can work in supervised and unsupervised modes. The topology of this linear computational intelligence tool is shown in Figure 2. Its distinctive feature is the ordered lateral connections between adjacent neurons of the hidden layer. The training and operation procedures of this instrument are of the same type.
The greedy non-iterative training algorithm ensures reproducibility of the solution and allows the common SGTM neural-like structure to be used effectively for processing large amounts of data. Detailed mathematical descriptions and flowcharts of the training and operation procedures of the common SGTM neural-like structure are given in [20].

3.2. The Ito Decomposition

Accurate approximation of nonlinear dependencies is one of the important tasks in processing large amounts of data. Existing machine learning methods do not always yield sufficiently precise results for this task.
According to the Weierstrass theorem, any continuous function on a given interval can be described arbitrarily precisely by a series of polynomials [21]. Another mathematical proof of the approximability of any continuous function is the universal approximation theorem (an extension of the Weierstrass theorem).
The Ito decomposition (Kolmogorov–Gabor polynomial) is widely used for the development of various nonlinear approximation models [22,23,24,25,26]. The general form of the second-degree polynomial can be written as follows [6]:

$$Y(x_1, \ldots, x_n) = a_0 + \sum_{i=1}^{n} a_i x_i + \sum_{i=1}^{n} \sum_{j=i}^{n} a_{i,j} x_i x_j \qquad (1)$$
When processing large amounts of multiparametric data, searching for the polynomial’s coefficients is a non-trivial task. Existing methods, in particular the least squares method and singular value decomposition, do not provide sufficient speed [6]. That is why applying the Kolmogorov–Gabor polynomial to build big data processing models requires the development of new, more efficient algorithms for finding its coefficients.
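For illustration, the second-degree expansion itself is easy to construct; a minimal sketch using scikit-learn (an assumption of this example, not the authors' implementation):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Second-degree Kolmogorov-Gabor expansion: the linear terms x_i plus all
# pairwise products x_i * x_j with i <= j, as in Equation (1).
n = 11
X = np.random.rand(5, n)  # placeholder batch of input vectors

poly = PolynomialFeatures(degree=2, include_bias=False)
Z = poly.fit_transform(X)

print(Z.shape[1])  # n + n*(n+1)/2 = 11 + 66 = 77 extended inputs
```

The hard part, as noted above, is not forming these terms but finding their coefficients quickly on large data.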

3.3. The Composition of the Non-Iterative Supervised Learning Predictor Using Ito Decomposition

The proposed linear non-iterative prediction method combines the Ito decomposition (Kolmogorov–Gabor polynomial) with the SGTM neural-like structure [6]. According to the method, the input (independent) parameters are represented as members of this polynomial, and the SGTM neural-like structure is used to find the Kolmogorov–Gabor polynomial’s coefficients. The benefits of this scheme are fast training and reproducibility of the solution. Figure 3 shows the topology of the proposed non-iterative neural-like predictor, which contains two blocks [6].
The input data are converted in the first block (preprocessing). When a second-degree polynomial is chosen, the number of neurons in the input layer of the proposed method can be calculated by the following formula [6]:

$$m = n + \frac{n(n+1)}{2}, \qquad (2)$$

where n is the number of initial inputs from Table 2 (n = 11).
As a result of the fast, non-iterative training, the coefficients of the Kolmogorov–Gabor polynomial are calculated in the hidden layer of the second block of the proposed model (Figure 3). They are then used to solve the task [6]. An illustrative sketch of this two-block scheme is given below.
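Since the SGTM training algorithm itself is specified in [19,20] and no public implementation is assumed here, the following sketch substitutes ridge regression (which also has a closed-form, non-iterative solution) for the SGTM block; everything else mirrors the two-block topology of Figure 3:

```python
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Block 1: Ito (Kolmogorov-Gabor) expansion of the 11 prepared inputs to 77.
# Block 2: non-iterative linear coefficient search. Ridge is a stand-in for
# the SGTM neural-like structure, which is NOT reproduced here.
df = pd.get_dummies(pd.read_csv("insurance.csv"),
                    columns=["sex", "smoker", "region"])
X, y = df.drop(columns=["charges"]), df["charges"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                      Ridge(alpha=1.0))
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # R^2 on the held-out part
```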

4. Modelling and Results

The simulation of the proposed method was carried out using the authors’ own software (a console application). The main parameters of the computer on which the experiments were carried out are as follows: Intel® Core™ i5-6200U CPU, 2.40 GHz; 8 GB of memory.
The parameters of the proposed method (SGTM + Ito decomposition) are as follows: 77 neurons in the input and hidden layers, and one output. The second-degree Kolmogorov–Gabor polynomial was chosen for modeling. The mean absolute percentage error (MAPE) of the proposed method was 30.82%.
The mathematical basis for applying feedforward networks with one hidden layer to approximation tasks is the universal approximation theorem. According to the theorem, the best approximation accuracy is obtained with a large number of neurons in the hidden layer [27]. However, in this case, according to [28], there is a risk of overfitting.
A necessary measure of the proposed method’s quality is the ratio of the model’s complexity to the accuracy of its work [24,29]. The complexity of the model is influenced by two parameters, namely the degree of the Kolmogorov–Gabor polynomial (which is why the second-degree polynomial was chosen) and the number of neurons in the hidden layer of the linear SGTM neural-like structure. The experimental studies demonstrated that many of the polynomial’s coefficients, which are formed in the hidden layer, contribute very little to the final result, while their calculation greatly increases the method’s running time. That is why a study was conducted to determine the optimal complexity of the proposed model. The results of this experiment are listed in Appendix A, Table A1.
Figure 4a shows the dependence of the method’s accuracy (MAPE) on the number of neurons in the hidden layer over the interval of 25–50 neurons with a step of 5. As can be seen from Figure 4, the optimal configuration uses 35 neurons in the hidden layer. All other indicators in Appendix A, Table A1 lead to the same conclusion. Figure 4b confirms this result with respect to the duration of the training procedure.
The training procedure of the optimized version is much shorter than that of the proposed method without optimal parameter selection. In addition, reducing the number of neurons in the hidden layer from 77 to 35 made it possible to neutralize the effect of noise components. The topology of the optimized version of the method is shown in Figure 5. An illustrative sketch of such a complexity sweep is given below.
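The experiment behind Table A1 can be imitated, continuing the previous sketch and under the same stand-in assumptions (a closed-form linear model in place of SGTM, and TruncatedSVD in place of pruning hidden-layer neurons); this is only a rough analogue of the authors’ procedure:

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

def mape(y_true, y_pred):
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Sweep the number of retained components (a rough analogue of reducing the
# hidden layer from 77 to 35 neurons) and record the test MAPE, as in Table A1.
for k in range(5, 78, 5):
    model = make_pipeline(
        PolynomialFeatures(degree=2, include_bias=False),
        TruncatedSVD(n_components=k),
        LinearRegression(),
    )
    model.fit(X_train, y_train)
    print(k, round(mape(y_test.to_numpy(), model.predict(X_test)), 4))
```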
Table 3 provides quantitative indicators for evaluating the work of the developed method and its optimized version in terms of both training and testing modes according to the following indicators [30,31]:
  • Mean absolute percentage error (MAPE);
  • Sum square error (SSE);
  • Symmetric mean absolute percentage error (SMAPE);
  • Root mean square error (RMSE);
  • Mean absolute error (MAE).
As can be seen from the table, the optimal parameters selection of the proposed method (according to all five indicators) allowed the following:
  • to increase the generalization properties of the method (the difference between the MAPE indicators in the training and testing modes is 2.40% and 1.12%, respectively, for the developed and optimized methods);
  • to increase the accuracy of the optimized method by 1.34%.
In addition, the duration of the training procedure was reduced by 0.22 s. In the context of big data processing, all of the above are significant advantages. The standard definitions of the five indicators are sketched below.
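The exact formulas the authors used for the five indicators are not given in the paper, so the sketch below uses the common definitions (with SMAPE as a fraction, matching the scale of the reported values):

```python
import numpy as np

def regression_errors(y_true, y_pred):
    # Common definitions of the five indicators reported in Table 3.
    err = y_true - y_pred
    return {
        "MAPE": 100.0 * np.mean(np.abs(err / y_true)),
        "SSE": float(np.sum(err ** 2)),
        "SMAPE": float(np.mean(2.0 * np.abs(err) /
                               (np.abs(y_true) + np.abs(y_pred)))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
    }
```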

5. Comparison and Discussion

The results of the developed method (optimized version) were compared with those of known methods [6]; the comparison is shown in Figure 6.
Figure 6 shows the training and testing errors for all methods. As can be seen from the figure, the common SGTM neural-like structure provides the lowest error value for the regression task among all known methods. However, the use of the Ito decomposition further improves the accuracy of this method by factors of 1.5 and 1.3 in the training and testing modes, respectively. This is because the linear non-iterative SGTM neural-like structure provides an exact search for the coefficients of the Kolmogorov–Gabor polynomial. In addition, reducing the number of neurons in the hidden layer allows components (members of the polynomial) that do not affect the result to be discarded. In this way, an effective approximation procedure with high accuracy is carried out.
The duration of the training procedure plays an important role when computational intelligence methods are applied to the practical task of processing large data arrays. That is why this work compares the training duration of all considered methods. Figure 7 shows the results of this comparison.
As can be seen from Figure 7, the multilayer perceptron has the longest training time. The linear common SGTM neural-like structure shows one of the best results, inferior only to linear regression; however, the latter method demonstrates poor accuracy (Figure 6). The developed method trains about 10 times faster than the multilayer perceptron and less than 8 times slower than the common SGTM neural-like structure. Obviously, the running time of the developed method has increased, since the use of the Ito decomposition significantly increases the dimension of the input space in accordance with Equation (2). However, the developed method demonstrates the best results both in accuracy and in the generalization properties of the chosen computational intelligence instrument.
Figure 8 visualizes the results of all investigated methods in the form of scatter plots. Figure 8f confirms that the developed method is the most accurate among those considered.
This approach can also be used for solving various tasks in the service science area [32,33].

6. Conclusions

This paper describes a newly developed non-iterative computational intelligence tool for solving the regression task. It combines the Ito decomposition and the neural-like structure of the successive geometric transformations model. A simulation was conducted on the individual medical cost prediction task. The effectiveness of the proposed tool is confirmed by comparing its work with that of existing predictors: the developed predictor shows the best values of five indicators (MAPE, SSE, SMAPE, RMSE, and MAE) in both the training and testing modes. In addition, the training duration of the developed tool was compared with that of the existing ones; the results are satisfactory given the significant increase in the dimension of the input data (from 11 to 77 input characteristics).
Based on the above, we can distinguish the following advantages of the developed predictor:
  • a quick, non-iterative training procedure;
  • increased generalization properties;
  • a significant increase in prediction accuracy.
All these advantages give grounds to assert that the proposed instrument can be used for solving the regression task in various fields, under conditions of both large and small data samples.
Further work will focus on applying higher-order Ito decompositions. Such an approach should be effective if the primary inputs of the task are replaced with principal components and the principal components with small variance are discarded. The SGTM neural-like structure provides the fastest way to obtain values that are very close to the principal components of each input vector.

Author Contributions

Conceptualization, R.T.; methodology, I.I.; software, P.V.; validation, R.T. and O.P.; formal analysis, R.T.; investigation, I.I.; writing—original draft preparation, I.I.; writing—review and editing, R.T.; visualization, I.I. and N.L.; supervision, R.T.

Funding

This research received no external funding.

Acknowledgments

The authors thank the organizers of the DSMP’2018 conference for the opportunity to publish the article, as well as reviewers for the relevant comments that helped to better present the paper’s material.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Training and testing errors of the developed method’s results when changing the number of neurons in the hidden layer: Mean Absolute Percentage Error (MAPE), Sum Square Error (SSE), Symmetric Mean Absolute Percentage Error (SMAPE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE).
| Hidden-Layer Neurons | Training Time, s | Training MAPE | Training SSE | Training SMAPE | Training RMSE | Training MAE | Test MAPE | Test SSE | Test SMAPE | Test RMSE | Test MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 0.0328 | 98.0907 | 250,153.1643 | 0.2277 | 7647.4054 | 6017.2457 | 90.4530 | 122,496.0430 | 0.2134 | 7482.6368 | 5799.5606 |
| 10 | 0.0640 | 29.6955 | 199,827.0208 | 0.1463 | 6108.8903 | 3866.1687 | 30.6628 | 100,543.3525 | 0.1511 | 6141.6628 | 4033.6993 |
| 15 | 0.0774 | 36.0616 | 192,268.2087 | 0.1466 | 5877.8106 | 3873.7426 | 36.6546 | 96,848.9924 | 0.1486 | 5915.9938 | 3947.6894 |
| 20 | 0.0940 | 36.1080 | 191,597.2658 | 0.1466 | 5857.2993 | 3875.0551 | 36.3794 | 95,882.4545 | 0.1484 | 5856.9531 | 3937.6843 |
| 25 | 0.1231 | 37.2211 | 188,909.5383 | 0.1454 | 5775.1331 | 3842.2482 | 38.8044 | 95,367.0315 | 0.1483 | 5825.4686 | 3948.9449 |
| 30 | 0.1342 | 27.6548 | 156,156.1666 | 0.1085 | 4773.8333 | 2867.1614 | 29.0845 | 83,439.0789 | 0.1190 | 5096.8530 | 3185.9151 |
| 35 | 0.1476 | 27.0794 | 153,613.4998 | 0.1053 | 4696.1017 | 2783.1433 | 28.2033 | 82,337.1565 | 0.1149 | 5029.5423 | 3077.6321 |
| 40 | 0.1780 | 27.6964 | 152,671.6923 | 0.1053 | 4667.3098 | 2783.8583 | 30.0943 | 82,309.9318 | 0.1158 | 5027.8793 | 3103.5243 |
| 45 | 0.2000 | 27.8937 | 152,665.2744 | 0.1054 | 4667.1136 | 2786.0303 | 30.3035 | 82,387.0573 | 0.1161 | 5032.5905 | 3111.7831 |
| 50 | 0.2079 | 28.5096 | 152,568.5949 | 0.1054 | 4664.1580 | 2785.5579 | 30.6519 | 82,189.3916 | 0.1159 | 5020.5162 | 3107.0487 |
| 55 | 0.2719 | 28.5293 | 152,511.9680 | 0.1053 | 4662.4269 | 2783.5739 | 30.9454 | 82,394.3576 | 0.1162 | 5033.0364 | 3112.8836 |
| 60 | 0.2870 | 28.6878 | 152,374.2804 | 0.1053 | 4658.2177 | 2784.8075 | 30.6126 | 82,389.5643 | 0.1155 | 5032.7436 | 3092.6906 |
| 65 | 0.3159 | 28.6454 | 152,239.3691 | 0.1051 | 4654.0933 | 2779.0867 | 30.7230 | 82,638.1576 | 0.1158 | 5047.9289 | 3098.5853 |
| 70 | 0.3534 | 28.5060 | 151,938.2782 | 0.1049 | 4644.8887 | 2774.0156 | 30.9568 | 82,804.6760 | 0.1161 | 5058.1006 | 3105.5370 |
| 75 | 0.3537 | 28.3622 | 151,865.9138 | 0.1048 | 4642.6765 | 2769.1590 | 30.9699 | 82,795.9130 | 0.1160 | 5057.5653 | 3103.7871 |
| 77 | 0.3632 | 28.4187 | 151,771.1717 | 0.1047 | 4639.7801 | 2767.6075 | 30.8230 | 82,676.2620 | 0.1158 | 5050.2565 | 3099.7172 |

References

  1. Melnykova, N.; Shakhovska, N.; Sviridova, T. The personalized approach in a medical decentralized diagnostic and treatment. In Proceedings of the 14th International Conference: The Experience of Designing and Application of CAD Systems in Microelectronics (CADSM), Lviv, Ukraine, 21–25 February 2017; Lviv Publishing House: Lviv, Ukraine, 2017; pp. 295–297. [Google Scholar]
  2. Babichev, S.; Kornelyuk, A.; Lytvynenko, V.; Osypenko, V. Computational analysis of microarray gene expression profiles of lung cancer. Biopolym. Cell 2016, 32, 70–79. [Google Scholar] [CrossRef]
  3. Medical Cost Personal Datasets. Insurance Forecast by Using Linear Regression. Available online: https://www.kaggle.com/mirichoi0218/insurance/kernels (accessed on 20 September 2018).
  4. Bodyanskiy, Y.; Vynokurova, O.; Pliss, I.; Setlak, G.; Mulesa, P. Fast learning algorithm for deep evolving GMDH-SVM neural network in data stream mining tasks. In Proceedings of the 2016 IEEE First International Conference on Data Stream Mining & Processing (DSMP), Lviv, Ukraine, 21–25 August 2016; Lviv Publishing House: Lviv, Ukraine, 2016; pp. 257–262. [Google Scholar] [CrossRef]
  5. Veres, O.; Shakhovska, N. Elements of the formal model big date. In Proceedings of the XI International Conference on Perspective Technologies and Methods in MEMS Design (MEMSTECH), Lviv, Ukraine, 2–6 September 2015; Lviv Publishing House: Lviv, Ukraine, 2015; pp. 81–83. [Google Scholar] [CrossRef]
  6. Vitynskyi, P.; Tkachenko, R.; Izonin, I.; Kutucu, H. Hybridization of the SGTM Neural-like Structure through Inputs Polynomial Extension. In Proceedings of the 2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP), Lviv, Ukraine, 21–25 August 2018; Lviv Polytechnic Publishing House: Lviv, Ukraine, 2018; pp. 386–391. [Google Scholar]
  7. Setlak, G.; Bodyanskiy, Y.; Vynokurova, O.; Pliss, I. Deep evolving GMDH-SVM-neural network and its learning for Data Mining tasks. In Proceedings of the 2016 Federated Conference on Computer Science and Information Systems (FedCSIS), Gdansk, Poland, 11–14 September 2016; pp. 141–145. [Google Scholar]
  8. Hu, Z.; Bodyanskiy, Y.V.; Tyshchenko, O.K. A Cascade Deep Neuro-Fuzzy System for High-Dimensional Online Possibilistic Fuzzy Clustering. In Proceedings of the XI-th International Scientific and Technical Conference “Computer Science and Information Technologies” (CSIT 2016), Lviv, Ukraine, 6–10 September 2016; Lviv Polytechnic Publishing House: Lviv, Ukraine, 2016; pp. 119–122. [Google Scholar]
  9. Ganovska, B.; Molitoris, M.; Hosovsky, A.; Pitel, J.; Krolczyk, J.B. Design of the model for the on-line control of the AWJ technology based on neural networks. Indian J. Eng. Mater. Sci. 2016, 23, 279–287. [Google Scholar]
  10. Gajewski, J.; Valis, D. The determination of combustion engine condition and reliability using oil analysis by MLP and RBF neural networks. Tribol. Int. 2017, 115, 557–572. [Google Scholar] [CrossRef]
  11. Hu, Z.; Jotsov, V.; Jun, S.; Kochan, O.; Mykyichuk, M.; Kochan, R.; Sasiuk, T. Data science applications to improve accuracy of thermocouples. In Proceedings of the 2016 IEEE 8th International Conference on Intelligent Systems (IS), Sofia, Bulgaria, 4–6 September 2016; pp. 180–188. [Google Scholar] [CrossRef]
  12. Koprowski, R.; Lanza, M.; Irregolare, C. Corneal power evaluation after myopic corneal refractive surgery using artificial neural networks. BioMed. Eng. Online 2016, 15. [Google Scholar] [CrossRef] [PubMed]
  13. Glowacz, A. Acoustic based fault diagnosis of three-phase induction motor. Appl. Acoust. 2018, 137, 82–89. [Google Scholar] [CrossRef]
  14. Deng, L.; Yu, D. Deep Learning: Methods and Applications. Found. Trends Signal Process. 2014, 7, 1–199. [Google Scholar] [CrossRef]
  15. Bellazzia, R.; Zupan, B. Predictive data mining in clinical medicine: Current issues and guidelines. Int. J. Med. Inform. 2008, 77, 81–97. [Google Scholar] [CrossRef] [PubMed]
  16. Caruana, R. An empirical comparison of supervised learning algorithms. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; pp. 161–168. [Google Scholar]
  17. Medical Cost Personal Datasets. Available online: https://www.kaggle.com/mirichoi0218/insurance (accessed on 9 September 2018).
  18. Demsar, J.; Curk, T.; Erjavec, A.; Gorup, C.; Hocevar, T.; Milutinovic, M.; Mozina, M.; Polajnar, M.; Toplak, M.; Staric, A.; et al. Orange: Data Mining Toolbox in Python. J. Mach. Learn. Res. 2013, 14, 2349–2353. [Google Scholar]
  19. Tkachenko, R.; Izonin, I. Model and Principles for the Implementation of Neural-Like Structures based on Geometric Data Transformations. In Advances in Computer Science for Engineering and Education; Advances in Intelligent Systems and Computing; Hu, Z.B., Petoukhov, S., Eds.; Springer: Cham, Switzerland, 2018; Volume 754, pp. 578–587. [Google Scholar] [CrossRef]
  20. Tkachenko, R.; Tkachenko, P.; Izonin, I.; Tsymbal, Y. Learning-Based Image Scaling Using Neural-Like Structure of Geometric Transformation Paradigm. In Advances in Soft Computing and Machine Learning in Image Processing; Hassanien, A., Oliva, D., Eds.; Studies in Computational Intelligence; Springer: Cham, Switzerland, 2018; Volume 730, pp. 537–567. [Google Scholar] [CrossRef]
  21. Weierstrass, K. Über die analytische Darstellbarkeit sogenannter willkürlicher Funktionen einer reellen Veränderlichen; Sitzungsberichte der Akademie der Wissenschaften: Berlin, Germany, 1885; pp. 633–639. [Google Scholar]
  22. Iba, H.; Sato, T. Meta-level Strategy for Genetic Algorithms Based on Structured Representations. In Proceedings of the Second Pacific Rim International Conference on Artificial Intelligence, Seoul, Korea, 15–18 September 1992; pp. 548–554. [Google Scholar]
  23. Iba, H.; Sato, T.; Garis, H. System Identification Approach to Genetic Programming. In Proceedings of the First IEEE Conference on Evolutionary Computation, Orlando, FL, USA, 27–29 June 1994; Volume I, pp. 401–406. [Google Scholar]
  24. Ivakhnenko, A.G. Polynomial Theory of Complex Systems. IEEE Trans. Syst. Man Cybern. 1971, 1, 364–378. [Google Scholar] [CrossRef]
  25. Kargupta, H.; Smith, R.E. System Identification with Evolving Polynomial Networks. In Proceedings of the Fourth International Conference on Genetic Algorithms, San Mateo, CA, USA, 14–17 September 1991; pp. 370–376. [Google Scholar]
  26. Nikolaev, N.I.; Iba, H. Accelerated Genetic Programming of Polynomials. Genet. Program. Evol. Mach. 2001, 2, 231–257. [Google Scholar] [CrossRef]
  27. Haykin, S. Neural Networks and Learning Machines, 3rd ed.; Pearson Education, Inc.: London, UK, 2009; p. 904. [Google Scholar]
  28. Ng, A. Machine Learning. Lecture 7.1 Regularization. The Problem of Overfitting. Available online: https://www.youtube.com/watch?v=u73PU6Qwl1I (accessed on 9 September 2018).
  29. Korobchynskyi, M.V.; Chyrun, L.B.; Vysotska, V.A.; Nych, M.O. Matches prognostication features and perspectives in cybersport. Radio Electron. Comput. Sci. Control 2017, 95–105. [Google Scholar] [CrossRef]
  30. Botchkarev, A. Performance Metrics (Error Measures) in Machine Learning Regression, Forecasting and Prognostics: Properties and Typology. 2018. Available online: https://arxiv.org/abs/1809.03006 (accessed on 9 September 2018).
  31. Chen, C.; Twycross, J.; Garibaldi, J.M. A new accuracy measure based on bounded relative error for time series forecasting. PLoS ONE 2017, 12, 54–67. [Google Scholar] [CrossRef] [PubMed]
  32. Molnár, E.; Molnár, R.; Kryvinska, N.; Greguš, M. Web Intelligence in practice. J. Serv. Sci. Res. 2017, 6, 149–172. [Google Scholar] [CrossRef]
  33. Kryvinska, N. Building Consistent Formal Specification for the Service Enterprise Agility Foundation. J. Serv. Sci. Res. 2012, 4, 235–269. [Google Scholar] [CrossRef]
Figure 1. Dataset visualization using Orange Software, version 3.13.0. The x-axis represents the insurance contractor’s age, and the y-axis shows the size of the medical insurance costs. Circles mark women and crosses mark men. Blue circles mark non-smoking women, and red circles mark smoking women; blue crosses mark smoking men, and red crosses mark non-smoking men. The size of each marker reflects the value of the body mass index: the larger the index, the larger the corresponding marker. Shapes and colors were chosen arbitrarily.
Figure 2. Topology of the common linear neural-like structure of the successive geometric transformations model (SGTM).
Figure 3. Topology of the proposed model (combining the use of the Ito decomposition and SGTM linear neural-like structure).
Figure 4. Optimal parameters identification for training and test modes: (a) mean absolute percentage error (MAPE) value according to changing the hidden layer’s neurons number, (b) MAPE value according to changing the training time.
Figure 5. Topology of the proposed optimized model.
Figure 6. Mean absolute error (MAE) in training and testing modes for developed and existing methods. On the x-axis, the mean absolute error value for all considered methods is illustrated.
Figure 7. The training time for all methods.
Figure 8. Visualization of the methods’ results. On the x-axis, the real values of insurance medical costs are given, and on the y-axis are values obtained by one of the methods: (a) support vector regression with RBF kernel, (b) linear regression, (c) adaptive boosting, (d) multilayer perceptron, (e) linear common SGTM neural-like structure, or (f) proposed method.
Table 1. Original dataset.

| # | Insurance Contractor Age | Insurance Contractor Gender | Body Mass Index, kg/m² | Number of Dependents | Smoking | Beneficiary’s Residential Area in the United States | Individual Insurance Costs |
|---|---|---|---|---|---|---|---|
| 1 | 19 | female | 27.90 | 0 | yes | southwest | 16,884.924 |
| 2 | 18 | male | 33.77 | 1 | no | southeast | 1725.5523 |
| 3 | 28 | male | 33 | 3 | no | southeast | 4449.462 |
| 4 | 33 | male | 22.705 | 0 | no | northwest | 21,984.4706 |
| 5 | 32 | male | 28.88 | 0 | no | northwest | 3866.8552 |
| 6 | 31 | female | 25.74 | 0 | no | northeast | 3756.6216 |
| i | 60 | female | 27.90 | 0 | yes | southwest | 16,884.924 |
| n | 61 | female | 29.07 | 0 | yes | northwest | 29,141.3603 |
Table 2. Prepared dataset. F—female; M—male; BMI—body mass index; IIC—individual insurance costs.

| # | Age | F | M | BMI, kg/m² | Children | Smoker | Non-Smoker | Area 1 | Area 2 | Area 3 | Area 4 | IIC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 19 | 1 | 0 | 27.90 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 16,884.924 |
| 2 | 18 | 0 | 1 | 33.77 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1725.5523 |
| 3 | 28 | 0 | 1 | 33 | 3 | 0 | 1 | 0 | 1 | 0 | 0 | 4449.462 |
| 4 | 33 | 0 | 1 | 22.705 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 21,984.471 |
| 5 | 32 | 0 | 1 | 28.88 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 3866.8552 |
| 6 | 31 | 1 | 0 | 25.74 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 3756.6216 |
| i | 60 | 1 | 0 | 27.90 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 16,884.924 |
| n | 61 | 1 | 0 | 29.07 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 29,141.360 |
Table 3. Modeling results based on mean absolute percentage error (MAPE), sum square error (SSE), symmetric mean absolute percentage error (SMAPE), root mean square error (RMSE), and mean absolute error (MAE) in the training and test modes. SGTM—successive geometric transformations model.

| Method/Indicator | MAPE | SSE | SMAPE | RMSE | MAE |
|---|---|---|---|---|---|
| Training errors | | | | | |
| Proposed model (SGTM + Ito decomposition) | 28.4187 | 151,771 | 0.1047 | 4639.78 | 2767.61 |
| Proposed model with optimal parameters | 27.0794 | 153,613 | 0.10531 | 4696.1 | 2783.14 |
| Test errors | | | | | |
| Proposed model (SGTM + Ito decomposition) | 30.823 | 82,676 | 0.1158 | 5050.3 | 3099.7 |
| Proposed model with optimal parameters | 28.203 | 82,337 | 0.1149 | 5029.5 | 3077.6 |
