A Neural Network Analysis of Treatment Quality and Efficiency of Hospitals

Objectives: Due to escalating healthcare expenditure and the growing number of hospitalizations, it is becoming increasingly important for healthcare organizations to evaluate the cost and improve the quality and efficiency of treatment. Method: We deploy neural networks to examine the strategic association between hospitalization experience and treatment results. The healthcare data for the years 2009-2012 are downloaded from the Statewide Planning and Research Cooperative System (SPARCS) of the New York State Department of Health (NYSDOH). We operationalize the hospitalization experience using the indicators facility ID, procedure description, type of admission, patient disposition upon discharge, APR severity of illness, source of payment, and age group; and the treatment result using the indicators hospital length of stay and APR risk of mortality. Results: Our findings show that there are significant differences in length of stay and mortality rates depending on the treatment procedure. Treatment result shows a strong association with procedure and with the patients’ disposition upon discharge. Interestingly, under similar health conditions, patients who are under the public healthcare system tend to have longer hospital stays than others. Conclusions: We offer a portfolio of factors to be considered in evaluating patient health outcomes from hospitalization. We emphasize the need for efficient utilization of investment in healthcare, be it public or private.


Introduction
Due to escalating healthcare expenditure and the growing number of hospitalizations, it is becoming increasingly important for healthcare organizations to evaluate and improve the quality and efficiency of treatment. The past several years have seen a great demand for affordable and high-quality health care services in the United States.
In this empirical study, we use neural networks to examine the correlation between hospitalization composition (variables in the hospitalization experience) and treatment result (variables that indicate the outcome of hospital treatment in terms of hospital length of stay and risk of mortality). Using inpatient discharge records from New York State hospitals, we examine and highlight significant factors for efficient treatment outcomes. We identify factors that specifically reduce the hospital length of stay and the mortality rates for patients [1], as this is the first step toward improving quality of care. Our results offer a portfolio of factors that can be evaluated for making informed healthcare decisions in the selection of facility and treatment.
In our analysis, we investigate whether there is an association between hospitalization and treatment result. We also identify specific hospitalization variables that are significantly associated with treatment result.
The rest of the paper is organized as follows. Section 2 gives a background of the research, followed by section 3 containing the methodology. Section 4 offers a discussion of the analyses and results. Section 5 contains the scope and limitations of the research, followed by section 6 with the conclusions and policy implications. Section 7 offers directions for future research.

Healthcare cost and quality
Health care provision is aimed at the dual objectives of quality care and low cost care [2]. There are widespread efforts nationwide to systematically evaluate and report on hospital quality of care. In the United States, even within the same metropolitan area, there may be a wide disparity in the prices charged for the same procedure. Oftentimes a patient ends up incurring higher hospital costs for treatment despite the availability of less expensive options in the vicinity. This is due to the lack of information transparency in hospital pricing. The situation has, however, changed since 2013, after President Obama introduced the healthcare reform that mandated hospital prices for common procedures to be made publicly available [3]. The transparency has given patients the ability to explore various cost options in healthcare and to make informed decisions accordingly.
In addition to the disparity in the prices for the same procedure, there is disparity in the correlation between the cost and quality of care for various health conditions. While some conditions such as congestive heart failure show a positive correlation between hospital cost and quality of care, others such as pneumonia do not [4]. Length of stay is an important indicator of hospital cost. Research on the association between length of stay and cost of hospital admission reveals that the incremental cost of the last day of hospital stay is 2.4% of the total cost of stay. Studies show that reducing the length of hospital stay by one day can reduce the total cost of care on average by about 3% in some cases [5]. For some common diagnoses such as acute myocardial infarction and pneumonia, the risk-adjusted mean hospital length of stay decreased by 2% [6]. Despite the criticality of length of stay in overall cost, healthcare administration should focus not only on the length of stay but also on quality improvements in the overall delivery of care. This is especially important in the early stages of patient admission, when resource consumption is generally heavy. Patient satisfaction with service is an important measure for hospitals to see if the quality of service leads to improved ratings by patients. Studies on patient perception of case manager performance for acute cardiovascular conditions (in acute care centers) have shown a positive association with patient satisfaction and a negative association with the risk for future acute care usage [7].

Neural networks and healthcare
Health analytics is used extensively in healthcare for various purposes, including reducing readmission rates, reducing the length of stay, and improving patient satisfaction with care [8]. Health analytics improves the overall quality of patient care [9] by exploring clinical outcomes [10] and risk tolerances. The Southeast Texas Medical Associates (SETMA), by applying health analytics to its data, succeeded in reducing hospital readmission rates by 22% in just six months [11].
Neural networks, as health analytic technologies, have been successfully deployed in the healthcare domain. Because of their ability to perform input-output mapping of data without a priori knowledge of the distribution patterns of the data, they are appropriate for applications that deal with large volumes of data and with fuzzy or noisy data. Other important characteristics of neural networks include learning from experience, generalizing from previous examples, and abstracting essential characteristics from inputs containing irrelevant data [12].
Neural networks are typically used for classification and pattern recognition applications of electrocardiography, electromyography, therapeutic and drug monitoring, simulations of medical devices, and analysis of temporal patterns of physiological parameters [12]. Other applications include the use of neural networks in modeling patient arrivals at the emergency department and studying the variables directly associated with patient arrivals [13], diagnosing diabetes on small mobile devices [14], evaluating service quality dimensions as antecedents to patient satisfaction [15], diagnosing Down syndrome in unborn babies (1994), and predicting heart diseases [16].

The neural network model
A neural network consists of a series of processing elements, named neurons, that are interconnected via synapses. The neurons are set in layers to form a network. Each neuron gets data from the surrounding neurons, performs computations on the data, and passes the results on to the other neurons [15]. Each of the connections between neurons has an associated weight.
A neural network that consists of three kinds of connected layers (an input layer, one or more intermediate or hidden layers, and an output layer) is referred to as a Multilayer Perceptron (MLP). We use a multilayer feedforward network in the current research. In such a network, the information signals move in a forward direction through the network. The number of neurons at the input layer is guided by the number of independent variables, while the number of neurons at the output layer corresponds to the number of values that need to be predicted. However, there are no widely accepted rules for determining the optimal number of hidden units. If there are fewer than the optimal number of hidden units, the network will not be able to learn the input-output mapping. If there are too many hidden units, the network will generalize poorly on unseen data. Although most problem solving approaches adopt a two-layer approach for weights, the determination of an optimum network configuration often involves a trial and error approach [12].
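The forward movement of signals through a multilayer perceptron can be illustrated with a minimal sketch. It is not the SPSS implementation used in this study: the sigmoid and softmax activations, the random initial weights, and the simplification of one numeric input per hospitalization indicator are all assumptions made for illustration. The hidden layer sizes of 15 and 25 mirror the configuration adopted later in the paper, and the three outputs stand for three length of stay groups.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Turn raw output scores into probabilities that sum to 1."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, layers):
    """Propagate one input vector forward: sigmoid on hidden layers, softmax on the output layer."""
    activation = x
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        activation = softmax(z) if is_output else [sigmoid(v) for v in z]
    return activation

rng = random.Random(0)
# Assumed sizes: 7 inputs (one per indicator, a simplification),
# hidden layers of 15 and 25 nodes, 3 output classes.
sizes = [7, 15, 25, 3]
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
probs = forward([0.1] * 7, layers)  # one probability per output class
```

With untrained weights the three probabilities carry no meaning; the sketch only shows how layer sizes follow from the number of inputs and outputs, while the hidden sizes remain a free design choice.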
The training data are presented to the network as a data set of input and output values. The neural network is trained to assign appropriate weights to the connections in order to yield the outputs. The back propagation algorithm is the most commonly used method of training [17,18]. The training works by presenting sets of inputs to the network, having the network determine the weights between connections, and having the network calculate the outputs. These calculated outputs are then compared to known values to determine the accuracy of the network prediction. Error signals are created from the comparison. Then, through the back propagation process, these errors are propagated backward through the layers and the network weights are updated appropriately. Training runs are repeated until the calculated outputs get close to the actual outputs. Through the training, therefore, the network learns to adjust the weights in a feedforward, back propagation style so as to successively minimize the difference between the actual and the predicted outputs. In this study, a Multilayer Perceptron (MLP) feedforward network architecture was used and trained with the error back propagation algorithm.
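The training cycle described above (forward pass, comparison with known values, backward propagation of error signals, weight update) can be sketched as follows. This is a toy illustration under stated assumptions, not the procedure run in SPSS: it uses a single hidden layer, sigmoid units, a squared-error measure, per-record weight updates, and a four-record logical OR data set chosen only because it is small. The function name `train` and its parameters are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, n_hidden=3, epochs=2000, lr=0.5, seed=0):
    """Train a one-hidden-layer network with error back propagation.

    Returns the hidden weights, output weights, and the summed squared error per epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # The last entry of each weight row acts as the bias weight.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    errors = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:
            # Forward pass: compute hidden activations and the network output.
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
            hb = h + [1.0]
            y = sigmoid(sum(w * v for w, v in zip(w_o, hb)))
            total += 0.5 * (target - y) ** 2
            # Backward pass: error signal at the output, then at each hidden unit.
            delta_o = (y - target) * y * (1.0 - y)
            delta_h = [delta_o * w_o[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            # Weight update: move each weight against its error gradient.
            w_o = [w - lr * delta_o * v for w, v in zip(w_o, hb)]
            for j in range(n_hidden):
                w_h[j] = [w - lr * delta_h[j] * v for w, v in zip(w_h[j], xb)]
        errors.append(total)
    return w_h, w_o, errors

# Toy data (an assumption for illustration): logical OR of two inputs.
or_data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]
w_h, w_o, errors = train(or_data)
```

Comparing `errors[0]` with `errors[-1]` shows the summed error shrinking as training runs are repeated, which is the behavior the paragraph describes.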

Data collection
The health data for the study were extracted from the Statewide Planning and Research Cooperative System (SPARCS) of the New York State Department of Health (NYSDOH). We extracted 200,000 patient records for the period 2009 to 2012. Data from such a large dataset are naturally characterized by a degree of incompleteness and fuzziness. Neural networks, because of their ability to learn from the data and to generalize and respond to expected inputs, are appropriate for this problem domain characterized by fuzzy and incomplete data.
The data include hospitalization indicators such as facility ID, procedure description, type of admission, patient disposition upon discharge, APR severity of illness, source of payment, and age group. The indicators for treatment result include length of stay and APR risk of mortality (Table 1).
Among the grouping categories for length of stay (Table 2), the highest classification accuracy of 90% was achieved with grouping category 3 (lengths of stay of 1-4 and >4 days) (Figure 3). However, since this grouping category included only two ranges of length of stay, we felt it could cause ambiguity in the implications. We therefore decided to adopt the more specific grouping category 2 (lengths of stay of 1-3, 4-9, and >9 days), with an accuracy of 82.6%, for further experimentation (Figure 4).
The tests for hidden layer 1 with 5 nodes (Figure 5) and with 20 nodes (Figure 6) are shown. Since both these results are lower than in previous experiments, we decided to use the default number of hidden layers for further experimentation. We tested 2 hidden layers, with layer 1 having 15 nodes and layer 2 having 25 nodes. The classification accuracy is 82.6% (Figure 7). The accuracy of each grouping category for length of stay changed. The overall accuracy for long-term stay increased, which is critical to the whole model. We therefore decided to adopt this model with the configuration of 2 hidden layers (layer 1 having 15 nodes and layer 2 having 25 nodes). We used a partitioning rate of 55% of the data set for training and 45% for testing. With this, the accuracy increased by 0.1% to 82.8%. Moreover, the importance of the predictor variables changed as well (Figure 8).
We then used the 70-30 partitioning, with 70% of the data set for training and 30% for testing. Using this partitioning, the accuracy decreased by 0.4% to 82.2%. The patient disposition variable became the most important predictor in addition to Facility ID (Figure 9). Therefore, the partitioning rate of 55-45 was adopted for the final model.
We then used the Auto Classifier model, which chooses relatively rational algorithms, to examine the results generated by the neural net. The comparative performance of the models is shown in Figures 11-13. Each algorithm has a corresponding bar chart, which functions like the classification panel and displays the cross-classification of observed versus predicted values. As seen in Figure 11, the neural net model predicted with the highest classification accuracy (82.547%). Figure 14 shows the length of stay for grouping category 2 (with lengths of stay of 1-3, 4-9, and >9 days) for various facilities. We can see that most of the patients in Helen Hayes Hospital (Facility ID: 775) tend to have a length of stay between 4 and 9 days. We also see that common procedures require a short length of stay of 1-3 days. As Figure 15 shows, Perc Translum Cor Angio, the most common procedure, requires less than 4 days of hospital stay in over 95% of the cases. Knowledge of this allows patients to be prepared for treatment at a particular facility.
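The cross-classification of observed versus predicted values that underlies the reported accuracy figures can be illustrated with a minimal sketch. The six observed and predicted labels below are invented solely for illustration and are not results from this study; the function names are hypothetical.

```python
from collections import Counter

def cross_classify(observed, predicted):
    """Count how often each (observed, predicted) pair of labels occurs."""
    return Counter(zip(observed, predicted))

def overall_accuracy(table):
    """Share of all records whose predicted label equals the observed label."""
    total = sum(table.values())
    correct = sum(n for (obs, pred), n in table.items() if obs == pred)
    return correct / total

def per_class_accuracy(table):
    """Share of correctly predicted records within each observed class."""
    totals = Counter()
    correct = Counter()
    for (obs, pred), n in table.items():
        totals[obs] += n
        if obs == pred:
            correct[obs] += n
    return {label: correct[label] / totals[label] for label in totals}

# Toy labels (not study data), using the grouping category 2 ranges.
observed = ["1-3", "1-3", "4-9", "4-9", ">9", ">9"]
predicted = ["1-3", "1-3", "4-9", "1-3", ">9", "4-9"]
table = cross_classify(observed, predicted)
overall = overall_accuracy(table)        # 4 of 6 records classified correctly
by_class = per_class_accuracy(table)     # accuracy within each length of stay group
```

Reporting accuracy per class alongside the overall figure matters here because, as noted above, the accuracy for the long-term stay group is considered critical to the whole model.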

Scope and limitations
Our research does have some limitations. First, our study covers the time period 2009-2012, while other studies could potentially cover a larger time span. Second, the data are extracted at a state level (New York) and cover only coronary atherosclerosis disease, which limits the generalizability of the results. Future studies may be conducted at a more comprehensive country or global level with more extensive coverage of health conditions. Third, it is possible that there are other intervening variables that may help better explain the phenomenon of hospitalization and treatment. We looked for strategic associations between the indicators of hospitalization and treatment results (Table 1).

Selection of analytic tool
Our data analytic tool was IBM's SPSS Neural Networks (formerly known as PASW Neural Networks) with its association rules.
We propose the following in our research:
There are significant differences in length of stay and mortality rates depending on the treatment procedure.
Under similar health conditions, patients who are under public health care (source of payment) tend to stay longer in hospitals than patients in other forms of care.
Treatment result is strongly associated with type of procedure and patient disposition upon discharge.

Analysis and Results
We used the SPSS Neural Network and the Auto Classifier Model to analyze the dataset. We describe our analyses in two phases: model building and training, and testing.

Model building and training
A neural network was chosen since it works best with noisy and fuzzy data. Two models are used in this study: Neural Network and Auto Classifier Model. Independent variables are selected according to the priority assigned to the variables. The dataset was partitioned using iterations of 55-45%, 60-40%, and 70-30% for training and testing, with an auto-set number of hidden layers and nodes. Since we have a comparatively large dataset, we could adopt the strictest partition rate for the neural net. If the model functions well under such conditions, it illustrates that the association is explicit and solid. The neural network builds the model by first learning from the potential correlation between the independent (hospitalization) and dependent (treatment result) variables. It then validates the model results by comparing the predicted values with the actual values.
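The partitioning step can be sketched as follows. This is an illustrative assumption about how a random training and testing split might be produced, not a description of the SPSS partition node; the function name `partition`, the fixed seed, and the stand-in records are hypothetical.

```python
import random

def partition(records, train_fraction, seed=0):
    """Shuffle the records and split them into training and testing partitions."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in for patient records (assumption): 200 integer identifiers.
records = list(range(200))

# The three partition rates described in the text: 55-45, 60-40, and 70-30.
splits = {fraction: partition(records, fraction) for fraction in (0.55, 0.60, 0.70)}
train_55, test_55 = splits[0.55]
```

A smaller training share leaves the model less data to learn from and more unseen data to be judged on, which is why the 55-45 split is treated as the strictest condition.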

Testing
Test partitions of 45%, 40%, and 30% of the data set were adopted to represent strict, moderate, and loose conditions, respectively, for testing the trained model's predictions. The Auto Classifier Model was used to explore possible classification models other than Neural Network for similar predictions using different approaches. The aggregate results are compared to determine the best approach.
We set the treatment result variables of Length of Stay and APR Risk of Mortality as the target/dependent (output) variables, and all other hospitalization variables as the predictor/independent (input) variables. Neural network models were run separately for each dependent variable.
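The arrangement of predictor and target variables can be sketched as below. The one-hot (0/1) coding of categorical predictors is an assumption made for illustration; the paper does not state how the software encodes its inputs. The field names and the three records are invented and are not drawn from the SPARCS data.

```python
def one_hot_encoder(records, fields):
    """Build an encoder mapping a record to a 0/1 vector over the observed values of `fields`."""
    vocab = [(f, v) for f in fields for v in sorted({r[f] for r in records})]

    def encode(record):
        return [1.0 if record[f] == v else 0.0 for f, v in vocab]

    return encode, vocab

# Invented toy records with hypothetical field names (not study data).
records = [
    {"type_of_admission": "Emergency", "age_group": "50 to 69",
     "length_of_stay_group": "1-3", "apr_risk_of_mortality": "Minor"},
    {"type_of_admission": "Elective", "age_group": "70 or Older",
     "length_of_stay_group": "4-9", "apr_risk_of_mortality": "Moderate"},
    {"type_of_admission": "Emergency", "age_group": "70 or Older",
     "length_of_stay_group": ">9", "apr_risk_of_mortality": "Major"},
]
predictors = ["type_of_admission", "age_group"]
targets = ["length_of_stay_group", "apr_risk_of_mortality"]

encode, vocab = one_hot_encoder(records, predictors)
X = [encode(r) for r in records]

# One (inputs, outputs) data set per dependent variable, so that a separate
# model can be trained for each target.
datasets = {t: (X, [r[t] for r in records]) for t in targets}
```

Keeping one data set per target reflects the choice to run the models separately for each dependent variable rather than predicting both outcomes jointly.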
The six most important predictors for each of the dependent variables were chosen to run the models again. The overall accuracy with length of stay was only 44.8% (Figure 1). The overall accuracy with APR Risk of Mortality was 75.8% (Figure 2), which is far below the acceptable standard for a significant prediction.
In order to improve the accuracy rate, we standardized and grouped the length of stay in several ways (Table 2).
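The two transformations of length of stay can be sketched as follows. The group boundaries for grouping categories 2 and 3 are those reported in this paper (1-3, 4-9, and >9 days; 1-4 and >4 days); the remaining contents of Table 2 are not reproduced here. Treating "standardized" as a z-score is an assumption, and the function names are hypothetical.

```python
import statistics

def group_category_2(days):
    """Grouping category 2: lengths of stay of 1-3, 4-9, and >9 days."""
    if days <= 3:
        return "1-3"
    if days <= 9:
        return "4-9"
    return ">9"

def group_category_3(days):
    """Grouping category 3: lengths of stay of 1-4 and >4 days."""
    return "1-4" if days <= 4 else ">4"

def standardize(values):
    """Z-score standardization (assumed): subtract the mean, divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Toy lengths of stay in days (not study data).
stays = [1, 3, 4, 9, 10, 30]
grouped_2 = [group_category_2(d) for d in stays]
grouped_3 = [group_category_3(d) for d in stays]
z_scores = standardize(stays)
```

Grouping turns a wide numeric range into a handful of classes, which makes the target easier to classify but coarser to interpret; this is the trade-off weighed when choosing between the two-range and three-range categories.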
Using standardized length of stay did not improve accuracy at all. However, using the grouped categories increased the accuracy dramatically. It is also possible for the data to be skewed, which may impact the analyses and results.

Conclusions and policy implications
Our results show that there are significant differences in length of stay and mortality rates depending on the treatment procedure. The findings also indicate that treatment result shows a strong association with procedure and with patients' disposition upon discharge. The length of stay and risk of mortality are significantly lowered if patients are treated in more qualified facilities and admitted in less severe conditions. An interesting revelation is that under similar health conditions, patients who are under the public healthcare system tend to have longer lengths of stay than others.
Our results offer the basis for having sound metrics for evaluating medical facilities, keeping in mind the wellbeing of the patients. They also offer a portfolio of factors to be considered in evaluating patient health outcomes from hospitalization. The discrepancy in the length of stay between public healthcare and others suggests the need for effective utilization of government investment in healthcare. Through proper identification and analysis of factors associated with abnormal treatment indicators (such as abnormal length of stay or high risk of mortality), we offer suggestions to healthcare entities to adopt an appropriate protocol for hospitalization. Exercising caution in selecting procedures that require longer hospital stays would not only help manage and improve hospital resource allocation but would also enhance patient affordability. Also, unusually long or short lengths of stay may be indicative of exaggerated charges or careless treatment, respectively.

Future Research
Pursuit of high-quality, affordable health care services entails ... More potential for analyses exists by considering the therapeutic-oriented causes for hospitalization as additional grouping criteria for length of stay. Therapeutic recreation is a technique that helps patients with medical conditions, mental conditions, physical challenges, and developmental disabilities engage in activities that integrate them into the community while at the same time providing motivation for treatment.
Future studies can investigate causality in addition to association, encompass a longer time span, and deploy more sophisticated techniques, such as panel analysis, in explaining the role of hospital charges in the overall phenomenon of healthcare expenditure. Also, future studies can be done at a national or global level, and can incorporate the cultural and/or social factors that impact healthcare.

Figure 10: Results of Training and Testing for the Best Model.

Figure 11: Result of Auto Classifier Model.

Figure 12: Result of Bayesian Network in Auto Classifier Model.

Figure 13: Result of Decision Tree in Auto Classifier Model.

Table 1: Variables Used in the Research.
other than acute care level (excluding leave of absence days); (Discharge Date - Admission Date) + 1. Length of Stay greater than or equal to 120 days has been aggregated to 120+ days; values range from 1 to 200.
APR Risk of Mortality: All Patient Refined Risk of Mortality: 1-Minor; 2-Moderate; 3-Major; 4-Extreme.
Volume 6 • Issue 6 • 1000209 J Health Med Inform ISSN: 2157-7420 JHMI, an open access journal

Table 2: Grouping Categories for Length of Stay.