Inpatient Flow Distribution Patterns at Shanghai Hospitals.

Empirical studies based on patient flow data are needed to provide more materials to summarize the general pattern of patient distribution models. This study takes Shanghai as an example and tries to demonstrate the inpatient flow distribution model for different levels and specialties of medical institutions. Power, negative exponential, Gaussian, and log-logistic models were used to fit the distributions of inpatients, and a model of inpatient distribution patterns in Shanghai was derived, based on these four models. Then, the adjusted coefficient of determination (R2) and Akaike information criterion (AIC) values were used to assess the model fitting effect. The log-logistic function model has a good simulation effect and the strongest applicability in most hospitals. The estimated value of the distance-decay parameter β in the log-logistic function model is 1.67 for all patients, 1.89 for regional hospital inpatients, 1.40 for tertiary hospital inpatients, 1.64 for traditional Chinese medicine hospital inpatients, and 0.85 for mental hospital inpatients. However, the simulations at the tumor, children’s and maternity hospitals, were not satisfactory. Based on the results of empirical analysis, the four attenuation coefficient models are valid in Shanghai, and the log-logistic model of the inpatient distributions at most hospitals have good simulation effects. However, further in-depth analysis combined with the characteristics of specific specialties is needed to obtain the inpatient model in line with the characteristics of these specialties.


Shanghai Medical Service System
As one of the largest cities in China and the medical center of eastern China, Shanghai has the highest level of medical technology in China, with relatively abundant medical resources. There are approximately 5.74 beds and 3.0 doctors for every thousand Shanghai residents [1]; both statistics are highly ranked among all the cities in China. In Shanghai, medical institutions are divided into three categories: primary medical institutions, regional medical institutions, and tertiary medical institutions. Primary medical institutions mainly provide basic medical services, which are limited to neighboring residents. Regional medical institutions mainly provide outpatient and inpatient services for common and frequently occurring diseases. Tertiary medical institutions mainly provide outpatient and inpatient services for difficult diseases and treat patients, including, but not limited to, Shanghai residents [2]. Most medical institutions, especially regional medical institutions and tertiary medical institutions, are public. The government has strict restrictions on the pricing of medical service. The loss of public hospitals are paid by the financial department of the government. All public medical institutions can enjoy medical insurance policy. Currently, a main problem of the medical service system in Shanghai is that the total amount of medical resources is relatively insufficient and the structure of medical resources utilization is unreasonable. In 2018, Shanghai's total per capita health spending was about $1232. China's total per capita health expenditure is about $688; in the USA it is $10,586, and Australia $5395 [3]. Compared with the United States and other western developed countries, the per capita health expenditure of Shanghai still lags behind, to some extent. From the composition of health expenses, in 2018, the total health expenditure in Shanghai reached $5.7 trillion dollars, of which the health expenses in hospitals accounted for 78% and the health expenses in primary medical institutions accounted for about 15%. Among the hospital health expenses, tertiary hospital account for 71%, secondary hospital account for 28%, and primary medical institutions account for about 1% [1]. At present, a new round of medical reform has been developed. The government currently advocates graded diagnosis and treatment [4], which means encouraging residents to choose appropriate medical institutions according to disease severity, community hospitals for minor diseases, and tertiary medical institutions for major diseases to improve the utilization efficiency of medical resources at all levels. Local hospitalization has been recommended by many countries because it reduces travel and medical cost for patients, balances the supply and demand of healthcare services, maintains local hospital revenue, and improves patient-doctor relationships overall [5]. The study on the flow of patients in this paper is helpful for understanding the medical behavior habits of the population and provides a reference for adjusting medical behavior patterns.

Quantitative Model for Descripting Patient Flow
The gravity model can be used to analyze and predict spatial flows. It holds that phenomena are interrelated and restricted in geographic space. According to the distance attenuation law, if geographical phenomena interact with each other, the action decreases with increasing distance. In the area of health, understanding patient travel behaviors for seeking hospital care is fundamental for analyzing the healthcare market and planning resource allocation [6]. However, patient flow analysis is complex and requires actual traffic flow data, which are difficult to obtain. The key parameters, such as the distance attenuation coefficient, usually vary from temporally and spatially. There are relatively few empirical studies on the gravity model for patient populations [7]. Therefore, empirical studies based on patient-flow data in increasingly more places are needed to provide more references for other similar areas, and more materials to summarize the general pattern of patient-distribution models [8,9]. In addition, some studies have begun to focus on the heterogeneity of medical services [10]. One study examined the differences in medical treatment patterns among different ages, genders, and ethnic groups and determined the specific distance decay parameter on hospital inpatient visits among subpopulations [6,11]. Factors, such as age and gender, are not direct factors influencing the choice of medical treatment. The direct reasons influencing the choice of medical treatment are disease type and severity, while factors, such as age and gender, are the reasons influencing disease type and severity [12]. Therefore, considering individual disease, it is more appropriate to study the differences in medical treatment patterns based on the classification of diseases. Considering disease severity, it is more appropriate to study differences in medical treatment patterns based on the institutional classification. This study takes Shanghai inpatients as an example and attempts to demonstrate the inpatient flow distribution model of different levels and specialties of medical institutions in Shanghai, as well as provide empirical results for reference.

Population Data
In 2016, Shanghai's resident population was approximately 24 million people. The resident population of each residential building in Shanghai was calculated from the Public Security Department of Shanghai. By locating the residential buildings, the population distribution can be determined [13].

Inpatient Flow Data
The data of inpatients in Shanghai were collected from the first page of the inpatient medical record from each medical institution in Shanghai in 2016. This page records the information of each inpatient, including the patient's residential address, medical institution, major disease diagnoses, and treatment(s). In 2016, the total number of inpatients in all medical institutions in Shanghai was approximately 3 million people, including approximately 2.5 million local residents and 0.5 million non-local residents. The data were from the Shanghai Health Bureau. Considering the limitations of the study area, this paper only used the data of the local residents in Shanghai when calculating the distribution pattern of inpatient flow in Shanghai.

Hospital Information
The information of medical institutions came from the Shanghai Health Bureau. In 2016, Shanghai had approximately 5629 medical institutions, including 5433 primary medical institutions, 160 regional medical institutions, and 36 tertiary medical institutions. Inpatient services were mainly provided by regional medical institutions and tertiary medical institutions. This study mainly focused on the medical institutions providing inpatient services, and then we collected these institutions' characteristics, including the level, type, number of beds, etc. In the specific analysis below, we mainly divide medical institutions into the following categories. According to the level of the institution, medical institutions are divided into secondary medical institutions and tertiary medical institutions. According to the specialty, medical institutions are divided into traditional Chinese medicine (TCM) hospitals, orthopedic hospitals, mental hospitals, tumor hospitals, children's hospitals, and maternity hospitals.

Identify Demographic Units
The population data collected were based on the point-like data of residential buildings. To make better use of the population data based on residential buildings, this paper meshes the population data of residential buildings. First, the city was divided into several squares with sides equal to 1000 m, which serve as the basic geographical unit of Shanghai's population statistics. Subsequent calculations of patient flow were also based on this geographic unit. As a result, Shanghai was divided into 6699 geographical units, including many irregular grids of geographic units close to administrative boundaries. Incomplete geographical units were caused by the irregular regional boundaries of Shanghai, and they would not hinder the subsequent calculations of network distance. Each geographic unit was identified by a unique code (i) that facilitates subsequent calculations.

Establishment of the Patient Distribution Flow Analysis Database
Combining population data, inpatient data, and hospital bed information, a database of the distribution flow matrix of patients was established. The specific way to integrate the three databases is as follows.
(1) Establish the inpatient flow database (T ij ). First, according to the home address information of inpatients in the inpatient database, the spatial locations of the inpatients were identified. Second, with the help of geographic information systems, the spatial relationship between the patient's home address and geographic unit was calculated. Then, the number of patients accepted by each medical institution in each geographic unit were summarized and counted. (2) Count the population in each geographic unit (P i ). According to the spatial relationship between residential sites and geographical units, the resident population of each geographical unit was calculated. (3) Calculate the distance matrix between each geographic unit and each hospital (d ij ). First, each medical institution was given a unique identification code. Second, traffic road information was superimposed, and the road network distance between each geographic unit and medical institution was calculated by using ArcGIS to form the distance matrix of geographical units (i) and medical institutions (j). (4) Add the population information from step 2 (P i ) to the inpatient flow database (T ij ). The unique identifier of the geographic units was used as the connection field to add the geographic unit population information attributed to the inpatient flow database. (5) Add the distance information from step 3 (d ij ) to the inpatient flow database (T ij ). The geographic unit and the hospital unique identifiers were used as connection fields to add the distance attribute between the geographic units and the hospitals to the inpatient flow database. (6) Add the hospital bed information (S j ) to the inpatient flow database (T ij ). The unique hospital identification code was used as a connection field to add attributes, such as the number of beds from the hospital database, to the inpatient flow database.
Based on the above steps, we established the patient flow database, which included key data such as the population, supply, distance, and flow.

Nonlinear Fitting of Inpatient Flow Patterns
The interaction between residents and hospitals, measured by the volume of discharges from hospital j to demographic unit i, denoted by T ij , was formulated as a gravity model: where P i was the population in each geographical unit i. S j was the number of beds in hospital j. d ij was the travel distance from geographical unit i to hospital j in kilometers, and f d ij was a generalized distance decay function. µ was a scalar. α was the parameter describing the effect of the population numbers in each geographic unit on the interaction. σ was the parameter describing the effect of the number of hospital beds on the interaction. How can the distance decay function f d ij be defined?
The power [14], negative exponential [10], Gaussian [15], and log-logistic distributions [16] were popular, and can be found for describing the distance decay of hospital utilization in the literature [17].
In the empirical study of the Shanghai inpatient pattern, these four functions were used as distance attenuation functions for fitting. The specific functions were listed in Table 1. The unknown parameters (µ, α, σ, β) in these models can be estimated by nonlinear regression. The adjusted coefficient of determination (R 2 ) and Akaike information criterion (AIC) values were used to test the fitting effect of the models [18]. SAS 9.4 was used to fit the models. The adjusted R 2 was defined as the portion of dependent variable's variation explained by a nonlinear regression model. The AIC was used to measure the relative quality of the models due to varying complexities of four functions: four parameters when a power, exponential, and Gaussian model was used, and five parameters in the log-logistic model. The model with the maximum adjusted R 2 and minimum AIC was selected as the best-fitting model. was calculated by using ArcGIS to form the distance matrix of geographical units (i) and medical institutions (j). 4) Add the population information from step 2 ( ) to the inpatient flow database ( ). The unique identifier of the geographic units was used as the connection field to add the geographic unit population information attributed to the inpatient flow database.
5) Add the distance information from step 3 ( ) to the inpatient flow database ( ). The geographic unit and the hospital unique identifiers were used as connection fields to add the distance attribute between the geographic units and the hospitals to the inpatient flow database. 6) Add the hospital bed information ( ) to the inpatient flow database ( ). The unique hospital identification code was used as a connection field to add attributes, such as the number of beds from the hospital database, to the inpatient flow database.
Based on the above steps, we established the patient flow database, which included key data such as the population, supply, distance, and flow.

Nonlinear Fitting of Inpatient Flow Patterns
The interaction between residents and hospitals, measured by the volume of discharges from hospital j to demographic unit i, denoted by , was formulated as a gravity model: Where was the population in each geographical unit i. was the number of beds in hospital j.
was the travel distance from geographical unit i to hospital j in kilometers, and ( ) was a generalized distance decay function. μ was a scalar. α was the parameter describing the effect of the population numbers in each geographic unit on the interaction. σ was the parameter describing the effect of the number of hospital beds on the interaction. How can the distance decay function ( ) be defined? The power [14], negative exponential [10], Gaussian [15], and log-logistic distributions [16] were popular, and can be found for describing the distance decay of hospital utilization in the literature [17]. In the empirical study of the Shanghai inpatient pattern, these four functions were used as distance attenuation functions for fitting. The specific functions were listed in Table 1. The unknown parameters (μ, α, σ, β) in these models can be estimated by nonlinear regression. The adjusted coefficient of determination (R 2 ) and Akaike information criterion (AIC) values were used to test the fitting effect of the models [18]. SAS 9.4 was used to fit the models. The adjusted R 2 was defined as the portion of dependent variable's variation explained by a nonlinear regression model. The AIC was used to measure the relative quality of the models due to varying complexities of four functions: four parameters when a power, exponential, and Gaussian model was used, and five parameters in the log-logistic model. The model with the maximum adjusted R 2 and minimum AIC was selected as the best-fitting model.

Results
In the empirical analysis of the Shanghai inpatient flow pattern, the power function, the negative exponential function, the Gaussian function, and the log-logistic function models were used to simulate and analyze the distribution of patients at different medical institutions of different levels and specialties in Shanghai. The simulation results were shown in Table 2. For all the hospitalized

Results
In the empirical analysis of the Shanghai inpatient flow pattern, the power function, the negative exponential function, the Gaussian function, and the log-logistic function models were used to simulate and analyze the distribution of patients at different medical institutions of different levels and specialties in Shanghai. The simulation results were shown in Table 2. For all the hospitalized patients, tertiary hospital inpatients, regional hospital inpatients, the traditional Chinese medicine hospital inpatients, orthopedic hospital inpatients, and mental hospital inpatients, the four distance-decay functions were valid, among which the log-logistic model had the best-fitting effect. In contrast, for inpatients at tumor hospitals, children's hospitals, and maternity hospitals, the power function, the negative exponential function, and the Gaussian function models were feasible, but the model fitting effect was poor; the log-logistic function model was not feasible. In general, from the perspective of model selection, the log-logistic function model had a good simulation effect and the strongest applicability in most hospitals.
The results of comparing the distribution pattern of inpatients among different levels of hospitals were as follows. Figure 1 showed the differences in the inpatient distribution patterns among tertiary hospitals and regional hospitals captured by the fitted log-logistic function models. The optimal log-logistic curve for each subgroup was drawn with P i and S j both set to 1000. As the results show, regional hospital patient flow increased rapidly with increasing distance attenuation (β = 1.89), and tertiary hospital patient flow increased significantly more slowly with increasing distance attenuation speed (β = 1.40). The decline rate of patient flow with increasing distance in all hospitals was between the two (β = 1.67).
The results of comparing the distributions of inpatients among different specialties of hospitals are as follows. Figure 2 showed the difference in the inpatient distribution patterns among traditional Chinese medicine hospitals, orthopedic hospitals, and mental hospitals captured by the fitted log-logistic function models. The optimal log-logistic curve for each subgroup was drawn with P i and S j both set to 1000. As the results show, the increase in patient flow at traditional Chinese medicine hospitals with increasing distance attenuation was the fastest (β = 1.64), and that at the mental hospital and orthopedic hospital was slower (β = 0.85).
In general, this paper analyzed the differences in the patient distribution model of hospitals of different levels and analyzed the differences in the patient distribution models for different specialized hospitals. Based on these analyses, the model of the inpatient distribution in Shanghai and the values of key parameters were given (Table 3).   Figure 1. Inpatient distribution pattern among tertiary hospitals and regional hospitals captured by the fitted log-logistic function model.

Figure 2.
Inpatient distribution pattern among tertiary hospitals and regional hospitals captured by the fitted log-logistic function model.

Discussion
In terms of the model selection of patient distribution patterns, the empirical study of Shanghai and previous studies reached similar results, and the log-logistic function model was found to be the best-fitting model for the patient distribution. Some studies have conducted empirical studies on the inpatient distribution in Florida [5] and obtained the fitting results. Compared with the attenuation coefficient of hospitalized patients in Florida, the distance attenuation rate of all hospitalized patients in Shanghai was slower, as the distance attenuation coefficient of all hospitalized patients in Shanghai was 1.67, and that of hospitalized patients in Florida was 2.14. There may be many reasons for the distance attenuation coefficient difference, such as the service area, population density, and concentration of medical resources. All of these factors will result in different distance attenuation coefficients.
By comparing the differences in the distribution patterns of hospitalized patients in regional and tertiary medical institutions in Shanghai, it can be found that the distance attenuation coefficient of hospitalized patients at regional hospitals is larger than that of patients at tertiary hospitals, which means that the distribution attenuation rate of hospitalized patients in regional hospitals is faster. Tertiary hospitals always have more medical resources with better medical quality and advanced medical level, resulting in a larger range of patients, so residents are willing to pay more for medical treatment [19][20][21]. In a previous study, a tertiary hospital in Shanghai was taken as an example, and the attenuation coefficient of inpatients at this hospital was verified by using an exponential function model based on the distribution data of inpatients. The distance attenuation coefficient for this tertiary hospital in Shanghai was 0.7 [7]. The results were similar to the estimated distance attenuation coefficient of inpatients in tertiary hospitals in this paper, which showed that the estimated value of β is 0.95 under the power function model, and β is 1.4 under the log-logistic function model.
By comparing the differences in the distribution patterns of inpatients at specialized hospitals in

Discussion
In terms of the model selection of patient distribution patterns, the empirical study of Shanghai and previous studies reached similar results, and the log-logistic function model was found to be the best-fitting model for the patient distribution. Some studies have conducted empirical studies on the inpatient distribution in Florida [5] and obtained the fitting results. Compared with the attenuation coefficient of hospitalized patients in Florida, the distance attenuation rate of all hospitalized patients in Shanghai was slower, as the distance attenuation coefficient of all hospitalized patients in Shanghai was 1.67, and that of hospitalized patients in Florida was 2.14. There may be many reasons for the distance attenuation coefficient difference, such as the service area, population density, and concentration of medical resources. All of these factors will result in different distance attenuation coefficients.
By comparing the differences in the distribution patterns of hospitalized patients in regional and tertiary medical institutions in Shanghai, it can be found that the distance attenuation coefficient of hospitalized patients at regional hospitals is larger than that of patients at tertiary hospitals, which means that the distribution attenuation rate of hospitalized patients in regional hospitals is faster. Tertiary hospitals always have more medical resources with better medical quality and advanced medical level, resulting in a larger range of patients, so residents are willing to pay more for medical treatment [19][20][21]. In a previous study, a tertiary hospital in Shanghai was taken as an example, and the attenuation coefficient of inpatients at this hospital was verified by using an exponential function model based on the distribution data of inpatients. The distance attenuation coefficient for this tertiary hospital in Shanghai was 0.7 [7]. The results were similar to the estimated distance attenuation coefficient of inpatients in tertiary hospitals in this paper, which showed that the estimated value of β is 0.95 under the power function model, and β is 1.4 under the log-logistic function model.
By comparing the differences in the distribution patterns of inpatients at specialized hospitals in Shanghai, it can be found that the distribution model of patients in specialized hospitals is more complicated. In this paper, the distribution fitting effect of patients in tumor hospitals, children's hospitals and maternity hospitals was not good. This phenomenon may be due to the different service objectives and contents of different specialties, resulting in poor comparability between the distribution patterns of patients among the different specialties. At the same time, there were large differences in the technical levels among specialized hospitals, which may further lead to a poor fitting effect of the distribution of patients at specialized hospitals. It should be noted that the inpatient distribution model estimated from the specialist hospitals in this article cannot completely replace the distribution of all specialist inpatients because many inpatients for specific specialties go to general medical institutions instead of specialized hospitals. This kind of inpatient data was not included in the analysis of the inpatient distribution model in specialized hospitals in this paper.
The main advantages of this paper were that it collected the actual inpatient distribution data in Shanghai and demonstrated the distribution pattern of inpatients in Shanghai. At the same time, according to the different levels and specialties of medical institutions, the distribution patterns of inpatients at different levels and different specialized hospitals were explored. This study provided more empirical evidence for patient distribution models. In addition, it provided a research basis for further analyzing scientific issues, such as service area, market scope, and medical resource access at medical institutions [22].
The main disadvantage was that only local Shanghai inpatients were included in the empirical analysis of the Shanghai inpatient distribution pattern. However, Shanghai, as the medical center for eastern China, can reach areas beyond Shanghai. Therefore, this study was limited by including only Shanghai patients to analyze the distribution model of patients at Shanghai medical institutions. However, the inpatient distribution model for regional medical institutions was not affected by this choice because regional hospitals mainly serve local Shanghai residents. In future studies, the scope of the study can be expanded, and the fitting results may be more in line with the distribution characteristics of inpatients in medical institutions in Shanghai.

Conclusions
This article toke Shanghai as an example to analyze the actual distributions of inpatients, used the power function, the negative exponential function, the Gaussian function, and the log-logistic function model to fit the distribution of inpatients, and derived a model of inpatient distributions in Shanghai based on the four models. Finally, the adjusted R 2 value and AIC value were used to determine the model with the best-fitting effect. Based on the results of empirical analysis, we believe that the four attenuation coefficient models are valid in the Shanghai empirical study, and the log-logistic function model of the inpatient distributions at most hospitals has a good simulation effect. The model of the inpatient distribution in Shanghai and the values of key parameters were given in Table 3. For all the hospitalized patients in Shanghai, the recommended distance attenuation coefficient value based on the log-logistic function model is 1.67. For inpatients at tertiary hospitals, the recommended distance attenuation coefficient value based on the log-logistic function model is 1.4. For inpatients at regional hospitals, the recommended distance attenuation coefficient value based on the log-logistic function model is 1.89. For inpatients at TCM hospitals, the recommended distance attenuation coefficient value based on the log-logistic function model is 1.64. For inpatients at orthopedic hospitals, the recommended distance attenuation coefficient value based on the log-logistic function model is 0.85. For inpatients at mental hospitals, the recommended distance attenuation coefficient value based on the log-logistic function model is 0.85. As an important parameter in the patient distribution model, distance attenuation coefficient can help researchers intuitively understand the distribution attenuation rate of different types of patients. The recommended distance attenuation coefficient can provide reference values for key parameters of the patient distribution model for extended studies. However, the simulations of the inpatient distribution patterns for the tumor hospitals, children's hospitals, and maternity hospitals in Shanghai were not good. Therefore, further in-depth analysis, combined with the characteristics of specific specialties, are needed to obtain the inpatient distribution model in line with the characteristics of these specialties.
Author Contributions: All authors contributed to the design of the study. L.L. and X.X. conceived, designed, and implemented the analysis, including the exploratory data analysis and visualization of data and data processing. X.X. collected and analyzed the research data. X.X. drafted the manuscript. L.L. and X.X. revised the paper. L.L. and X.X. read and approved the final manuscript. All authors have read and agreed to the published version of this manuscript.

Funding:
The National Natural Science Foundation of China (No. 71874033). The funding help in the design of data collection, data analysis, and writing the manuscript. Firstly, the funded project helped establish a scientific cooperative relationship between the research team of the authors and the Shanghai municipal commission of health and family planning, which helped to obtain the key data for this paper. Secondly, the project funds helped support the fees related to the expert consultation and literature retrieval in the process of forming the ideas of this paper. In addition, project funding supported the ArcGIS software in this study.