Estimation of Injuries in a Population-based Study by Stratified Cluster Sampling and Network Scale-up Methods

Introduction: Injury is one of the most important causes of morbidity and mortality and an important challenge for the health system, accounting for 28% of deaths per day in Iran. Iran ranks 5th globally for road traffic deaths. Objective: This cross-sectional study aimed to estimate the annual incidence rates of injury in individuals above 15 in Kashan, Iran. Methods: In this population-based cross-sectional study, people above 15 years in households residing in Kashan during 2018-2019 were studied. Sampling followed a twofold design: a stratified cluster sampling method and a network scale-up method. Data analysis was performed using SPSS 22 (Chi-square test, t-test, and logistic regression analysis) and Stata 14. Results: This study describes the study setting, sampling, data collection and management, quality control and assurance, and the statistical analysis. The annual incidence rate of all kinds of trauma was estimated at 70.61 (62.60-78.70) per 1000 person-years, and that of death caused by injury at 4.63 (34.32-56.12) per 1000 person-years. With the network scale-up method, the incidence rate of all kinds of trauma was 57.1 and that of death caused by trauma was 3.47 per 1000 person-years. Conclusion: Population-based studies are used to evaluate injuries and outcomes of trauma so that managers can see where attention is needed in preventive programs in this city. Indirect sampling techniques can be cost-effective and time-saving, and can provide urgently needed information.


Introduction
Today, injury is one of the most important causes of morbidity and mortality worldwide 1 , and trauma accounts for 10% of the burden of disease 2, 3 . Other studies have shown that traffic injuries, falls, and burns are the most important causes of injury in Iran [1][2][3][4] . Injuries are an important challenge for the health system and account for 28% of the deaths per day in Iran 5 . Iran ranks 5th globally in road traffic mortality and has the highest mortality rate in the Eastern Mediterranean region 6 . The epidemiology of injuries can be studied with trauma registries, hospital-based databases, and injury surveillance data, but only a population-based study can estimate all types of injuries 7,8,9 .
To obtain complete and comprehensive population-based information, this study was conducted to evaluate the epidemiology of the types of trauma and their consequences in individuals above 15 in Kashan through a comprehensive survey, and to inform post-discharge treatment or care interventions that improve the management of trauma patients and minimize post-trauma outcomes, especially disabilities. In developing countries, the high burden of traumatic events calls for a detailed trauma information system. Where there is no complete recording system, injuries cannot always be quantified by direct sampling (counting and census), so they are estimated by indirect sampling methods, one of which is the network scale-up method. Because the network scale-up method requires neither specific information sources nor direct contact with the sample under study, it is a suitable way to study trauma in society. Therefore, this study used the network scale-up method along with stratified cluster sampling 10,11,12 .
The network scale-up method is a relatively new method for estimating the size of populations that are hard to count directly. It consists of two parts: estimating the size of individuals' social networks in the community and estimating the size of the subpopulations. The estimate is based on the average number of people in the hidden groups whom respondents know, the average size of the respondents' networks, and the proportion of people at risk in the community 13,14 .
The town of Kashan, located in Isfahan province, is one of the central cities of Iran, extending from the north to south and from east to west of Iran. Its distance from the capital is 9647 km, and it is an urbanized, industrial town with a population of about 448063 in 132101 families. This cross-sectional study aimed to estimate the annual incidence rates of injury in the population above 15 in Kashan.

Study setting
In this population-based cross-sectional study, individuals above 15 residing in Kashan during 2018-2019 were studied. Sampling used two methods: a two-stage stratified cluster design and a network scale-up method 5,16,17 . For stratified cluster sampling, Kashan was divided into five areas according to socioeconomic status on the geographical map of Kashan, and the clusters of each area were defined on the map. Figure 1 shows the clusters of each area. The sample size for each of the five areas was determined according to its population, and was divided by 25 individuals per cluster to obtain the required number of clusters. All clusters in each area were numbered, and the clusters were selected randomly. In each cluster, one house was selected randomly from among the first five houses, and the next 25 houses were then surveyed systematically. Thus, 25 households were studied in every selected cluster of each area 18,19 .
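The allocation and selection steps described above can be sketched as follows. This is a minimal illustration rather than the procedure actually run by the research team: the area populations are hypothetical, the function names are ours, and it is assumed that the randomly chosen start house counts as the first of the 25 houses in a cluster.

```python
import math
import random


def allocate_clusters(area_populations, total_sample, per_cluster=25):
    """Share total_sample across areas in proportion to population and
    convert each share into a whole number of clusters (rounded up)."""
    total_population = sum(area_populations.values())
    return {
        area: math.ceil(total_sample * population / total_population / per_cluster)
        for area, population in area_populations.items()
    }


def pick_clusters(cluster_ids, n_clusters, rng):
    """Select n_clusters numbered clusters at random, without replacement."""
    return sorted(rng.sample(cluster_ids, n_clusters))


def houses_in_cluster(rng, first_block=5, per_cluster=25):
    """Pick a random start among the first `first_block` houses, then take
    `per_cluster` consecutive houses. Assumption: the start house counts
    as the first of the 25."""
    start = rng.randint(1, first_block)
    return list(range(start, start + per_cluster))


# Hypothetical area populations, for illustration only.
areas = {"area_1": 100_000, "area_2": 50_000, "area_3": 50_000}
plan = allocate_clusters(areas, total_sample=1000)
rng = random.Random(42)  # fixed seed so the example is repeatable
chosen = pick_clusters(list(range(1, 41)), plan["area_1"], rng)
houses = houses_in_cluster(rng)
```

Because the number of clusters is rounded up in each area, the realized sample can slightly exceed the target sample size.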

The researcher visited each of the households and, after obtaining informed consent, used Kish grid tables to randomly select one person above 15 years of age to be interviewed and complete the questionnaire 22 . If the selected person was absent or did not cooperate on the first visit, interviewers returned up to three times to collect information; if she/he still did not cooperate with the research team, she/he was removed from the study and replaced by the first neighboring home on the right. Household surveys were conducted in the native language of Kashan (Persian) by trained research assistants from Kashan University of Medical Sciences 17,18,19 . Figure 2 shows the flowchart of the study design.

Sample size
Based on the one-year incidence of all injuries (p), which was 25 per 1000 person-years in 2013, a formula was used to estimate the minimum sample size needed 20 .
Considering the prevalence of trauma in the community, 32.3%, and a design effect of 1.5, the minimum number of subjects required to assess the annual incidence of trauma was 3880 people 17,20 .
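The formula itself did not survive in the text. As a hedged illustration, the sketch below uses the usual sample size formula for a proportion, n = z^2 p(1 - p) / d^2, multiplied by the design effect. This is an assumption about the kind of formula involved, not a reconstruction of the authors' calculation; the margin of error d is not stated in the text, so no attempt is made to reproduce the figure of 3880.

```python
import math


def cluster_sample_size(p, margin, design_effect=1.0, z=1.96):
    """Minimum sample size to estimate a proportion p within +/- margin,
    inflated by a design effect to allow for cluster sampling."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if margin <= 0:
        raise ValueError("margin must be positive")
    n_simple_random = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n_simple_random * design_effect)


# Textbook check: p = 0.5 and a 5-point margin under simple random sampling.
n_basic = cluster_sample_size(0.5, 0.05)
# The same inputs with a hypothetical design effect of 2.
n_clustered = cluster_sample_size(0.5, 0.05, design_effect=2)
```

With the values reported here, the call would take p = 0.323 and design_effect = 1.5, together with whatever margin the authors chose.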

Network Scale-up Method
In this cross-sectional study, the population of Kashan officially counted in 2017 was taken as the base population for the network scale-up method. In each of the selected blocks in Kashan, one family was randomly selected and one person above the age of 15 who was willing to cooperate was interviewed (preferably the housewife or housekeeper, familiar with the neighborhood and neighbors). A total of 160 interviews were conducted using the network scale-up method.
The size of each person's social network was calculated from the relation e / N = m / c, in which e is the total number of traumatized people above 15 in Kashan, N the total number of individuals in the age range studied, m the number of traumatized individuals each person knows, and c the size of each individual's social network. Finally, the maximum likelihood estimate was obtained using the social network size. The following method was used to calculate the size of each person's network and the number of people with trauma, disability, and death whom each person knows.
The sample size was taken from a previous study that estimated network size: with a relative accuracy of about 12, a variance of about 450 for the estimated network size, and α = 0.05, a sample size of 160 was estimated.
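To make the relation e / N = m / c concrete, the sketch below solves it for e and pools the answers of all respondents as a ratio of sums, which is the commonly used form of the scale-up estimator. The pooling rule and the toy numbers are assumptions for illustration; the text does not state exactly how the responses were combined.

```python
def network_scale_up(known_counts, network_sizes, population_size):
    """Basic scale-up estimate e = N * sum(m_i) / sum(c_i).

    known_counts[i]  : m_i, number of traumatized people respondent i knows
    network_sizes[i] : c_i, size of respondent i's social network
    population_size  : N, total number of people in the age range studied
    """
    if len(known_counts) != len(network_sizes):
        raise ValueError("one network size is needed per respondent")
    total_network = sum(network_sizes)
    if total_network <= 0:
        raise ValueError("the total network size must be positive")
    return population_size * sum(known_counts) / total_network


# Toy data for three hypothetical respondents.
estimate = network_scale_up([1, 0, 2], [100, 50, 150], population_size=10_000)
```

In this toy case the respondents know 3 traumatized people among 300 network members, so 1% of the population of 10,000 is estimated to be affected.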

Sampling and data collection
Trained interviewers conducted data collection and sampling. The interviewer teams comprised nine groups of two people, who were trained during a 3-day workshop. In this workshop, interviewers learned about the research plan, the purpose and type of the study, and how to select individuals and clusters. Details of the data collection tools were reviewed and questions were answered clearly. Interviewers were also trained in how to deal correctly with the subjects. At the end of the course, interviewers were tested on their knowledge of the study. Moreover, a test was conducted by simulating the sampling situation and asking questions of a simulated subject to assess the interviewers' skill in collecting data with the questionnaires. This was planned to reduce the inter-rater variability of the assessments. Those who scored at least 80% in the final exams and had a correlation of at least 80% in their responses were selected to collaborate on data collection. In each of the three questionnaire groups, one supervisor coordinated the group and monitored the teamwork. Also, at the beginning of sampling for each group, the main researcher of the project asked interviewers the first five questions regarding the area, to minimize bias in data collection.
In this study, the data collection package and printed maps of the addresses were provided to the interviewers. The package included the most important items in the research, the areas identified for data collection by each group, the starting point of sampling in each block, and the guide for completing the interviews, which ran to seven pages.
During data collection, periodic discussions were held in the Trauma Research Center to answer the interviewers' questions and resolve their problems.
Sampling was done in the morning and in the evening so that interviewers could find family members at home.
On arriving at the selected homes, interviewers used identification cards to introduce themselves and briefly explained the objectives of the study. After asking the number of family members (i.e., the number of people who live in the building unit), interviewers used Kish grid tables to randomly select one family member above 15 for interview and completed the questionnaires. If the person was absent or did not cooperate during the first visit, an interviewer returned to the house up to three times, and if he/she still did not cooperate, the interviewer replaced the household with the first home on the right.
The main data, collected through face-to-face interviews, included demographic, medical, and socioeconomic data. If there had been trauma in the past year, injury-related data were obtained, such as the time, location, mechanism, type, and number of injuries, the type of treatment received, the duration of hospitalization, and the injured body organ. The injury data were coded according to the International Classification of Diseases (ICD-10).
Moreover, the GHQ-28 (General Health Questionnaire, 28 questions) and SF-12 (Short Form questionnaire, 12 questions) were completed by all people in the study, and the WHODAS 2.0 (WHO Disability Assessment Schedule 2.0) and the PTSD Checklist (Posttraumatic Stress Disorder Checklist) were completed by those who had had trauma during the past year. A pilot study was conducted to ensure that the interviewing methods were reasonable, and the validity and reliability of the questionnaires were calculated before data collection.

Network Scale-up Sampling
The network scale-up method was used to investigate the prevalence of trauma in the past year. Accordingly, respondents were asked how many people they know in their neighborhood (three on the right and three on the left) who suffered trauma and its consequent disabilities, and information was collected on the number of people with trauma, with visible and specific disabilities such as spinal cord injury and limb amputation due to trauma, and with death from traumatic events 10,14 .
In this study, the social network means the active social network, defined as "the people living in Kashan for at least one year who know each other by name and face, see each other at least once a year (face to face, by telephone, SMS, or email), and can be contacted again" 13,14 .
In each interview, trauma was first defined as any injury, deliberate or unintentional, caused by an accident involving physical or chemical agents and requiring medical attention, whether or not care was received. A person with spinal cord injury was defined as a person with complete or partial paralysis of the lower trunk and lower limbs due to damage to the spinal cord or spine. A person with amputation was defined as one with an amputated elbow or arm, or an amputated leg or knee. Death from trauma was also included. Relevant questions were then asked based on these definitions, and finally the number of people affected and the consequences of trauma (death, spinal cord injury, amputation) were estimated using the network-based estimation method.

Data management
The data were entered into an SPSS version 22.0 database by the researcher. In this study, trauma was defined as any intentional or unintentional physical damage caused by an accident to a person that requires medical care. The research team performed data cleaning, and missing or erroneous data were clarified.

Quality control, quality assurance
During the data collection period, the monitoring team checked that the questionnaires were completed correctly, and periodic reviewers monitored the data collection process. Data were reviewed and cleaned by observers. Ten percent of the data were re-entered into the software to determine the error rate. If a random check showed that more than 10% of the information on a questionnaire was incorrect, the information on that questionnaire was deleted, and the interviewer's qualification was re-evaluated.

Statistical analysis
In this study, statistical analysis was performed using the SPSS (v.22) statistical software package and Stata (v.14). Results for qualitative variables are shown as frequencies and percentages, and results for quantitative variables as mean and standard deviation. The Chi-square test and t-test were used to examine differences between groups, and logistic regression analysis was used to examine the variables that were significantly associated with trauma outcomes. P < 0.05 was considered statistically significant. Univariate analysis was used to investigate the relationship between variables and trauma outcomes, and 95% confidence intervals are reported throughout this study.
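As an illustration of the univariate step, the sketch below computes an unadjusted odds ratio with a 95% Wald confidence interval from a 2x2 table. It is not the adjusted logistic regression model fitted in SPSS or Stata, and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed are injured.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

An interval that excludes 1 corresponds to a significant association at the 5% level in this unadjusted analysis.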

Network scale-up analysis
In this study, the average size of the personal network was estimated by applying the initial equation of the network scale-up method inversely to population subgroups of known size, using the maximum likelihood method. Participants' responses for each of the known subgroups were combined, and the active network size was calculated separately from each response. Then, using the maximum likelihood method, the total active network size C was estimated with 95% confidence.
If the size of the social network in a subgroup was zero or implausibly large, the subgroup was excluded from the analysis.
To reduce common errors in network scale-up methods, such as transmission error and barrier effect, the transmission rate and popularity ratio were used in this study.
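A common form of this correction divides the crude scale-up estimate by the product of the transmission rate and the popularity ratio. The sketch below shows that form only; whether the correction was applied in exactly this way is not stated in the text, and the function name and numbers are hypothetical.

```python
def adjust_estimate(crude_estimate, transmission_rate, popularity_ratio):
    """Correct a crude scale-up estimate for transmission error and barrier effect.

    transmission_rate : share of affected people whose condition is known
                        to their contacts (between 0 and 1)
    popularity_ratio  : network size of the affected group relative to the
                        general population
    """
    if not 0 < transmission_rate <= 1:
        raise ValueError("transmission_rate must be in (0, 1]")
    if popularity_ratio <= 0:
        raise ValueError("popularity_ratio must be positive")
    return crude_estimate / (transmission_rate * popularity_ratio)


# Hypothetical case: only half of the affected people are visible to contacts.
adjusted = adjust_estimate(100, transmission_rate=0.5, popularity_ratio=1.0)
```

A transmission rate below 1 raises the estimate, since some affected people are invisible to their contacts.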

Result
In this study, 4800 households in Kashan were visited for data collection. Of these, 4200 people agreed to be interviewed, giving a response rate of 87.5%. Of the collected cases, 180 were incomplete and unusable, and 140 were incorrect and were excluded in the monitoring and evaluation phase, so in total 3880 people residing in Kashan were surveyed with informed consent. Figure 3 shows the flowchart of sampling in this study. For the network scale-up method, 180 households were visited to collect the required sample, and 160 people agreed to be interviewed, so the response rate was 88%.
In this study, 274 (7.06%) individuals reported injuries during the past year, and the incidence rate of all types of trauma was estimated at 70.6 per 1000 person-years. Of the participants with injury, 213 (77.73%) were male and 61 (22%) were female. The incidence rate of all trauma was 118 per 1000 person-years in males and 29 per 1000 person-years in females. In the network scale-up method, 147 (5.7%) people were reported to have had injuries during the past year, and the incidence rate of all types of trauma was estimated at 57.1 per 1000 person-years. Traffic accidents accounted for 141 (51%) of the subjects with injuries in stratified cluster sampling and 98 (66.6%) in the network scale-up method. The incidence of traffic accidents was 3607 and 3788 in stratified cluster sampling and network scale-up sampling, respectively. The incidence of trauma is generally higher in stratified cluster sampling, but the incidence of traffic trauma is higher in the network scale-up method. Table 1 compares the results obtained from these two methods.
In this study, the annual incidence of death caused by trauma was 4.63 (34.32-56.12) per 1000 person-years; 88.8% of these deaths occurred in men, and 72.2% were due to traffic accidents. In the network scale-up method, the annual incidence of death caused by trauma was 3.47 per 1000 person-years. Figure 4 shows the frequency of deaths caused by trauma in stratified cluster sampling.
Males were significantly more likely to report traumatic injury than females (OR = 2.235, 95% CI = 1.175, 2.315) (Table 2). Compared to participants aged 30-59, those aged 15-29 were more likely to report injury (OR = 1.93, 95% CI = 1.85, 2.33). Compared to other races, participants identified as Fars were more likely to report injury in the past year (OR = 1.98, 95% CI = 1.47, 3.84).
Marital status was significantly associated with injury in the univariate analysis (Chi-square = 1.47, p = 0.00). In the adjusted model, married participants were less likely than single participants to report injury during the past year, although the confidence interval included 1 (OR = 0.74, 95% CI = 0.54, 1.08).
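The arithmetic behind the headline incidence rate can be checked with a short sketch. It assumes that each respondent contributes one person-year and uses a simple Wald interval for a proportion. That may differ from the design-based interval computed in Stata, so small discrepancies with the reported interval are expected.

```python
import math


def incidence_per_1000(cases, persons, z=1.96):
    """Incidence per 1000 person-years with a Wald confidence interval,
    assuming one person-year of observation per respondent."""
    if persons <= 0 or not 0 <= cases <= persons:
        raise ValueError("need 0 <= cases <= persons and persons > 0")
    p = cases / persons
    se = math.sqrt(p * (1 - p) / persons)
    lower = max(0.0, p - z * se)
    upper = min(1.0, p + z * se)
    return 1000 * p, 1000 * lower, 1000 * upper


# 274 injured respondents among the 3880 people surveyed.
rate, rate_low, rate_high = incidence_per_1000(274, 3880)
```

For 274 injuries among 3880 respondents this gives about 70.6 per 1000 person-years with an interval of roughly 62.6 to 78.7, close to the reported 70.61 (62.60-78.70).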

Discussion
This was a cross-sectional population-based study that used two sampling methods. Few medical studies are protocol studies; most report research results, whereas this study set out to describe the study setting, sampling, data collection and data management, quality control and assurance, and statistical analysis. The objectives were to estimate the annual incidence of injury and the outcomes of trauma in order to support better prevention programs against health threats such as trauma 21 . Another goal of using two sampling methods was to compare their working procedures, advantages, and limitations. Because direct sampling methods often require large samples and are time-consuming and costly, indirect methods such as the network scale-up method can reduce the number of samples and yield the needed information in less time.
In this study, the one-year incidence of trauma estimated by the network scale-up method was lower than that from the other sampling method. If this difference can be attributed to sampling errors, shrinking the social network size of individuals would bring the estimate closer to the actual incidence of trauma in the community.

Limitations
The population-based study had many limitations at each stage. Although the data were collected randomly, the sample might not be fully representative of the community; the data did not completely match the age and sex structure estimates for Kashan.
The injuries and post-trauma disorders reported in this study are likely to be real, because such events are memorable. However, this study could not estimate the severity of injuries.
In this type of study, families may be reluctant to report certain injuries with social implications, or may fear the consequences of reporting events such as self-poisoning, suicidal behavior, and injuries from family violence.
Significant recall bias would lead to declining reporting over time, as people recall minor injuries better when they are more recent. Likewise, months after the event, injured people reported post-traumatic disorders such as disabilities and stress less often than they did earlier.

Errors and limitations of the network scale-up method
1-Data transfer bias: Respondents may know a person without being fully aware of his/her true condition, or individuals may define the target subgroups quite differently. To resolve this problem, a single definition of the target group and subgroups was provided. The concept of trauma and post-traumatic complications was also fully explained at the beginning of the interview.
2-Relative network size: The size of the social network of individuals in the target groups may be small, or different from that of the general population. To solve this problem, during the monitoring and reviewing phase, the data were checked with the target group, and cases with inconsistent information were excluded.
3-Reporting bias: This error arises from the degree of enthusiasm or social inclination of respondents, who may not answer the question. It varies by ethnicity and race. To counter this bias, the results were adjusted to the target groups based on variables such as age, gender, and length of stay in the area.
4-Barrier effects: This error may be due to a combination of random and non-random mixing of individuals and unequal population distribution, caused by physical barriers such as differences in people's geography or social barriers such as race or gender. To minimize this error, respondents were selected from individuals who had lived in the area for at least 5 years and had good public relations, and from households that reflect the actual population of the area.
5-Low recall of the network: Respondents tend to report high-prevalence reference groups only nominally. To resolve this type of bias, the target group was contacted and asked for the information again.
6-Different levels of transparency regarding reference groups: This problem arises when the size of a person's social network is determined indirectly. The transparency of phenomena (reference groups) can influence the estimation of social networks and thus of the target groups. Because this study evaluated the incidence of trauma and its consequences, a phenomenon that is fully visible and transparent, and because the level of transparency of the data was also checked during the information review process, this error was controlled.
7-Variance estimation and non-sampling error: There are two ways to estimate the variance from a sample: (1) the analytical method, which uses standard formulas to evaluate confidence intervals, and (2) the bootstrap, a very flexible approach that indicates how close or distant the results could be 22,23 . The analytical confidence interval method was used in this study.
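For comparison with the analytical approach used here, the bootstrap alternative can be sketched as a percentile interval around the basic scale-up estimate. This is an illustrative sketch with toy data; the resampling unit (the respondent), the number of replicates, and the function name are assumptions and not part of the study.

```python
import random


def bootstrap_scale_up_ci(known_counts, network_sizes, population_size,
                          n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for N * sum(m_i) / sum(c_i),
    resampling respondents with replacement."""
    rng = random.Random(seed)
    pairs = list(zip(known_counts, network_sizes))
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(pairs) for _ in pairs]
        total_network = sum(c for _, c in resample)
        if total_network == 0:
            continue  # skip degenerate resamples with no network members
        total_known = sum(m for m, _ in resample)
        estimates.append(population_size * total_known / total_network)
    if not estimates:
        raise ValueError("no usable bootstrap replicates")
    estimates.sort()
    last = len(estimates) - 1
    low_index = min(last, int((alpha / 2) * len(estimates)))
    high_index = min(last, max(0, int((1 - alpha / 2) * len(estimates)) - 1))
    return estimates[low_index], estimates[high_index]


# Toy data for six hypothetical respondents; the point estimate is 100.
known = [1, 0, 2, 1, 0, 3]
sizes = [100, 50, 150, 120, 80, 200]
boot_low, boot_high = bootstrap_scale_up_ci(known, sizes, population_size=10_000)
```

With so few respondents the interval is wide; the fixed seed only makes the illustration repeatable.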

Conclusion
Population-based studies are used to estimate the incidence of injury and outcomes of trauma so that management knows where attention is needed for preventive programs in this city.
Indirect sampling techniques can be cost-effective and time-saving, and can provide the information we need faster.

Declarations

Ethics approval and consent to participate
This study was approved by the ethics committee of Kashan University of Medical Sciences, Kashan, Iran (Code of Ethics: 1397,094).
Before starting the interview, the research objectives were described to each participant. They were also told that their information would not be cited and would remain confidential to the researcher. Those who agreed to participate in the research gave their informed consent. In addition, the study sought to comply with all human rights requirements.

Consent for publication
The objectives of the research were described to each participant. They were also told that their information would not be cited and would remain confidential to the researcher.