A New Statistical Trend in Medical Research – Nested Design

Unlike most applications of the nested design, which are found in the agricultural and social sciences, this study applies the concept to medical science. We consider a three-stage (2 × 5 × 2) nested design with the factors Hospital Centre, Days of the Week and Ailments, such that the days of the week are nested within the centres and the ailments are nested within the days. The replicates are the weights of twelve (12) randomly selected patients for each ailment on each day in each centre, which brings the total number of observations to 240. This work, being largely an illustration, uses simulated data. The sums of squares for all factors and within replicates were tested for significance using analysis of variance (ANOVA). The results show that Days and Ailments are significant factors in the experiment at the 5% significance level.


Nested Design
Generally, the design of experiments is a very broad subject concerned with methods of sampling that reduce the variation in an experiment and thereby acquire a specified quantity of information at minimum cost. For the same total number of observations, some methods of data collection (designs) provide more information about specific population parameters than others. No single design is best for acquiring information about all types of population parameters. Indeed, the problem of finding the best design for focusing information on a specific population parameter has been solved in only a few specific cases.
Suppose that we wish to compare five teaching techniques, A, B, C, D, and E, and that we use 125 students in the study. The objective is to compare the mean scores on a standardized test for students taught by each of the five methods. There are likely to be boys and girls in the group, and the methods might not be equally effective for both genders. There are likely to be differences in the native abilities of the students, resulting in some students performing better regardless of the teaching method used. Different students may come from families that place different emphases on education, and this could affect the scores on the standardized test. In addition, there may be other differences among the 125 students that would have an unanticipated effect on the test scores. Based on these considerations, we decide that it might be wise to randomly assign 25 students to each of five groups. Each group will be taught using one of the techniques under study. The random division of the students into the five groups achieves two objectives. First, it eliminates the possible biasing effect of individual characteristics of the students on the measurements that we make. Second, it provides a probabilistic basis for the selection of the sample that permits the statistician to calculate probabilities associated with the observations in the sample and to use these probabilities in making inferences. This experiment involves a single factor, namely the method of teaching, and the factor has five levels: A, B, C, D, and E.
The primary aim of designing experiments is to increase accuracy and to control for extraneous sources of variation. The analysis of data generated by a multivariable experiment requires identification of the independent variables in the experiment. Most experiments involve a study of the effect of one or more independent variables on a response. Independent variables that can be controlled in an experiment are called factors, and the intensity setting of a factor is called its level. Treatments correspond to combinations of factor levels and identify the different populations of interest to the experimenter.
There are three important (basic) designs in the analysis of variance: the completely randomized design, the randomized complete block design and the Latin square. Other designs based on these three include the factorial design, the split-plot design and the nested design. Nested designs, also known as hierarchical designs, are covered in [11,17,18,20]. According to Sokal and Rohlf, nested designs usually contain random effects and are called Model II ANOVA [18]. Nested designs are used when there are samples within samples. In horticulture, for instance, an investigator may wish to study the transpiration rates of five hybrids of a particular plant species. For each hybrid, six plants are grown in three pots, two plants per pot. At the end of the growth period, transpiration is measured on four leaves of each plant. We thus have leaves nested within plants, nested within pots, nested within hybrids.
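The horticulture hierarchy can be made concrete by enumerating its sampling units. The following is a minimal sketch; the integer labels and variable names are illustrative assumptions, and only the level counts (5 hybrids, 3 pots per hybrid, 2 plants per pot, 4 leaves per plant) come from the example.

```python
from itertools import product

# Level counts taken from the horticulture example.
HYBRIDS, POTS_PER_HYBRID, PLANTS_PER_POT, LEAVES_PER_PLANT = 5, 3, 2, 4

# Each leaf is identified by its full path: hybrid > pot > plant > leaf.
# The integer labels are hypothetical; pot 1 of hybrid 1 and pot 1 of
# hybrid 2 are different physical pots, which is what nesting means.
leaves = list(product(range(1, HYBRIDS + 1),
                      range(1, POTS_PER_HYBRID + 1),
                      range(1, PLANTS_PER_POT + 1),
                      range(1, LEAVES_PER_PLANT + 1)))

# Distinct units at the intermediate levels of the hierarchy.
pots = {(h, p) for h, p, _, _ in leaves}
plants = {(h, p, q) for h, p, q, _ in leaves}

print(len(pots), len(plants), len(leaves))  # 15 30 120
```

A pot is only identified once its hybrid is known, so the units are keyed by tuples rather than by a bare pot number.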

Anthropometry
Anthropometry is the science of measuring the human body with respect to height, weight and the size of component parts, including skinfold thickness, in order to study and compare relative proportions under normal and abnormal conditions. Anthropometric studies have been applied in several areas, e.g. medicine, art and fashion design, with some interesting results. Baddarudoza et al. applied it to a comparison of female youths aged 19-24 years in Amritsar city, in which anthropometric variables such as height, waist and weight were measured [2]. The results indicated a systematic linear effect and a positive correlation of age and other anthropometric variables with blood pressure. Rea et al., in an anthropometric study, found a significant difference in mean weight between male subjects, 63.9 (s.d. = 9.1) kg, and female subjects, 54.3 (11.9) kg, with P < 0.0001. Men were also significantly taller than women, with a mean height of 162 (5.9) cm compared to 150 (6.7) cm, with P < 0.0001 [14]. Kevai, K and Tanuj, K pointed out some errors researchers make in the use of anthropometry for the identification of individuals [6]. They further emphasized the need to address essential methodological issues when taking measurements in anthropometric studies used to establish forensic standards. Anthropometry is an important field in medicine. Hearne suggests that anthropometrics could be effectively applied to develop ergonomically sound medical devices [5]. Nevin Utkualp and Ilker Ercan (2015) enumerated several uses of anthropometry in medical science, including the evaluation of morbidities in individuals, the development of imaging techniques, the identification of disproportion in organs (mostly facial organs) for corrective plastic surgery, aesthetic dermatology (anti-wrinkle treatment), breast cosmetics, etc. [19].

Nested ANOVA and Medical Science
Nested designs have been relevant in medicine for at least three decades. In 1990, Anderson discussed the misleading effect of using a one-way ANOVA in medical science where a nested ANOVA is more appropriate, in his classic compilation of methodological errors in medical research [1]. Kroodma et al. (2001) advocated the use of nested ANOVA to avoid the problem of pseudoreplication in the analysis of playback experiments [7]. In 2008, German et al. applied the nested design to overcome the prevalent problem of pseudoreplication, which leads to incorrect degrees of freedom, in the measurement of electromyographic (EMG) signals [4]. Prominent biology textbooks such as Biological Design and Analysis by Logan (2010) [8] and Experimental Design and Data Analysis for Biologists by Quinn and Keough (2002) [12] dedicate chapters to the importance of nested designs in biological science. Because experiments in the medical field rarely involve interaction between factors, the nested design is becoming a very suitable method of experimental design in this field. For example, the St Thomas college biology webpage gives an example of a typical experiment in the medical field: a hypothetical clinical trial comparing the success of heart surgery at four different hospitals, each with four different surgeons who operate on a unique set of patients. In this study there are two independent variables (hospital and surgeon), but each surgeon carries out operations in only one of the four hospitals. The surgeon variable is said to be "nested" within the hospital variable. If every surgeon had performed operations at all four hospitals, the surgeon variable would be said to be "crossed" with the hospital variable and a two-factor ANOVA would be appropriate.
They also illustrate how this analysis can be done using statistical software (JMP).
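The distinction between nested and crossed factors in the hospital example can be checked mechanically: a factor is nested when each of its levels occurs under exactly one level of the outer factor. The following is a minimal sketch; the function name `is_nested` and the hospital and surgeon labels are hypothetical and are not taken from the cited webpage.

```python
from collections import defaultdict

def is_nested(pairs):
    """Return True if every inner level appears under exactly one outer level."""
    outers_seen = defaultdict(set)
    for outer, inner in pairs:
        outers_seen[inner].add(outer)
    return all(len(outers) == 1 for outers in outers_seen.values())

# Hypothetical labels: 4 hospitals, each with its own 4 surgeons (nested).
nested_layout = [(f"H{h}", f"S{h}{s}") for h in range(1, 5) for s in range(1, 5)]

# Hypothetical labels: the same 4 surgeons operate at all 4 hospitals (crossed).
crossed_layout = [(f"H{h}", f"S{s}") for h in range(1, 5) for s in range(1, 5)]

print(is_nested(nested_layout), is_nested(crossed_layout))  # True False
```

Both layouts contain 16 hospital-surgeon combinations; only the identity of the surgeons differs, and that difference determines which analysis is appropriate.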
In recent years, nested ANOVA has gained extensive use in medicine. Chis Robert et al. (2015) [15] used a partially nested design to evaluate binary outcomes in clinical trials. Robert T et al. (2017) [16] used a two-way nested ANOVA to show how wavelets can be applied to mammograms for better diagnosis of breast cancer. Berzosa et al. (2018) used a nested design to determine which malaria diagnostic method, among microscopy, rapid diagnostic tests (RDTs) and polymerase chain reaction (PCR), is most efficient in Equatorial Guinea [3]. Ramachandran et al. (2018) used nested ANOVA as part of the analysis in investigating the effect of Cyclophilin A (a secretory protein) on atherosclerosis (a blood vessel disease) [13].

Method
Research Question / Illustration: A hospital in Nigeria with branches across the country wishes to analyze the effect of different ailments on the weight of patients. Using two of its branches, in Owerri and Lagos, the management takes the patients visiting the consulting room in one week as a case study. To allow for homogeneity and to account for the variability due to ailments, they schedule different days of the week to handle specific ailment-related complaints. In particular, they assign two ailments per day for the week. As a statistician, carefully design the experiment for the management and draw conclusions from your findings. A 2 × 5 × 2 nested design was constructed using the simulated data in Table 1, and the method of analysis of variance (ANOVA) was adopted (2 centres: Owerri and Lagos; 5 days of the week; 2 ailments per day; and the weights of 12 patients). The data were analyzed using the Minitab statistical package.
The design for our experiment is an example of a three-stage nested design. The factor in the first stage is Hospital Centre. The nested factor in the second stage is Days of the Week within the Hospital Centre (denoted Days(Centre)). The nested factor in the third stage is Ailments within Days (denoted Ailments(Days)).
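The degrees of freedom implied by this layout follow from the level counts alone. The following is a minimal sketch, assuming a balanced design with a = 2 centres, b = 5 days per centre, c = 2 ailments per day and n = 12 patients per ailment; the function name `nested_df` is illustrative.

```python
def nested_df(a, b, c, n):
    """Degrees of freedom for a balanced three-stage nested design."""
    return {
        "Centre": a - 1,                    # a levels of the top factor
        "Days(Centre)": a * (b - 1),        # b days within each centre
        "Ailments(Days)": a * b * (c - 1),  # c ailments within each day
        "Error": a * b * c * (n - 1),       # n replicates within each cell
        "Total": a * b * c * n - 1,
    }

df = nested_df(a=2, b=5, c=2, n=12)
print(df)
# {'Centre': 1, 'Days(Centre)': 8, 'Ailments(Days)': 10, 'Error': 220, 'Total': 239}
```

The four component values sum to the total of 239, one less than the 240 observations.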

Model Specification:
The balanced three-stage nested design with factors A, B(A) and C(B) is described by:
a = number of levels of factor A
b = number of levels of factor B within the ith level of factor A
c = number of levels of factor C within the jth level of factor B
n = number of replicates for the kth level of C within the jth level of B within the ith level of A
The model for a three-stage nested design is given as
$$y_{ijkl} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + \varepsilon_{l(ijk)}, \quad i = 1, 2, \ldots, a;\ j = 1, 2, \ldots, b;\ k = 1, 2, \ldots, c;\ l = 1, 2, \ldots, n$$
where: $y_{ijkl}$ is the lth observation from the kth level of C within the jth level of B within the ith level of A; $\mu$ is the overall mean; $\alpha_i$ is the ith factor A effect; $\beta_{j(i)}$ is the jth effect of factor B nested within the ith level of factor A; $\gamma_{k(ij)}$ is the kth effect of factor C nested within the jth level of factor B; $\varepsilon_{l(ijk)}$ is the random error of the lth observation from the kth level of C within the jth level of B within the ith level of A.
Assumptions:
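For reference, the textbook partition of the total sum of squares for this balanced model can be written as below. The notation is an assumption introduced here rather than taken from the original: $y_{ijkl}$ denotes an observation, and a dot in a subscript denotes averaging over that index.

```latex
\begin{aligned}
SS_T &= SS_A + SS_{B(A)} + SS_{C(B)} + SS_E, \\
SS_A &= bcn \sum_{i=1}^{a} \left(\bar{y}_{i\cdot\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot\cdot}\right)^2, \\
SS_{B(A)} &= cn \sum_{i=1}^{a}\sum_{j=1}^{b} \left(\bar{y}_{ij\cdot\cdot} - \bar{y}_{i\cdot\cdot\cdot}\right)^2, \\
SS_{C(B)} &= n \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c} \left(\bar{y}_{ijk\cdot} - \bar{y}_{ij\cdot\cdot}\right)^2, \\
SS_E &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{c}\sum_{l=1}^{n} \left(y_{ijkl} - \bar{y}_{ijk\cdot}\right)^2 .
\end{aligned}
```

Each term measures the spread of the means at one stage around the mean of the stage that contains it, which is why no interaction term appears.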

Software Output
Nested ANOVA: Weight versus Centre, Days, Ailment
* Value is negative, and is estimated by zero.

Discussion of Results
The nested ANOVA table shows which factors are significant. A significant factor is one that contributes appreciably to the outcome of the experiment. The F column of the table gives the F-ratio for each factor, obtained by dividing each mean square (MS) by the mean square below it. The F-ratio is then either compared with the tabulated F value or converted to a p-value. As can be seen above, Minitab uses p-values to identify significant factors. Since our confidence level is 95%, p-values less than 0.05 indicate significant factors. The factor Centre, with a p-value of 0.402, is not significant because 0.402 > 0.05, while Days(Centre) and Ailment(Days), with p-values of 0.004 and 0.015, are significant because these are less than 0.05.
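The rule of dividing each mean square by the one below it can be sketched in code. This is an illustrative implementation under stated assumptions, not the Minitab routine: the function name `nested_anova` is hypothetical, the design must be balanced, and the weights are generated at random here. They are not the data of Table 1, so the resulting F-ratios will not match the reported output.

```python
import random
from statistics import fmean

def nested_anova(data):
    """data[i][j][k] holds the n replicates for level k of C within
    level j of B within level i of A (balanced design assumed)."""
    a, b, c, n = len(data), len(data[0]), len(data[0][0]), len(data[0][0][0])
    cell = [[[fmean(obs) for obs in day] for day in centre] for centre in data]
    b_mean = [[fmean(day) for day in centre] for centre in cell]
    a_mean = [fmean(centre) for centre in b_mean]
    grand = fmean(a_mean)
    idx = [(i, j, k) for i in range(a) for j in range(b) for k in range(c)]
    ss = {
        "A": b * c * n * sum((m - grand) ** 2 for m in a_mean),
        "B(A)": c * n * sum((b_mean[i][j] - a_mean[i]) ** 2
                            for i in range(a) for j in range(b)),
        "C(B)": n * sum((cell[i][j][k] - b_mean[i][j]) ** 2 for i, j, k in idx),
        "Error": sum((y - cell[i][j][k]) ** 2
                     for i, j, k in idx for y in data[i][j][k]),
    }
    df = {"A": a - 1, "B(A)": a * (b - 1),
          "C(B)": a * b * (c - 1), "Error": a * b * c * (n - 1)}
    ms = {src: ss[src] / df[src] for src in ss}
    # Each mean square is tested against the mean square of the stage below it.
    f = {"A": ms["A"] / ms["B(A)"],
         "B(A)": ms["B(A)"] / ms["C(B)"],
         "C(B)": ms["C(B)"] / ms["Error"]}
    return ss, df, ms, f

# Hypothetical weights (kg): 2 centres x 5 days x 2 ailments x 12 patients.
rng = random.Random(1)
data = [[[[rng.gauss(70, 9) for _ in range(12)] for _ in range(2)]
         for _ in range(5)] for _ in range(2)]

ss, df, ms, f = nested_anova(data)
for src in ("A", "B(A)", "C(B)"):
    print(f"{src:5s} df={df[src]:3d}  MS={ms[src]:9.3f}  F={f[src]:6.3f}")
```

Converting an F-ratio to a p-value needs the upper tail of the F distribution with the matching degrees of freedom, for example via `scipy.stats.f.sf` if SciPy is available.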
In the expected mean square table, the columns correspond to the variance components, identified across the top. Each expected mean square includes the residual variance with a coefficient of one. The expected mean squares are used to determine which mean square should appear in the denominator of the F-test of the null hypothesis that a particular effect is absent. All this can be seen clearly in the expected mean square table.
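For a balanced three-stage nested design in which all factors are treated as random, the textbook expected mean squares and the resulting variance component estimators take the form below. The all-random treatment is an assumption made here, and the symbols $\sigma^2_\alpha$, $\sigma^2_\beta$, $\sigma^2_\gamma$ for the Centre, Days(Centre) and Ailments(Days) components are introduced for illustration.

```latex
\begin{aligned}
E(MS_A) &= \sigma^2 + n\sigma^2_\gamma + cn\,\sigma^2_\beta + bcn\,\sigma^2_\alpha, &
\hat{\sigma}^2_\alpha &= \frac{MS_A - MS_{B(A)}}{bcn}, \\
E(MS_{B(A)}) &= \sigma^2 + n\sigma^2_\gamma + cn\,\sigma^2_\beta, &
\hat{\sigma}^2_\beta &= \frac{MS_{B(A)} - MS_{C(B)}}{cn}, \\
E(MS_{C(B)}) &= \sigma^2 + n\sigma^2_\gamma, &
\hat{\sigma}^2_\gamma &= \frac{MS_{C(B)} - MS_E}{n}, \\
E(MS_E) &= \sigma^2, &
\hat{\sigma}^2 &= MS_E .
\end{aligned}
```

Under the null hypothesis that an effect is absent, its expected mean square equals that of the line below it, so that line supplies the denominator of the F-test. With $n = 12$, $c = 2$ and $b = 5$ the divisors are 12, 24 and 120. An estimator can come out negative when a mean square is smaller than the one below it, in which case the component is conventionally set to zero.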

Conclusions
This study substantiates the diverse applications and importance of the nested design. Against the usual trend, the concept of nested design has been applied, by way of illustration, to medical science in the study of anthropometry. In the given illustration, sources of variability in the weight (anthropometry) of patients were examined using three factors, namely Centre, Days(Centre) and Ailments(Days), which resulted in a three-stage nested design. The ANOVA table for the resulting design showed that the tests of the variance components for Days(Centre) and Ailments(Days) are significant, with p-values of 0.004 and 0.015 respectively. The variance component estimates are $\sigma^2 = 85.744$, $\sigma^2_{\text{Days}} = 45.416$ and $\sigma^2_{\text{Ailment}} = 9.073$. From these estimates, replication is the largest source of variability, accounting for about 61% of the total variability. Days and Ailments account for about 32% and 6.5% respectively. This shows the successful application of the nested design in the medical field.
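The percentage contributions can be reproduced from the reported estimates. The following is a minimal sketch, assuming the total variability is the sum of the three components listed above; the dictionary labels are illustrative.

```python
# Variance component estimates as reported in the conclusions.
components = {
    "Error (replicates)": 85.744,
    "Days(Centre)": 45.416,
    "Ailments(Days)": 9.073,
}

# Assumption: total variability is the sum of the three listed components.
total = sum(components.values())
shares = {name: 100 * value / total for name, value in components.items()}

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
# Error (replicates): 61.1%
# Days(Centre): 32.4%
# Ailments(Days): 6.5%
```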