Comparison between Robust and Classical Analysis in Bivariate Logistic for Medical Data

Medical and biological data are an important part of experiments concerned with human life. The primary objective of this research is to apply an appropriate statistical method to such data and to identify the important factors affecting the study variables (liver fat, liver size). Because these variables are interconnected, a statistical method is needed that examines the degree of their relationship; we used bivariate logistic regression. The field study was carried out in Al-Sadr medical city in the province of Najaf on a sample of 150 people attending the diabetes and liver disease center. The statistical results show that the model fit is good under both methods (classical and robust). We also identified the factors that affect the responses (liver fat, liver size) and offer some comments on multivariate logistic models as future work.
Journal of Biometrics & Biostatistics


Introduction
In medical and biological field studies, the response in an experiment is often not a continuous variable but the occurrence or non-occurrence of an outcome after a certain treatment (effect or no effect), regardless of whether the explanatory variables on the right side of the general model equation y = xβ + ε are continuous, discrete or categorical.
The method of analysis depends on the type of data on the left side (y). A binary response (0, 1) calls for the binary logit method. An ordinal response with ranks (1st, 2nd, 3rd, ...), for example the degree of healing of an illness or the degree of incidence of a particular disease, is formulated as a (generalized) ordinal logit/probit regression. A response that takes nominal labels (A, B, ...) can rely on the multinomial logit (probit) model.
From a historical point of view, Barry W [1] studied the analysis of binary data with a bivariate response under the influence of some independent variables, assuming a correlation between paired observations, and relied on the logistic regression model to estimate the parameters; the results obtained were efficient compared with the MLE results of other researchers.
Kimberlee G, et al. [2] studied the analysis of correlated binary outcomes using multivariate logistic regression. For the case of two outcomes, a form of the cumulative bivariate logistic distribution proposed by Gumbel is used to characterize their joint probabilities in terms of the logistic marginal probabilities and the correlation coefficient of the responses. They applied this technique in two different situations: when the correlation among the responses is not significantly different from zero, and when it is.
In 1997, Sean M. and David B [3] studied Bayesian analyses of multivariate binary (categorical) outcomes, where β is a vector of unknown regression coefficients with a Normal prior distribution.
Thomas Y [4] studied bivariate binomial responses using vgam family functions, with B = breathlessness and W = wheeze (B = i, W = j; i, j = 0, 1). Hun, M., in 2009, studied regression models for binary dependent variables and their analysis using Stata, SAS and other packages. He used data from a study of clinicians and practitioners with two responses (trust: 1 respondent, 0 otherwise; www, internet use: 1 respondent, 0 otherwise) and five independent variables.
In 2014, Tabatabai, MA, et al. [5] studied robust methods for logistic and probit regression and compared them with MLE in the presence of outliers. They relied on real data and on simulation experiments with independent variables x_i (i = 1, 2) and showed that the robust method is efficient, but they did not apply it to a bivariate logistic response.

Logistic distribution
In this part of the paper we present some basic concepts for applying the logistic distribution to experiments in which the response (dependent) variable is binary or categorical (nominal or classified) rather than continuous. Logistic regression does not require the well-known assumptions of the linear regression model, but there should be no outliers in the data, and it assumes linearity between the independent variables and the log odds. It also requires quite large sample sizes, because it relies on maximum likelihood estimation, and the bivariate case assumes identically distributed observations following the bivariate standard normal pdf. The research relies on statistical estimation methods such as bivariate OLS or robust multivariate methods such as the M-estimator and R-estimator [8]. As a result of developments in software, one of these packages (Stata) can be relied upon; it includes several functions that are explained in the application part [9,10].

Statistical Techniques
A set of criteria was used to determine the statistical model appropriate to the data and to test the effect of the independent variables on the dependent variables [1,9]: likelihood-ratio chi-square test statistics or Wald chi-square tests with their p-values; Akaike's information criterion (AIC) and the Bayesian information criterion (BIC); and parameter estimates and standard errors of the study/exposure variables.
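As a minimal sketch of how these criteria are computed from a fitted model, the following Python functions use the usual textbook definitions (LR = 2(ln L_full - ln L_null), AIC = 2k - 2 ln L, BIC = k ln n - 2 ln L, Wald chi-square = (estimate / standard error)^2 with 1 degree of freedom). The function names and the numbers in the example are hypothetical and are not taken from the study.

```python
import math

def fit_statistics(loglik_full, loglik_null, n_params, n_obs):
    """Return the LR statistic, AIC and BIC of a fitted model.

    loglik_full: maximized log-likelihood of the fitted model
    loglik_null: maximized log-likelihood of the intercept-only model
    n_params:    number of estimated parameters in the fitted model
    n_obs:       sample size
    """
    lr = 2.0 * (loglik_full - loglik_null)
    aic = 2.0 * n_params - 2.0 * loglik_full
    bic = n_params * math.log(n_obs) - 2.0 * loglik_full
    return {"lr": lr, "aic": aic, "bic": bic}

def wald_test(estimate, std_error):
    """Wald chi-square (1 df) and its p-value for a single coefficient."""
    chi2 = (estimate / std_error) ** 2
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical inputs, for illustration only (not results from the paper).
stats = fit_statistics(loglik_full=-80.0, loglik_null=-100.0,
                       n_params=5, n_obs=150)
chi2, p = wald_test(estimate=0.98, std_error=0.5)
```

Smaller AIC and BIC values indicate a better trade-off between fit and number of parameters, which is how such criteria are normally used to compare candidate models.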
Some tables describe the spread of the phenomena (fatty liver, increased liver size) according to gender and age classes.

Application
The application relies on data from a random sample in a medical study in the field of pathological analyses, conducted by the researcher [11] at Al-Sadr Teaching Hospital in the province of Najaf, on the relationship between blood variables and liver disease (fat and size) in humans. A random sample of 150 patients was selected. The study variables are:

Dependent variables
Fatty_type - the appearance of fat on the liver (0 = no fat, 1 = fat present).
Liver_type - increase in liver size (0 = normal, 1 = increased size).
The analysis was carried out in the statistical software Stata (ver. 2014) using bivariate binary logistic regression, because the response variables in the experiment are binary (1, 0) and the lack of independence between the response variables has to be taken into account. The way to run the procedure is shown in the following illustrative screen [12] (Figure 1). The data were analyzed under the following cases:
Classical method: without robust analysis, we obtained the following results. From the results shown in Table 1, the following is clear: the likelihood ratio (LR) test value shows that the model used for the analysis is appropriate (χ² = 58.34 with p < 0.0001), and there is a positive relationship between the response variables of the study.
Logistic models are classified according to the number of response (dependent) variables [6,7]:

Binary regression model
This model depends on the logistic equation for the probability of an event given X. After a series of mathematical operations we obtain the formula of binary multiple regression, in which the logit of the probability of an event given X is a linear function of X; MLE or OLS can then be applied to complete the inference on the parameters β.
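For reference, the standard form of the binary logistic model, assumed here to be the one this subsection refers to, can be written as follows. The symbols p(X), k and β_0, ..., β_k are notation chosen for this sketch.

```latex
\begin{align}
p(X) = P(Y = 1 \mid X)
  &= \frac{\exp(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}
          {1 + \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}, \\
\operatorname{logit} p(X) = \ln \frac{p(X)}{1 - p(X)}
  &= \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k .
\end{align}
```

The second line follows from the first by solving for the odds p(X) / (1 - p(X)) and taking logarithms, which is the step that makes the model linear in β.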

Bivariate logistic distribution
This model is used for logistic analysis in medical and biological studies. Examples are eye tests, where one response may differ from the other (uncorrelated) or the two may be interrelated, and agricultural experiments in which two plants in the same plot are tested and each shows a binary response. The response is then a vector of two binary values. We can use bivariate probit regression models, which have two equations, one for each of the two binary dependent variables.
Some of the independent variables affect the dependent variable Fatty_type: BMI is highly significant (p-value = 0.009), the liver enzyme (Got) is significant (p = 0.031), and sugar control (Hb1Ac) has a weak effect (p-value = 0.07), while the other variables show no significant effect.
Some of the independent variables also affect the dependent variable Liver_type: the liver enzyme (Got) has a strong effect on increased liver size (p-value = 0.005), cholesterol is significant (p = 0.013), and age is significant (p-value = 0.046), while the other variables show no significant effect.
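The two-equation structure mentioned in this section is commonly written in latent-variable form. The following is a sketch of that standard formulation, assumed here rather than taken from the paper; the symbols y_1^*, y_2^*, ε_1, ε_2 and ρ are notation introduced for illustration, with y_1 standing for Fatty_type and y_2 for Liver_type.

```latex
\begin{align}
y_1^{*} &= \mathbf{x}^{\top}\boldsymbol{\beta}_1 + \varepsilon_1,
  & y_1 &= \begin{cases} 1 & \text{if } y_1^{*} > 0, \\ 0 & \text{otherwise,} \end{cases} \\
y_2^{*} &= \mathbf{x}^{\top}\boldsymbol{\beta}_2 + \varepsilon_2,
  & y_2 &= \begin{cases} 1 & \text{if } y_2^{*} > 0, \\ 0 & \text{otherwise,} \end{cases} \\
\operatorname{Corr}(\varepsilon_1, \varepsilon_2) &= \rho .
\end{align}
```

In the bivariate probit case the pair (ε_1, ε_2) follows a bivariate standard normal distribution with correlation ρ. When ρ = 0 the two responses can be modelled by two separate binary regressions, so a test of ρ = 0 indicates whether the joint model is needed.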

Robust method:
We chose a robust method of analysis to determine the effect of outlier values. Outliers were assumed in some (10%) of the values of the independent variables, because the dependent variables are binary (0, 1). The results are shown in Table 2. The likelihood ratio (LR) test value shows that the model used for the analysis is appropriate (χ² = 71.71 with p < 0.0001).
The results of the robust analysis do not differ from those without it. For Fatty_type, BMI has a highly significant effect (p-value = 0.009), the liver enzyme (Got) is significant (p-value = 0.024), and sugar control (Hb1Ac) has a weak effect (p-value = 0.068), while the other independent variables show no significant effect.
For Liver_type, the liver enzyme (Got) has a strong effect on increased liver size (p-value = 0.002), cholesterol is significant (p-value = 0.016), and age is significant (p-value = 0.031), while the other variables show no significant effect.
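The contamination step described at the start of this section (outliers in 10% of the values of an independent variable) can be illustrated with a short Python sketch. The paper does not state how the outliers were generated, so the rule used here, adding a fixed number of sample standard deviations to randomly chosen entries, is an assumption, and the function name and the simulated BMI values are hypothetical.

```python
import random

def contaminate(values, fraction=0.10, shift=5.0, seed=0):
    """Return a copy of `values` with a fraction of entries turned into outliers.

    Assumption (not from the paper): an outlier is created by adding
    `shift` sample standard deviations to a randomly chosen entry.
    """
    rng = random.Random(seed)
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    n_out = round(fraction * n)            # number of entries to contaminate
    idx = rng.sample(range(n), n_out)      # positions chosen without replacement
    out = list(values)
    for i in idx:
        out[i] = values[i] + shift * sd
    return out, sorted(idx)

# Hypothetical covariate for 150 patients (simulated, not the study data).
rng = random.Random(1)
bmi = [rng.gauss(28.0, 4.0) for _ in range(150)]
bmi_contaminated, outlier_idx = contaminate(bmi)
```

Fitting the same model to the original and the contaminated covariate, and comparing the estimates, is one way to judge how sensitive a classical or robust fit is to such values.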
Additional statistical tables are needed to show the prevalence of fatty liver and increased liver size in the study sample according to social characteristics (gender, age group).
From Table 3 it is clear that the explanatory variables (BMI, Age, Hb1Ac, Got, Cho) have a confirmed impact on fatty liver changes and increased liver size, while the other explanatory variables show no influence on these responses.
In the research sample, fatty liver disease is more widespread in females than in males. It is most prevalent in the age group 40-55 years, while increased liver size is most prevalent in the age group 55-70 years. This is medically plausible, because fatty liver disease in the earlier age group leads, after some years, to an increase (enlargement) in the size of the liver.

Recommendations
Use simulations to test the performance of the methods used under several conditions (type of error distribution, sample size, and a high assumed correlation between the two response variables).
Further studies are required on the analysis of multivariate logistic models.
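A simulation of the kind recommended here needs pairs of binary responses with a chosen correlation. The following Python sketch generates such pairs from the two marginal probabilities and a correlation coefficient, using the identity P(Y1 = 1, Y2 = 1) = p1 p2 + ρ sqrt(p1 (1 - p1) p2 (1 - p2)). The function names and the parameter values are hypothetical, and the sketch covers only data generation, not model fitting.

```python
import math
import random

def joint_probs(p1, p2, rho):
    """Cell probabilities of (Y1, Y2) from marginals p1, p2 and correlation rho."""
    cov = rho * math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    p11 = p1 * p2 + cov
    p10 = p1 - p11
    p01 = p2 - p11
    p00 = 1.0 - p11 - p10 - p01
    probs = {(1, 1): p11, (1, 0): p10, (0, 1): p01, (0, 0): p00}
    if min(probs.values()) < 0:
        # Not every correlation is attainable for given marginals.
        raise ValueError("rho is not attainable for these marginals")
    return probs

def simulate_pairs(n, p1, p2, rho, seed=0):
    """Draw n pairs (y1, y2) of correlated binary responses."""
    rng = random.Random(seed)
    probs = joint_probs(p1, p2, rho)
    cells = list(probs)
    weights = [probs[c] for c in cells]
    return rng.choices(cells, weights=weights, k=n)

def sample_correlation(pairs):
    """Pearson correlation of two 0/1 variables."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / n
    return cov / math.sqrt(m1 * (1 - m1) * m2 * (1 - m2))

# Hypothetical settings: marginal probabilities 0.4 and 0.3, correlation 0.5.
pairs = simulate_pairs(20000, p1=0.4, p2=0.3, rho=0.5)
r_hat = sample_correlation(pairs)
```

Repeating the draw for several sample sizes and values of rho, and fitting the classical and robust models to each data set, would give the comparison the recommendation asks for.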