Non-Linear Regression Models for Heart Attack Data – An Empirical Comparison

Non-linear regression is a popular statistical tool that has been used successfully in many areas of research. The Cox proportional hazard model has achieved widespread use in the analysis of time-to-event data with censoring and covariates, and it can be extended to covariates whose values change over time. A frailty model is a random effects model for time variables, where the random effect (the frailty) has a multiplicative effect on the hazard. It can be used for univariate (independent) failure times, i.e., to describe the influence of unobserved covariates in a proportional hazards model. The aim of this paper is to compare the performance of the Cox proportional hazard model, the Cox time dependent model and the frailty model using heart attack data. The results show that the Cox time dependent model performs better than the other models.


Introduction
Survival analysis is concerned with studying the time between entry to a study and a subsequent event, and it has become one of the most important fields in statistics. The techniques developed in survival analysis are now applied in many fields, such as biology (survival time), engineering (failure time), medicine (treatment effect of drugs), quality control (lifetime of components) and credit risk modeling in finance (default time of a firm).
The Cox proportional hazard model is now the most widely used model for the analysis of survival data in the presence of covariates or prognostic factors. It is popular because of its simplicity and because it is not based on any assumptions about the survival distribution. The model assumes that the underlying hazard rate is a function of the independent covariates, but no assumptions are made about the nature or shape of the hazard function. In the last several years, the theoretical basis for the model has been solidified by connecting it to the study of counting processes and martingale theory, as discussed in the books of Fleming and Harrington (1991) and of Andersen et al. (1993). These developments have led to the introduction of several new extensions of the original model. However, the Cox proportional hazard model may not be appropriate in many situations, and modifications such as the stratified Cox model (Kleinbaum, 1996) or the Cox model with time dependent variables (Collett, 2003) can be used for the analysis of survival data instead.
A frailty model is a random effects model for time variables, where the random effect (the frailty) has a multiplicative effect on the hazard. It can be used for univariate (independent) failure times, i.e., to describe the influence of unobserved covariates in a proportional hazards model. Of greater interest, however, is the multivariate (dependent) case, in which failure times are generated as conditionally independent times given the frailty. This approach can be used for survival times of related individuals, like twins or family members, and for repeated events for the same individual. The standard assumption is to use a gamma distribution for the frailty, but this is a restriction that implies that the dependence is most important for late events (Hougaard, 1995). The word 'frailty' was introduced by Vaupel et al. (1979) for univariate data. These models are used to explain the deviant behavior of mortality rates at advanced ages (Vaupel and Yashin, 1985), to correct biased estimates of regression coefficients in Cox-type models of hazard rate (Chamberlain, 1985) and to separate compositional and biological effects in aging studies (Manton et al., 1986). Frailty models play an important role in the interpretation of the results of stress experiments (Yashin et al., 1996a) and in centenarian studies. The use of survival data on related individuals opens a new avenue in frailty modeling. Genetic variation, heritability, and other properties of individual susceptibility to death can now be analyzed using correlated frailty models (Yashin and Lachine, 1995; Yashin and Lachine, 1997).
In heart attack studies, the main outcome of interest is the time to the occurrence of an event such as death or relapse. Coronary heart disease (CHD) is the leading cause of death worldwide (Mackay & Mensah, 2004). Although men have higher rates than women at all ages, and coronary disease occurs up to 10 years later in women (Sharp, 1994), CHD is a major cause of death for both sexes: the World Health Organisation estimates that 3.8 million men and 3.4 million women around the world die from it each year (Mackay & Mensah, 2004). Despite recent improvements, the mortality rate in the UK remains amongst the highest in the world and coronary prevention is a priority (The Scottish Office, 1999; Department of Health, 1999).
The objective of this paper is to compare the performance of the Cox proportional hazard model, the Cox time dependent model and the frailty model using heart attack data.

Cox Proportional Hazard Model
The Cox proportional hazard model is a mathematical modeling approach for estimating survival curves when several explanatory variables are considered simultaneously. It is also called a semi-parametric model. The proportional hazard model describes the relationship between the hazard function of the risk of an event and a set of covariates, and it is usually written in terms of the hazard function. As described by Cox (1972), it is

$$h(t, X) = h_0(t) \exp\left(\sum_{i=1}^{p} \beta_i X_i\right) \qquad (1)$$

where $h_0(t)$ is the baseline hazard, $\beta = (\beta_1, \ldots, \beta_p)$ is the parameter vector and $X = (X_1, \ldots, X_p)$ are the independent variables. Equation (1) reveals that the hazard at time $t$ is the product of two quantities. The first of these, $h_0(t)$, is called the baseline hazard function. The second quantity is the exponential expression $\exp\left(\sum_{i=1}^{p} \beta_i X_i\right)$. This model gives an expression for the hazard at time $t$ for an individual with a given specification of a set of explanatory variables denoted by $X$. An important feature of equation (1), which concerns the proportional hazard assumption, is that the baseline hazard is a function of $t$ but does not involve the $X$'s. The baseline hazard function is left unspecified, so the time-to-event random variable is not assumed to follow any particular distribution; this is one of the essential properties of the proportional hazard model (Lee, 1992).
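The proportionality property described above can be illustrated with a minimal Python sketch. It evaluates a hazard of the form h0(t) exp(sum of beta_i X_i) for two individuals and checks that their hazard ratio is the same at every time point. The coefficients, covariate values and the Weibull-type baseline hazard are hypothetical and chosen only for illustration; they are not estimates from the heart attack data.

```python
import math

def cox_hazard(t, x, beta, baseline_hazard):
    """Hazard h(t, X) = h0(t) * exp(sum_i beta_i * x_i)."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline_hazard(t) * math.exp(linear_predictor)

def hazard_ratio(x_a, x_b, beta):
    """h(t, x_a) / h(t, x_b); the baseline hazard cancels, so t drops out."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(beta, x_a, x_b)))

def assumed_baseline(t):
    """Hypothetical Weibull-type baseline hazard, for illustration only."""
    return 0.03 * math.sqrt(t)

beta = [0.05, 0.7]          # assumed coefficients for (age, sex)
patient_a = [70.0, 1.0]     # hypothetical covariate values
patient_b = [60.0, 1.0]

# Ratio of the two hazards evaluated at several time points.
ratios_over_time = [
    cox_hazard(t, patient_a, beta, assumed_baseline)
    / cox_hazard(t, patient_b, beta, assumed_baseline)
    for t in (1.0, 5.0, 10.0)
]
expected_ratio = hazard_ratio(patient_a, patient_b, beta)  # exp(0.05 * 10)
```

Every entry of `ratios_over_time` agrees with `expected_ratio` up to rounding, whatever baseline hazard is supplied, which is why the baseline can be left unspecified when the coefficients are estimated.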

Cox Time Dependent Covariate Model
Time dependent covariates have been studied by a number of authors (Crowley & Hu, 1977; Cox & Oakes, 1984; Andersen, 1986; Fisher & Lin, 1999). A baseline Cox analysis, which ignores updated covariate values, usually yields smaller effect estimates than a time dependent analysis that uses all the temporal information available (Aydemir et al., 1999). Altman and De Starola (1994) called this the time decay of the effects of entry values. One of the earliest applications of time varying covariates in a biomedical setting may be found in Crowley (1977).
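Let $x(t)$ denote the value of the covariate measured at time $t$. Let $x_{lk}(t_i)$ denote the value of covariate $k$ for subject $l$ at time $t_i$. This notation is completely general in the sense that, if a particular covariate $k$ is fixed, then $x_{lk}(t_i) = x_{lk}$, and this has led to the exclusive use of the time dependent notation. The generalization of the proportional hazard regression function to include possibly multiple time varying covariates is

$$h(t, x(t), \beta) = h_0(t) \exp\left[x'(t)\beta\right]$$

and the generalization of the partial likelihood function is

$$l_p(\beta) = \prod_{i=1}^{n} \left[\frac{\exp\left(x_i'(t_i)\beta\right)}{\sum_{l \in R(t_i)} \exp\left(x_l'(t_i)\beta\right)}\right]^{c_i}$$

where $R(t_i)$ is the set of subjects at risk at time $t_i$ and $c_i$ equals 1 if subject $i$ experiences the event and 0 if the observation is censored.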
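As a rough illustration of how a partial likelihood with a time varying covariate is evaluated, the following Python sketch computes the log partial likelihood for a single covariate: at each event time, every subject still at risk contributes the covariate value it has at that time. The data layout, the step-function covariate and the three subjects are hypothetical, and the sketch assumes there are no tied event times; it is not the estimation routine used in the paper.

```python
import math

def partial_log_likelihood(beta, subjects):
    """Log partial likelihood for one time dependent covariate.

    Each subject is a tuple (time, event, covariate_fn), where event is 1
    for an observed event and 0 for censoring, and covariate_fn(t) returns
    the covariate value x(t). Assumes no tied event times.
    """
    total = 0.0
    for time_i, event_i, x_i in subjects:
        if event_i != 1:
            continue  # censored subjects enter only through the risk sets
        # Risk set at time_i: subjects whose follow-up time is >= time_i,
        # each evaluated at its covariate value at time_i.
        risk_sum = sum(
            math.exp(beta * x_l(time_i))
            for time_l, _, x_l in subjects
            if time_l >= time_i
        )
        total += beta * x_i(time_i) - math.log(risk_sum)
    return total

def step(change_time):
    """Covariate that switches from 0 to 1 at change_time (hypothetical)."""
    return lambda t: 1.0 if t >= change_time else 0.0

# Three hypothetical subjects: (follow-up time, event indicator, covariate).
subjects = [
    (2.0, 1, step(1.0)),            # covariate switches on before the event
    (4.0, 0, step(float("inf"))),   # covariate never switches on; censored
    (5.0, 1, step(3.0)),
]

ll_at_zero = partial_log_likelihood(0.0, subjects)
ll_at_one = partial_log_likelihood(1.0, subjects)
```

In practice the coefficient would be obtained by maximizing this function over beta. Statistical packages usually handle time varying covariates by splitting each subject's follow-up into intervals on which the covariate is constant, which yields the same risk-set sums.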

Frailty Models
Frailty is an unobserved random factor that multiplicatively modifies the hazard function of an individual, or of a group or cluster of individuals. Vaupel et al. (1979) introduced the univariate frailty model (with a gamma distribution) into survival analysis to account for unobserved heterogeneity or missing covariates in the study population. The idea is to suppose that different patients possess different frailties, and that patients who are more frail or prone tend to have the event earlier than those who are less frail. The model is represented by the following hazard given the frailty:

$$h(t \mid Z) = Z\, h(t)$$

where $h(t)$ can be equal to the baseline hazard function $h_0(t)$ (as in the Cox regression model). The baseline hazard function can be chosen non-parametrically or parametrically. An important point is that the frailty $Z$ is an unobservable random variable varying over the sample, which increases the individual risk if $Z > 1$ or decreases it if $Z < 1$. The model can also be represented by its conditional survivor function

$$S(t \mid Z) = \exp\left(-Z\, H(t)\right), \qquad \text{where } H(t) = \int_0^t h(s)\, ds.$$

The marginal survivor function can be calculated by

$$S(t) = E\left[S(t \mid Z)\right] = L\left(H(t)\right),$$

where $L$ denotes the Laplace transform of the frailty distribution. Elbers and Ridder (1982) proved that a frailty model with finite mean is identifiable with univariate data when covariates are included in the model. Many distributions can be chosen for the frailty, but the most common choice is the gamma distribution. The gamma distribution has been widely applied as a mixture distribution (Clayton, 1978; Hougaard, 2000; Oakes, 1982; Vaupel et al., 1979; Yashin et al., 1995). Other distributions that are sometimes applied for the frailty are the normal, the log normal (McGilchrist and Aisbett, 1991), the three-parameter power variance function (PVF) distribution (Hougaard, 1986), and the inverse Gaussian distribution. The effect of different frailty distributions is investigated by Congdon (1995). If the value of the frailty is assumed to be constant within groups, the models are called shared frailty models.
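The step from the conditional to the marginal survivor function can be sketched for the gamma case. For a gamma frailty with mean 1 and variance theta, the Laplace transform is L(s) = (1 + theta s)^(-1/theta), so the marginal survivor function has a closed form. The Python sketch below compares that closed form with a Monte Carlo average of the conditional survivor function over simulated frailties. The constant baseline hazard of 0.1, the value theta = 0.5 and the function names are assumptions made for illustration only.

```python
import math
import random

def conditional_survival(t, z, cumulative_hazard):
    """S(t | Z = z) = exp(-z * H(t))."""
    return math.exp(-z * cumulative_hazard(t))

def gamma_marginal_survival(t, theta, cumulative_hazard):
    """Marginal S(t) for gamma frailty with mean 1 and variance theta."""
    return (1.0 + theta * cumulative_hazard(t)) ** (-1.0 / theta)

def simulated_marginal_survival(t, theta, cumulative_hazard, n=100_000, seed=0):
    """Monte Carlo estimate of E[S(t | Z)] over gamma-distributed frailties."""
    rng = random.Random(seed)
    # Shape 1/theta and scale theta give mean 1 and variance theta.
    total = sum(
        conditional_survival(t, rng.gammavariate(1.0 / theta, theta),
                             cumulative_hazard)
        for _ in range(n)
    )
    return total / n

def assumed_cumulative_hazard(t):
    """Hypothetical H(t) for a constant baseline hazard of 0.1."""
    return 0.1 * t

theta = 0.5
closed_form = gamma_marginal_survival(5.0, theta, assumed_cumulative_hazard)
simulated = simulated_marginal_survival(5.0, theta, assumed_cumulative_hazard)
no_frailty = math.exp(-assumed_cumulative_hazard(5.0))  # survivor at Z = 1
```

Here `closed_form` equals 1.25 ** -2 = 0.64, and `simulated` should agree with it up to Monte Carlo error. Both exceed `no_frailty`, reflecting that the more frail individuals fail early and leave a more robust population at risk.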

Application to Heart attack data
The data were obtained from the John Wiley & Sons website, ftp://ftp.wiley.com/public/scitech_med/survival. They may also be obtained from the website for statistical services at the University of Massachusetts at Amherst by going to the data sets link and then the section on survival data, http://www.umass.edu/statdata/statdata. The data from the Worcester Heart Attack Study (WHAS) have been provided by Goldberg (2005) of the Department of Cardiology at the University of Massachusetts Medical School. Data have been collected during 13 one-year periods (1975-1988) on all myocardial infarction (MI) patients admitted to hospital in the Worcester, Massachusetts Standard Metropolitan Statistical Area. An event is coded as 1 and censoring is coded as 0.

Results
The non-linear regression models were fitted using Stata 12, and the results are presented in Table 2.