Biological clustering supports both “Dutch” and “British” hypotheses of asthma and chronic obstructive pulmonary disease

Background Asthma and chronic obstructive pulmonary disease (COPD) are heterogeneous diseases. Objective We sought to determine, in terms of their sputum cellular and mediator profiles, the extent to which they represent distinct or overlapping conditions supporting either the “British” or “Dutch” hypotheses of airway disease pathogenesis. Methods We compared the clinical and physiological characteristics and sputum mediators between 86 subjects with severe asthma and 75 with moderate-to-severe COPD. Biological subgroups were determined using factor and cluster analyses on 18 sputum cytokines. The subgroups were validated on independent severe asthma (n = 166) and COPD (n = 58) cohorts. Two techniques were used to assign the validation subjects to subgroups: linear discriminant analysis, or the best identified discriminator (single cytokine) in combination with subject disease status (asthma or COPD). Results Discriminant analysis distinguished severe asthma from COPD completely using a combination of clinical and biological variables. Factor and cluster analyses of the sputum cytokine profiles revealed 3 biological clusters: cluster 1: asthma predominant, eosinophilic, high TH2 cytokines; cluster 2: asthma and COPD overlap, neutrophilic; cluster 3: COPD predominant, mixed eosinophilic and neutrophilic. Validation subjects were classified into 3 subgroups using discriminant analysis, or disease status with a binary assessment of sputum IL-1β expression. Sputum cellular and cytokine profiles of the validation subgroups were similar to the subgroups from the test study. Conclusions Sputum cytokine profiling can determine distinct and overlapping groups of subjects with asthma and COPD, supporting both the British and Dutch hypotheses. These findings may contribute to improved patient classification to enable stratified medicine.

Abbreviations used: COPD, chronic obstructive pulmonary disease; ROC, receiver operating characteristic; ROC AUC, area under the receiver operating characteristic curve

Asthma and chronic obstructive pulmonary disease (COPD) cause considerable morbidity and consume substantial health care resources. 1,2 Both conditions are characterized by airflow obstruction, which is typically variable and reversible in asthma but fixed in COPD. However, there is overlap: airflow obstruction can be persistent in severe asthma and partially reversible in COPD. Likewise, although some reports have suggested that there are marked differences in the patterns of underlying inflammation, 3 cellular mechanisms, inflammatory mediators, and response to therapy 4 between asthma and COPD, others have demonstrated considerable heterogeneity in severe asthma [5][6][7] and COPD [8][9][10][11] with overlap between the conditions. [12][13][14] Indeed, there is an ongoing debate between the “Dutch hypothesis,” which proposes that asthma and COPD are manifestations of the same basic disease process, and the “British hypothesis,” which suggests that asthma and COPD are distinct entities generated by different mechanisms. 15 The need to refocus efforts to define the similarities and differences between asthma, particularly severe asthma, and COPD in terms of cytokine profiles 16 is underscored by the emergence of highly specific anti-inflammatory therapies, because response is more likely to be phenotype specific than disease specific. 17 This is perhaps best exemplified by anti-IL-5 approaches, which have demonstrated clinical responses related to underlying eosinophilic lung inflammation in asthma; 18,19 similar strategies are currently being tested in COPD. To enable these and further analogous developments, there is an urgent need to define the airway cytokine profiles in asthma and COPD.
We hypothesized that there are distinct sputum cytokine profiles that are COPD and asthma specific and another that represents asthma and COPD overlap.

METHODS

Subjects
Subjects with severe asthma or moderate-to-severe COPD were recruited from a single center at the Glenfield Hospital, Leicester, United Kingdom, into independent test and validation studies. Assignment to asthma or COPD was made by the subjects' physician consistent with definitions of asthma and COPD according to the Global Initiative for Asthma 1 or the Global Initiative for Chronic Obstructive Lung Disease 2 guidelines, respectively, for both the test and validation groups. All subjects were assessed at stable visits at least 8 weeks free from an exacerbation, defined as an increase in symptoms necessitating a course of oral corticosteroids and/or antibiotic therapy. The subjects with COPD had participated in an exacerbation study, 20,21 and some of the subjects with asthma had participated in an earlier study. 22 All subjects provided written informed consent, and the studies were approved by the local Leicestershire, Northamptonshire, and Rutland ethics committee.

Measurements
Demographic, clinical, and lung-function data were recorded, including pre- and postbronchodilator FEV1, forced vital capacity, and symptom scores using the visual analogue scale. Spontaneous or induced sputum was collected for sputum total and differential cell counts and bacteriology; cell-free sputum supernatant was used for mediator assessment as described previously. 20 Sputum was produced spontaneously in 93% of the subjects. Positive bacterial colonization was defined as colony-forming units greater than 10^7/mL sputum or positive culture. 20,21 Subjects with sputum eosinophil and neutrophil differential cell counts above 3% 23,24 and 61% 25 were defined as eosinophilic or neutrophilic, respectively. The subjects were further stratified into 4 subgroups on the basis of their sputum cell counts: pure eosinophilic (eosinophil > 3% and neutrophil < 61%), pure neutrophilic (eosinophil < 3% and neutrophil > 61%), mixed granulocytic (eosinophil > 3% and neutrophil > 61%), and paucigranulocytic (eosinophil < 3% and neutrophil < 61%). Inflammatory mediators were measured in sputum supernatants using the Meso Scale Discovery Platform (MSD; Gaithersburg, Md). The mediators measured were selected to reflect cytokines, chemokines, and proinflammatory mediators implicated in airway disease. The performance of the MSD platform in terms of recovery of spiked exogenous recombinant proteins has been described previously. 16 Sputum inflammatory mediators that were below the detectable range were replaced with their corresponding lower limit of detection in subjects with both asthma and COPD. Twenty-one mediators were included in the test study, and 14 mediators were available in the validation study.
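The sputum cell-count stratification and the handling of below-range mediator values described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names are hypothetical, and because the text defines the subgroups with strict inequalities only, the sketch assumes that a count exactly at a cutoff is treated as not elevated.

```python
EOS_CUTOFF_PCT = 3.0    # sputum eosinophil differential count cutoff (%), from the text
NEUT_CUTOFF_PCT = 61.0  # sputum neutrophil differential count cutoff (%), from the text


def granulocyte_subgroup(eos_pct: float, neut_pct: float) -> str:
    """Assign one of the 4 sputum cell-count subgroups.

    Assumption: values exactly equal to a cutoff count as "not above" it.
    """
    eosinophilic = eos_pct > EOS_CUTOFF_PCT
    neutrophilic = neut_pct > NEUT_CUTOFF_PCT
    if eosinophilic and neutrophilic:
        return "mixed granulocytic"
    if eosinophilic:
        return "pure eosinophilic"
    if neutrophilic:
        return "pure neutrophilic"
    return "paucigranulocytic"


def impute_lower_limit(value, lower_limit: float) -> float:
    """Replace a missing or below-range reading with the lower limit of detection."""
    if value is None or value < lower_limit:
        return lower_limit
    return value
```

For example, a subject with 5% eosinophils and 40% neutrophils falls in the pure eosinophilic subgroup, and a reading below a hypothetical detection limit of 0.6 would be stored as 0.6.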

Statistical analysis
See this article's Online Repository at www.jacionline.org for detailed statistical methods. All statistical analyses were performed using STATA/IC version 13.0 for Windows (StataCorp, College Station, Tex) and R version 2.15.1 (R Foundation for Statistical Computing, Vienna, Austria). Parametric data were presented as mean with SEM, and log-transformed data were presented as geometric mean with 95% CI. The χ² test or the Fisher exact test was used to compare proportions, and 1-way ANOVA was used to compare means across multiple groups; nonparametric data were presented as median with first and third quartiles, and the Kruskal-Wallis test was used to compare subgroups. Inflammatory mediators that significantly discriminated asthma from COPD and bacterially colonized from noncolonized subjects were identified using receiver operating characteristic (ROC) curves. Factor analysis was performed on sputum inflammatory mediators, and independent factor scores were derived and used as input variables in the k-means cluster analysis to identify subjects' biological subgroups. The optimal number of clusters was chosen on the basis of a scree plot, by plotting the within-cluster sum of squares against a series of sequential numbers of clusters. Linear discriminant analysis was performed, and a classification model was developed from the test study for the validation study. In addition, classification and regression trees analysis was performed sequentially on all inflammatory mediators in the test study that had high discriminant function to identify possibly clinically relevant cutoff points. The inflammatory mediator cutoff points with the highest sensitivity ratio in discriminating the clusters, together with subject disease status (asthma or COPD), were applied to classify the validation study into subgroups. A P value of less than .05 was taken as statistically significant.
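As a reminder of what the area under the ROC curve measures when a single mediator is used to separate two groups, the following minimal Python sketch computes it as the probability that a randomly chosen subject from one group has a higher value than a randomly chosen subject from the other, with ties counted as one half. The function name and the numbers are hypothetical and are not study data; the analyses in this article were run in Stata and R.

```python
def roc_auc(group_a, group_b):
    """Empirical ROC AUC for a mediator expected to be higher in group_a.

    Equals the fraction of (a, b) pairs with a > b, counting ties as 0.5.
    """
    if not group_a or not group_b:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(group_a) * len(group_b))


# Illustrative, made-up mediator concentrations (pg/mL) for two small groups.
higher_group = [12.0, 30.0, 45.0, 8.0]
lower_group = [5.0, 9.0, 10.0, 7.0]
auc = roc_auc(higher_group, lower_group)  # 14 of 16 pairs favor higher_group
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to complete separation of the two groups.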

RESULTS
The clinical and sputum characteristics of the asthma (n = 86) and COPD (n = 75) test groups are presented in Table I. Subjects with asthma were younger and had a higher body mass index, better lung function, fewer symptoms, and a lower smoking pack-year history than did subjects with COPD. The differential neutrophil and eosinophil counts were not statistically different between the groups, but the total cell count was higher in those with COPD. The sputum inflammatory mediator profiles were distinct, with increased TH2 (IL-5, IL-13, and CCL26) and TH1 (CXCL10 and CXCL11) mediators in severe asthma compared with COPD and increased IL-6, CCL2, CCL3, and CCL4 in COPD compared with severe asthma. Inflammatory mediators that best discriminated asthma and COPD are presented as ROC curves (see Fig E1 in this article's Online Repository at www.jacionline.org). Sputum CCL5 and CXCL11 levels were substantially higher in subjects with asthma than in subjects with COPD, with areas under the ROC curve (ROC AUCs) of 0.74 (95% CI, 0.67-0.82; P < .0001) and 0.72 (95% CI, 0.64-0.80; P < .0001), respectively. Sputum IL-6 and CCL2 levels were significantly higher in subjects with COPD than in subjects with asthma, with ROC AUCs of 0.86 (95% CI, 0.81-0.92; P < .0001) and 0.69 (95% CI, 0.61-0.77; P < .0001), respectively. Discriminant analysis using the combined clinical, physiological, and biological (inflammatory mediator) variables completely distinguished the asthma and COPD groups (Fig 1).
The mediators that best discriminated between bacterially colonized and noncolonized subjects were sputum IL-1β and TNF-α, with ROC AUCs of 0.76 (95% CI, 0.68-0.85) and 0.75 (95% CI, 0.66-0.84), respectively (see Fig E2 in this article's Online Repository at www.jacionline.org). Factor analysis revealed 4 factors with IL-1β, IL-5, IL-6, and CXCL11 as the highest loading components, respectively, across the 4 factors (see Table E1 in this article's Online Repository at www.jacionline.org). Subsequent cluster analysis identified 3 clusters (Table II). Individual clinical and biological comparisons of subjects with asthma and COPD in clusters 1, 2, and 3 are presented in Table E2 in this article's Online Repository at www.jacionline.org. Linear discriminant analysis was performed to verify the determined clusters and to identify the contribution of inflammatory mediators in discriminating the clusters. Subsequently, 2 discriminant scores for individual subjects were calculated and used to represent the clusters in a 2-dimensional graph (Fig 2).
Cluster 1 consisted mainly of subjects with asthma (95% of cluster 1) with elevated sputum TH2 mediators and was eosinophil predominant, with 67% of the subjects having a sputum eosinophilia and 48% a sputum neutrophilia. Further stratification of cluster 1 by sputum cell counts showed that the subjects were 40% pure eosinophilic, 21% pure neutrophilic, 27% mixed granulocytic, and 12% paucigranulocytic. Cluster 2 consisted of an overlap of subjects with asthma and COPD with sputum neutrophil predominance (75% of subjects with asthma and 95% of subjects with COPD). In contrast, only 11% and 5% of the subjects with asthma or COPD, respectively, had a sputum eosinophilia. In addition, there were elevated sputum levels of IL-1β, IL-8, IL-10, and TNF-α and increased bacterial colonization. The increased rate of bacterial colonization found in this cluster was driven predominantly by subjects with COPD (Table E2). Further stratification of cluster 2 showed that the subjects were 0% pure eosinophilic, 74% pure neutrophilic, 9% mixed granulocytic, and 17% paucigranulocytic.
Cluster 3 consisted predominantly of subjects with COPD (95% of cluster 3). In contrast to subjects with COPD in cluster 2, neutrophilic inflammation was present in only 49% of the subjects whereas a sputum eosinophilia was observed in 46% of the subjects. IL-6 and CCL2 levels were increased compared with those in clusters 1 and 2 but were similar to those in subjects with COPD in cluster 2. Only CCL13 and CCL17 were elevated in subjects with COPD in cluster 3 compared with subjects with COPD in cluster 2 (Table E2). Further stratification of cluster 3 showed that the subjects were 21% pure eosinophilic, 28% pure neutrophilic, 23% mixed granulocytic, and 28% paucigranulocytic. The proportion of subjects with asthma with airflow obstruction in the 3 clusters was not significantly different. Of the 2 subjects with asthma in cluster 3, 1 had persistent airflow obstruction compared with 17 of 55 in cluster 1 and 10 of 28 in cluster 2.
The best discriminator between subjects in clusters 1 or 3 compared with the overlap group cluster 2 was sputum IL-1β at a cutoff point of 130 pg/mL (Fig 3). The second best discriminator was TNF-α with a cutoff point of 5 pg/mL (see Fig E3 in this article's Online Repository at www.jacionline.org).

Validation
These cluster analysis findings were then validated in independent asthma and COPD cohorts. Subjects were assigned to subgroups using 2 techniques. The first was a classification model developed from the test cohort using linear discriminant analysis, from which the betas for each cluster and cytokine were extracted (see Table E3 in this article's Online Repository at www.jacionline.org). Each subject's discriminant score for each subgroup was calculated, and the subject was assigned to the subgroup in which he or she had the highest score. The second technique used the IL-1β cutoff point of 130 pg/mL, which was identified as the best classifier to distinguish overlap cluster 2 from clusters 1 or 3 in the test study, alongside subject disease status (asthma or COPD). The sputum cellular and inflammatory mediator profiles of the 3 validation study subgroups, obtained using both techniques, were very similar to those of the test subgroups (Tables III and IV; Fig 4). In addition, individual clinical and biological comparisons of subjects with asthma and COPD in the validation subgroups, presented in Tables E4 and E5 in this article's Online Repository at www.jacionline.org, revealed a pattern similar to that of the test subgroups (Table E2).
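The second assignment technique amounts to a two-step rule, which can be sketched in Python as follows. The sketch rests on assumptions that the text does not state explicitly: that subjects at or above the 130 pg/mL cutoff go to the overlap subgroup (consistent with the elevated IL-1β reported for cluster 2), that a value exactly at the cutoff counts as elevated, and that subjects below the cutoff are split by diagnosis into the asthma-predominant and COPD-predominant subgroups. The function name is hypothetical.

```python
IL1B_CUTOFF_PG_ML = 130.0  # sputum IL-1beta cutoff reported in the test study


def assign_subgroup(il1b_pg_ml: float, disease: str) -> int:
    """Assign a validation subject to subgroup 1, 2, or 3.

    Assumed rule: IL-1beta at or above the cutoff -> subgroup 2 (overlap);
    otherwise asthma -> subgroup 1 and COPD -> subgroup 3.
    """
    if disease not in ("asthma", "COPD"):
        raise ValueError("disease must be 'asthma' or 'COPD'")
    if il1b_pg_ml >= IL1B_CUTOFF_PG_ML:
        return 2
    return 1 if disease == "asthma" else 3
```

Under these assumptions, a subject with COPD and a sputum IL-1β level of 45 pg/mL would be placed in subgroup 3, whereas a subject with asthma and a level of 300 pg/mL would be placed in subgroup 2.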

DISCUSSION
Here we report that although a combination of clinical variables distinguished asthma from COPD, further analyses of the sputum inflammatory mediators revealed that patients with asthma and COPD were best described by 3 biological clusters incorporating clinical, physiological, and inflammatory mediator characteristics. Our findings have further underscored the complex heterogeneity of asthma and COPD and provided support for the “British” hypothesis of airway disease pathogenesis, as we identified 2 clusters that were predominantly either asthma or COPD with distinct cytokine profiles, while also supporting the “Dutch” hypothesis by identifying a third cluster of overlapping subjects from both disease groups with similar cytokine profiles. Cluster 1 was asthma predominant with evidence of eosinophilic inflammation and increased TH2 inflammatory mediators. Cluster 2 contained an asthma and COPD overlap group, with predominantly neutrophilic airway inflammation and elevated levels of IL-1β and TNF-α, in addition to having the highest proportion of subjects with bacterial colonization. Cluster 3 was a COPD-predominant group with mixed granulocytic airway inflammation and high sputum IL-6 and CCL13 levels. Furthermore, the biological clusters derived from the test group could be validated in an independent group, yielding inflammatory mediator profiles similar to those of the test group. Whether these biological clusters can be used to stratify subjects for more targeted approaches to novel and existing therapies needs to be further studied.
J ALLERGY CLIN IMMUNOL VOLUME 135, NUMBER 1
The clusters we have identified have biological plausibility, and they confirm and extend our current understanding of the immunopathobiology of asthma and COPD, moving our understanding beyond previous comparisons of asthma versus COPD 16 or clustering approaches of cytokine profiles in asthma or COPD alone. 20,26 In addition, the clusters might represent groups with possible stratified responses to specific anti-inflammatory treatment. Cluster 1 is consistent with the TH2-predominant eosinophilic asthma paradigm. Indeed, this group was predominantly asthmatic but importantly also included about 5% of subjects with COPD. This group seems the most likely to respond to anti-TH2 cytokine therapy such as anti-IL-5 and anti-IL-13. 18,19,27,28 Eosinophilic COPD is well described, and subjects with eosinophilic COPD have a greater response to oral and inhaled corticosteroids than do those with noneosinophilic COPD. 29,30 Whether subjects with COPD in this cluster would respond to anti-TH2 cytokine therapy is currently under study (www.clinicaltrials.gov NCT01227278). Cluster 2 included an overlap of subjects with asthma and COPD. This group was predominantly neutrophilic, consistent with previous observations, 14 and had increased bacterial colonization. Recent evidence supports a role for macrolide antibiotics in COPD 31 and in noneosinophilic severe asthma. 32 Antineutrophilic strategies such as anti-CXCR2 are currently under study. 33 Further studies are required to assess whether this cluster represents patients most likely to respond to these therapies. In cluster 2, increased bacterial colonization was evident, particularly in those with COPD, perhaps suggesting that in these subjects the neutrophilic inflammation is a consequence of bacterial colonization rather than the primary abnormality. Thus, whether ameliorating neutrophilic inflammation in this group is beneficial or harmful is unclear.
Indeed, lessons from anti-TNF-α therapy suggest that targeting proinflammatory cytokines can increase the risk of infection. 34,35 In contrast, in those with neutrophilic inflammation without evidence of bacterial colonization, particularly in those with asthma, the neutrophilic inflammation might be critical in the development of the disease. Thus, identification of distinct groups that benefit from or are harmed by antineutrophilic approaches would enable better stratification of such therapies. Cluster 3 included mainly subjects with COPD in whom bacterial colonization was observed less often in spite of consistently elevated proinflammatory cytokines. Perhaps this group, in contrast to cluster 2, represents subjects in whom the proinflammatory environment plays a more causal role in disease expression rather than being a consequence of infection. This might suggest that this group would be more amenable to anticytokine therapies such as anti-IL-6. In addition, eosinophilic inflammation was a feature in some subjects in cluster 3 in the absence of an elevated TH2 profile. One of the few cytokines increased in cluster 3 was CCL13, which is a CCR3 agonist and promotes eosinophil migration. Small airway macrophages are an important source of CCL13 in the airway and might play a role in the eosinophilic inflammation in this group. 36 Taken together, these intriguing and novel observations open up opportunities for further translational studies to determine the underlying mechanisms of these clusters and their treatment with specific anti-inflammatory therapies.
In addition to clear differences in the cytokine profiles between groups, there were several differences in clinical parameters. These were largely dependent on whether the clusters were asthma predominant, COPD predominant, or mixed. For example, lung function, age, smoking history, and exacerbation frequency were related to the number of subjects with COPD in the cluster. However, the symptom of cough was more common in cluster 2. Indeed, subjects with asthma and COPD in cluster 2 had a higher visual analog scale score for cough than did the subjects with asthma in cluster 1 or the subjects with COPD in cluster 3, respectively. This suggests that this difference is independent of disease status and might represent a real association with either the inflammatory profile or the increased bacterial colonization in cluster 2. This cluster also had the highest sputum total inflammatory cell count, suggesting that this represents a “chronic bronchitis” group. As suggested above, whether this group warrants antimicrobial, anti-inflammatory, or mucolytic therapy remains an interesting question.
One of the strengths of our observations is that we were able to support the identification of the 3 biological clusters in an independent validation group. The similarity between the cytokine (inflammatory mediator) profiles in the test and validation groups supports the view that each cluster is a consistent phenotype and might reflect common immunopathology and phenotype-specific responses to treatment. We found biological clusters that were asthma or COPD predominant, suggesting that there are distinct mechanisms underlying these groups, but we also identified a consistent overlap group that might be a consequence of shared mechanisms. Two approaches were used to validate the clusters in an independent group: discriminant analysis and a classifier that used the disease allocation and a sputum IL-1β cutoff. Sputum IL-1β was the best discriminator between the subjects with asthma or COPD in clusters 1 and 3, respectively, and those in the overlap group cluster 2. The clinical diagnosis of asthma or COPD together with a single sputum cytokine (IL-1β cutoff) provided a simple approach to segment asthma and COPD populations into 3 groups with distinct and consistent cytokine profiles. This approach has advantages in its simplicity and offers the potential for immediate use in stratified medicine studies, although it might underestimate small, albeit potentially important, subgroups such as TH2-high COPD.
Figure legend (figure not reproduced): Circles indicate test study, triangles indicate validation using linear discriminant analysis, and rectangles indicate validation using IL-1β cutoff at 130 pg/mL and disease status (asthma or COPD). The y-axes depict the mean z value (standardized) of each cytokine in each test and validation subgroup.
One possible limitation of this study is that only subjects with severe asthma and COPD who attended a secondary care setting were included, and thus might not be representative of a more generalized population. We concede that our findings cannot be extrapolated to mild to moderate asthma or mild COPD but are confident that our test and validation populations are representative of our broader secondary care patient population. Our earlier preliminary data comparing asthma and COPD included subjects across the severity of disease and in this analysis fewer differences between asthma and COPD were observed. 16 Whether this was due to lack of power because of the small numbers or due to masking clearer differences in more severe disease is unknown. Further studies are required to include healthy controls, larger disease populations including a broader spectrum of subjects including those with mild disease, and comparisons with other disease control groups. Allergic sensitization might also be an important mechanism in driving the different clusters. We did not record atopic status in the COPD group consistently, but in those with asthma there was no difference across the clusters. However, future studies should consider the role of allergy in these clusters. Our study has focused on stable visits and a similar comparison is required for longitudinal follow-up at stable and exacerbation events. We have previously reported exacerbation biological clusters in subjects with COPD and interestingly have identified similar profiles as described here. 20 Whether comparisons of cytokine profiles in larger groups of subjects with severe asthma and COPD reveal similar biological clusters needs to be addressed. The cytokine profiles have been derived from sputum analysis and whether the profiles are similar in tissue samples is unknown. 
Access to bronchoscopic samples from large numbers of subjects with COPD and asthma with severe disease is challenging, but multicenter efforts to address this are underway in parallel with sputum sampling and these findings are eagerly awaited. In addition, although we have chosen to measure a large number of mediators implicated in obstructive airway disease, these mediators cannot fully reflect the complexity of airway disease and approaches using more comprehensive assessment of inflammatory networks in the airway perhaps using 'omic approaches such as transcriptomics will be informative. 37 Such studies in small numbers suggest similar groupings described here with transcriptional profiles associated with cellular profiles and further studies are awaited.
In conclusion, we found here that sputum inflammatory mediator profiling can determine distinct and overlapping groups of subjects with asthma and COPD. We identified an asthma-predominant cluster with eosinophilic inflammation and elevated TH2 inflammatory mediators, a COPD-predominant group with elevated proinflammatory cytokines, and an asthma and COPD overlap group that clinically had chronic bronchitis, increased bacterial colonization, elevated sputum IL-1β and TNF-α levels, and a sputum neutrophilia. We predict that these groups might contribute to improved patient classification to enable a stratified medicine approach to airways disease.
Clinical implications: Sputum cytokine profiling can determine distinct and overlapping asthma and COPD subgroups supporting both the British and Dutch hypotheses of airway disease.

STATISTICAL METHODS
Statistical analyses were undertaken by Ghebre and verified independently by Newby. Unsupervised multivariate modeling using factor analysis (principal factor method with orthogonal varimax rotation) was performed to obtain a set of low-dimensional, independent, and interpretable factors. Sampling adequacy for factor analysis was assessed using the Kaiser-Meyer-Olkin measure. E1 Factors were retained on the basis of a scree plot (factors above the break in the curve) and an eigenvalue above 1. Inflammatory mediators that had high collinearity E2 were excluded from factor analysis to avoid multicollinearity; in addition, cytokines that had less than 50% communality E3 (the total variance explained by the factors) were excluded from the analysis. Because CXCL10 and CXCL11 were highly correlated, and CXCL11's variance was better explained by the factors explored than was CXCL10's variance, CXCL10 was excluded from the model. More than 50% of the subjects had undetectable concentrations of IL-4, IL-17, and IFN-γ. Their distributions were positively skewed, and less than 50% of their total variance was explained by the factors; therefore, they were not included in the factor and cluster analyses. Likewise, because IL-10 levels were below the limit of detection in more than one-third of asthmatic patients, although the concentrations did not differ between asthma and COPD, IL-10 was excluded from the model to avoid bias toward one disease. No similar bias was observed for other inflammatory mediators. Fifteen subjects (1 with asthma and 14 with COPD) who did not have a complete record of the cytokine panel were excluded from the biological factor and cluster analysis. Factor scores were calculated for each subject using the standardized values and the inverse of the correlation matrix of the original variables (inflammatory mediators), and the factor loadings (Table E1).
These scores represent the subjects' predicted value for each factor and retain the relationship between factors, and were used for constructing clusters. E4 Although factor analysis made it possible to identify the underlying structure of the inflammatory mediators, it could not classify subjects into subgroups sharing the same underlying structure. Therefore, a “k-means cluster analysis on factor scores” approach was used to identify subgroups of subjects with similar biological profiles. Squared Euclidean distance was used as the measure of similarity. The use of these 2 approaches (cluster and factor analyses) provides a dimensional view of both cytokines and subgroups and was used to explore the unobserved heterogeneity among subjects: the factor analysis to capture the profiles of the cytokines and the k-means cluster analysis to allow for the classification of individuals into subgroups. The optimal number of clusters was chosen on the basis of a scree plot (clusters above the break in the curve), by plotting the within-cluster sum of squares against a series of sequential numbers of clusters, E5 and by assessing how natural the clusters appeared in terms of their clinical and biological implications (phenotypes), clinical meaning, and interpretability.
Linear discriminant analysis was performed on the cytokines across the clusters to validate how well the clusters identified from the factor scores can be predicted using cytokine measurements, and to identify the contribution of each cytokine in discriminating the clusters. Then, discriminant scores (1 fewer than the number of clusters) for each subject were calculated using the discriminant function coefficients of the cytokines and the original cytokine values for each subject, and were used to represent the subjects' biological cluster membership graphically.
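The k-means step and the within-cluster sum of squares used for the scree plot can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation (the analyses were run in Stata and R): the function names, the random initialization from observed points, and the toy 2-dimensional "factor scores" are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain k-means; returns (labels, centers, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]  # assumed initialization
    for _ in range(n_iter):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, wcss


# Toy factor scores (made up): within-cluster sum of squares for k = 1..3.
scores = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
scree = {k: kmeans(scores, k)[2] for k in (1, 2, 3)}
```

Plotting the values in `scree` against k and looking for the break in the curve mirrors the scree-plot criterion described above; in practice several random starts would normally be compared, because k-means can converge to a local optimum.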
Independent patients with severe asthma and COPD were used for validation. Because the number of cytokines in the validation studies is smaller than the number of cytokines used to identify subgroups in the test study, clusters in the test study were predicted using only those cytokines that existed in the validation group. The betas for each cytokine in each cluster are depicted in Table E3. The classification model for each cluster was developed from the betas for each cytokine in the test study and was subsequently applied in the validation study to assign subjects to subgroups on the basis of the highest discriminant score.
The model fit was as follows:

D_{ij} = b_j + \sum_{k} b_{jk} X_{ki} + \log(P_j),

where D_{ij} is the discriminant score for subject i in group j, b_j is a constant for the jth group, b_{jk} is the weight (coefficient) for the kth variable (cytokine) in group j, X_{ki} is the observed value of subject i on the kth variable, and \log(P_j) is the logarithm of the prior probability of group j membership. In addition, classification and regression trees analysis using the rpart R package E6 was applied sequentially to all cytokines in the test study that had high discriminant function, to identify clinically relevant cutoff points. Subsequently, the best determined cytokine cutoff (with the highest sensitivity ratio in discriminating the clusters), together with the disease classification (asthma or COPD), was applied to classify the validation study into subgroups.
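Read from the definitions given for the classification model, each subject's score for a group is the group constant plus the weighted sum of the subject's cytokine values plus the log prior probability of the group, and the subject is assigned to the group with the highest score. A minimal Python sketch of that calculation follows; the function names and all coefficients are hypothetical placeholders, not the betas reported in Table E3.

```python
import math


def discriminant_scores(x, constants, weights, priors):
    """Score per group: constant + sum_k weight_jk * x_k + log(prior_j)."""
    scores = {}
    for group, constant in constants.items():
        linear = sum(w * x[name] for name, w in weights[group].items())
        scores[group] = constant + linear + math.log(priors[group])
    return scores


def assign_group(x, constants, weights, priors):
    """Assign the subject to the group with the highest discriminant score."""
    scores = discriminant_scores(x, constants, weights, priors)
    return max(scores, key=scores.get)


# Hypothetical two-group, two-cytokine example (values are made up).
subject = {"IL-1b": 2.0, "IL-5": 1.0}
constants = {1: 0.0, 2: -1.0}
weights = {1: {"IL-1b": 0.5, "IL-5": 1.0}, 2: {"IL-1b": 2.0, "IL-5": 0.0}}
priors = {1: 0.5, 2: 0.5}
group = assign_group(subject, constants, weights, priors)  # group 2 scores higher
```

With equal priors the log-prior term shifts every score by the same amount, so the assignment is driven entirely by the constants and the weighted cytokine values.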
Parametric data were expressed as mean with SEM, and log-transformed data were presented as geometric mean with 95% CI. The χ² test or the Fisher exact test was used to compare proportions, and 1-way ANOVA was used to compare multiple groups; nonparametric data were presented as median with first and third quartiles, and the Kruskal-Wallis test was used to compare subgroups. A P value of less than .05 was taken as statistically significant. All statistical analyses were performed using STATA/IC version 13.0 for Windows (StataCorp) and R version 2.15.1 (R Foundation for Statistical Computing).
Table footnote (table not reproduced): Factor loadings less than 0.45 were replaced with blanks but were included when estimating the factor scores. Highest loading variables for each factor are in boldface. C1, Proportion of total variation accounted for by the common factors (common variance); Pr, percent of total variation accounted for by each factor; U2, proportion of total variation not accounted for by the common factors (unique variance); VEGF, vascular endothelial growth factor.