New technologies for delineating and characterizing the lipid exome: prospects for understanding familial combined hyperlipidemia.

This review summarizes the progress made in cutting through the biological and genetic complexity of the Gordian knot that is familial combined hyperlipidemia. We particularly focus on how the application of new genomic technologies, especially massively parallel sequencing and high-throughput genotyping platforms, promises to accelerate the gene discovery process in this common, highly atherogenic disorder, with important diagnostic and therapeutic implications.

This review considers the potential power of new and emerging genomic technologies to solve a clinically important yet seemingly intractable problem, the genetic dissection and consequent biological understanding of familial combined hyperlipidemia (FCHL). The reason for this focus is three-fold. First, the genetic complexity of this condition is undoubtedly rooted in the diversity of the intra- and intercellular processes of the various organs that govern serum cholesterol and triglyceride levels (1)(2)(3)(4). As such, only the application of the new generation of genomic tools, in particular massively parallel sequencing (MPS) (5)(6)(7)(8) and microarray technologies (9)(10)(11), has the capability to comprehensively identify and characterize the genomic variants that contribute to the transmission and manifestation of this common, highly atherogenic disorder (12,13). Second, the wealth of data emerging from the application of these tools will provide important genetic information to supplement the traditional family-based, biochemical diagnosis of FCHL, thereby promising to resolve the current controversy over diagnostic criteria (14)(15)(16)(17), as well as offering the prospect of early therapeutic intervention. Third, comparative analysis of the data from multiple human genomes, obtained with the new tools, will enormously expand our knowledge (and perhaps even our understanding) of the contribution that lipids make to a diverse range of cellular activities from conception to grave, including sperm capacitation (18), fetal and embryonic development (19,20), osteoblast differentiation (21), and immunity (22). We may therefore find many examples, as in sickle cell anemia (23), where Mother Nature gives with one hand and takes away with her other.
Current data suggest that both common and rare variants underlie FCHL, with the specific contributions of each form varying from cohort to cohort, family to family, and even from family member to family member (15)(24)(25)(26). In short, the mixed patterns of hyperlipidemia in FCHL families may result from either specific combinations of common variants (Fig. 1), with each allele having only a small influence on serum lipid values (common-disease-common-variant hypothesis), or multiple, rare, functionally deleterious variants each individually affecting a handful of pedigrees (common-disease-multiple-rare-variant hypothesis), or a combination of both.

CHARACTERIZING THE LIPID EXOME
The per-unit cost and high throughput of the new genomic technologies now offer the possibility of sequencing the entire lipid exome (operationally defined as exons of all known "lipid metabolism" genes and liberally extended to include those regulating their pre- and posttranscriptional events) of FCHL patients. In fact, three recent studies provide the impetus to do so. In the first, Venter et al. (27) showed that the genome of a healthy individual contained 10,400 nonsynonymous protein coding variants, of which 1,100 are predicted to affect function. These variants were predominantly heterozygous (n = 801), with a sizeable proportion being rare (n = 79) or novel (n = 101) in the human population according to genomic databases. In the next study, Kryukov et al. (28) used data from the Human Gene Mutation Database (containing 50,000+ disease-causing mutations), four large sequencing projects, and human-chimpanzee sequence divergence patterns to assess whether the common-disease-multiple-rare-variant hypothesis is viable. They found that ~53% of the observed missense mutations had reached detectable frequencies in the human population (i.e., >1/1,500 chromosomes) despite being deleterious and argued that their prevalence and highly heterogeneous nature were attributable to relatively weak selection pressure being applied to the individual variants. Their calculations also revealed that an average protein of 500 amino acids contains a deleterious missense mutation in ~1% of the human population, implying that a hypothetical set of 50 proteins, defined by their contribution to a common biological network, could easily contain such a mutation in one or more members of this protein set in a sizeable proportion of the population. They therefore concluded that multiple rare variants could make appreciable contributions to the etiologies of common, genetically complex diseases. In the third study, Ahituv et al. (29) sequenced a set of genes in obese and lean individuals.
The genes comprised two groups: A) 21 whose disruption caused obesity in either mice or humans; and B) 37 involved in regulating food intake, adipogenesis, energy expenditure, or lipid metabolism but with no genetic evidence for involvement in obesity. Ahituv et al. found that the cumulative frequency of unique nonsynonymous variants across all group A genes was significantly higher in the obese (46 variants in 41 individuals) subjects compared with the lean (26 in 27 individuals) participants. Moreover, a higher proportion of the variants in the obese group were likely to be functionally deleterious than in the lean group (19 vs. 4; P < 0.002). By contrast, the number of functionally deleterious variants in the group B control genes did not differ significantly between obese and lean individuals (21 vs. 27). In summary, the data from these three studies suggest that a highly heterogeneous spectrum of nonsynonymous coding variants is likely to lurk in the genomes of FCHL patients, with a high proportion of the functionally deleterious changes residing in their lipid exomes.
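The network argument attributed to Kryukov et al. (28) can be made concrete with a short calculation. The sketch below is only an illustration, not the authors' own computation: it assumes that deleterious missense variants occur independently across the 50 proteins, and the function name is hypothetical.

```python
def prob_at_least_one_carrier(per_protein_rate: float, n_proteins: int) -> float:
    """Probability that a person carries a deleterious missense variant
    in at least one of n_proteins.

    Assumption (not from the source): carrier status is independent
    across proteins, so the complement rule applies.
    """
    return 1.0 - (1.0 - per_protein_rate) ** n_proteins


# Values quoted in the text: ~1% of people per 500-residue protein,
# and a hypothetical network of 50 proteins.
p_network = prob_at_least_one_carrier(0.01, 50)
print(f"{p_network:.1%}")  # about 39.5% under the independence assumption
```

Under this simplifying assumption, roughly two in five people would carry at least one such variant somewhere in the 50-protein set, which is what "a sizeable proportion of the population" amounts to numerically.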
To assess the common-disease-multiple-rare-variant hypothesis for the dyslipidemias, several investigators have already applied yesterday's technologies, yielding some tantalizing results. For example, Wang et al. (30) showed that an LPL or APOC2 coding mutation was present on one allele in 10% of hypertriglyceridemic patients, while Cohen et al. (31) demonstrated that nonsynonymous changes in ABCA1, APOA1, and LCAT were more common (16% vs. 2%) in people with low HDL cholesterol (HDL-C) levels [<5th age-sex percentile (asp)] than in those with high HDL-C (>95th asp). Additionally, Civeira et al. (32) found that 19.6% of their 143 Spanish FCHL patients had an LDLR mutation and understandably argued that these individuals might be more accurately diagnosed as having the FCHL phenotype rather than the disease per se. In detail, using microarray chips containing 203 detrimental LDLR variants, they established that 23 of their patients had a mutation on one allele, while five bore two such variants (presumably on the same allele). Moreover, as would be anticipated, they showed that the patient group bearing an LDLR mutation (22 different variants) had higher mean total cholesterol, LDL cholesterol (LDL-C), and apoB levels, plus smaller waist circumferences, than those who did not. Disappointingly, Civeira et al. (32) did not examine whether the mutations segregated with hypercholesterolemia in their FCHL families, which would have provided more information on the potential overlap between monogenic familial hypercholesterolemia and the more complex disorder, FCHL. In a similar study, Cianflone et al. (26) screened 41 unrelated French-Canadian hyper-apoB/FCHL patients for mutations in the acylation stimulating protein receptor gene, C5L2. A rare variant (i.e., 0/2176 control chromosomes) was detected on one allele (Ser323Ile) in one patient, and subsequently in two of his siblings who also had marked combined hyperlipidemia.
This allele was transmitted to four offspring: two who were normolipidemic (i.e., <90th asp), one who was hypertriglyceridemic (i.e., >90th asp), and one who had hypercholesterolemia (i.e., >90th asp). Thus, although the data for involvement of C5L2 variants in the etiology of FCHL are not yet persuasive, this study stresses the importance of analyzing large family cohorts with sufficient power to delineate the contribution of rare variants in the transmission of FCHL-lipid abnormalities.
Fig. 1. Family was ascertained through a white British proband (indicated by arrow) with serum TC >95th and TG >90th age-sex-specific percentiles. Exclusion criteria were as described (15). Color coding of variants: TC, red shades; TG, blue; CHL, purple. Phenotypic effect of variants is proportional to tile size; frequency is inversely proportional.
Turning to the common-disease-common-variant hypothesis, there is compelling evidence that relatively common variants in the ApoA1/C3/A4/A5 complex contribute to FCHL (15,33), although, as discussed in two excellent reviews (34,35), the exact mechanisms by which these single nucleotide polymorphisms (SNPs) alter serum lipid levels are not fully resolved. By sequencing the retinoid X receptor γ (RXRγ) gene in 180 hyperlipidemic patients, Nohara et al. (36) found the common variant Gly14Ser. They showed it was more prevalent in FCHL probands (9/60; 15%) compared with peers with other forms of hyperlipidemia (5/120; 4%) and the general Japanese population (15/298; 5%). Rather unexpectedly, they did not find the RXRγ Ser14 variant to be more prevalent in patients (age 58 ± 7 years) suspected of having coronary heart disease (9/175; 5%). Nonetheless, in this patient group, the Ser14 allele was associated with higher plasma triglyceride levels. They also showed that in vitro, the RXRγ Ser14 allele repressed LPL promoter activity more than the wild type. Consequently, we keenly await further genetic and functional evidence that the Ser14 allele, or indeed other RXRγ variants, confers susceptibility to FCHL.
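To show how carrier counts such as those reported for the Ser14 variant can be compared, the following sketch computes an odds ratio and a one-sided Fisher exact P value for FCHL probands (9/60) versus patients with other hyperlipidemias (5/120). This is an illustrative calculation on the counts quoted above, not the analysis performed by Nohara et al. (36); the function names are hypothetical.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table: a, b = carriers / non-carriers in
    group 1; c, d = carriers / non-carriers in group 2."""
    return (a * d) / (b * c)


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact P value: probability of observing at least
    `a` carriers in group 1 under the hypergeometric null (fixed margins)."""
    n_group1 = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    upper = min(n_group1, n_carriers)
    tail = sum(
        comb(n_carriers, k) * comb(n_total - n_carriers, n_group1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n_total, n_group1)


# Counts quoted in the text: 9 of 60 FCHL probands vs. 5 of 120 patients
# with other forms of hyperlipidemia carried the Ser14 variant.
fchl_vs_other_or = odds_ratio(9, 51, 5, 115)
fchl_vs_other_p = fisher_one_sided(9, 51, 5, 115)
print(f"odds ratio = {fchl_vs_other_or:.2f}, one-sided P = {fchl_vs_other_p:.4f}")
```

An exact test is used here because the carrier counts are small; with larger cohorts a chi-square test would be a common alternative.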
After SNPs, insertion/deletions (indels) are the most common type of genetic variation in the human exome (27), and indeed we have known for some time that common indels within the lipid exome (e.g., apoB, LPL) quantitatively influence serum lipid levels (37,38). Less clear is whether there are rare indels making more substantial contributions to dyslipidemia, and it is this question that MPS, combined with high-throughput genotyping methodologies, can readily address. In this respect, the rare PCSK9 indel, designated p.L15_L16ins2L (39), is of interest. It was identified in 2 out of 25 (8%) French-Canadian FCHL families, absent from all 100 controls, and present in just 7 (1.8%) Americans participating in the Lipoprotein Coronary Atherosclerosis study [372 individuals; LDL-C: 2.96-4.90 mmol/L (40)]. In the affected families, heterozygotes (n = 10) with the p.L15_L16ins2L allele had higher mean total cholesterol and LDL-C levels than the 9 relatives homozygous for the common allele (6.39 ± 0.97 vs. 5.18 ± 0.81 mmol/L, P = 0.0067; 4.06 ± 0.82 vs. 3.13 ± 0.70 mmol/L, P = 0.0167). However, not all heterozygote family members were hyperlipidemic and, conversely, not all family members who were hyperlipidemic had the p.L15_L16ins2L allele. Thus, further studies are required to establish whether this insertion only contributes to the FCHL-cholesterol trait on a specific genetic background(s) and if so, by what mechanism(s).
Recent technical and statistical advances (41)(42)(43)(44) indicate that it is now timely to examine the copy number variant/polymorphism content of FCHL patients' lipid exome, the rationale being that copy number variants contribute to autism (45) and neurodevelopmental pathways in schizophrenia (46). Additionally, psoriasis is associated with increased β-defensin genomic copy number (47).

REVISITING FCHL LINKAGE INTERVALS
As spelled out by Kahvejian et al. (5), the new sequencing technologies have yet to reach the nucleotide unit-cost that would permit submission of a grant application to sequence the entire genome of, say, 200 FCHL patients and controls without automatically being filed under fantasy. Thus, for now, much high-density, sequence-driven FCHL research will necessarily be confined to analyzing those genomic intervals (lipid exomic or otherwise) linked to its component traits (15,48). Despite this restriction (which until recently would itself have seemed a luxury), this approach, combined with targeted transcriptome/expression analyses, promises to clarify, to an unprecedented degree, the causes of FCHL and the cellular processes that run amok in this condition. Recent results from Lee et al. (49) support this stance. Proceeding from a well-documented linkage interval, they painstakingly accrued genotype data from family cohorts and two case-control samples, finding a common variant [or allele(s) in linkage disequilibrium] within the a priori biologically unconnected tumor suppressor gene, WWOX (WW-Domain-Containing Oxidoreductase) that modifies HDL-C levels.
In outline, Lee et al. (49) tracked the transmission of 1,318 SNPs to individuals with low HDL-C levels (<10th asp) in 50 Finnish dyslipidemic families (33 FCHL, 17 low-HDL-C pedigrees) displaying evidence for linkage of this trait to the chromosome 16q23-24 interval. They found a series of SNPs showing association (P < 0.01) with the low HDL-C trait, 25 of which were analyzed in a further 52 dyslipidemic families. In the combined dataset of 102 dyslipidemic families, the rs2548861 T allele (frequency in families: 0.38) in the 8th intron of WWOX returned the most significant evidence for association with low-HDL-C (P = 0.001). Moreover, in two Finnish population samples (n = 3,403 and 1,561), carriers of the rs2548861 T allele (frequencies 0.49 and 0.50) had lower HDL-C levels than individuals without the allele (P = 0.003 and 0.004), with one copy accounting for ~1.5% of the variance in HDL-C values. Hence, this study nicely demonstrates the efficacy of a SNP-based approach to detect modifier variants that have risen to high frequencies in the human population, presumably because they were not evolutionarily disadvantageous. It also highlights that a more in-depth analysis of the WWOX locus is warranted, specifically, to establish whether the rs2548861 T allele is merely acting as a marker for a rarer functional variant(s) that has a more substantial impact on an individual's HDL-C than current estimates and, additionally, to identify whether rare variants at the WWOX locus are enriched in patient groups with low-HDL-C levels.
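To give a sense of how widespread a variant with an allele frequency near 0.5 is, the sketch below converts the quoted rs2548861 T-allele frequencies into expected genotype and carrier fractions. It assumes Hardy-Weinberg equilibrium, which the text does not state, and the function names are hypothetical.

```python
def hardy_weinberg_genotypes(allele_freq: float) -> dict:
    """Expected genotype fractions for a biallelic SNP under
    Hardy-Weinberg equilibrium (an assumption, not a reported result)."""
    p = allele_freq
    q = 1.0 - p
    return {"two_copies": p * p, "one_copy": 2.0 * p * q, "no_copies": q * q}


def expected_carrier_fraction(allele_freq: float) -> float:
    """Fraction of individuals expected to carry at least one copy."""
    return 1.0 - (1.0 - allele_freq) ** 2


# T-allele frequencies quoted for the two Finnish population samples.
for freq in (0.49, 0.50):
    print(freq, round(expected_carrier_fraction(freq), 4))
```

Under this assumption about three quarters of each population sample would carry at least one T allele, which illustrates why a modifier of this kind is detectable with SNP-based association even though its per-allele effect is small.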
Many investigators (15) have used genome-wide linkage analyses with mixed success to identify the chromosomal regions containing variants conferring susceptibility to FCHL-related lipid abnormalities. This prompted Pollex and Hegele (48), just 4 years ago, to suggest that the approach of identifying susceptibility alleles underlying linkage intervals was likely to be too resource intensive and to yield few mechanistic insights. Here, we reconsider this issue after evaluating the potential impact of genetic heterogeneity on linkage signals.
Simulations were carried out on extended white British FCHL pedigrees with two sample sizes: 60 families to emulate the size of previous genome-wide FCHL linkage scans (15,48) and 181 families corresponding to our most recent study. To this end, we declared [based on published (15,48) and our unpublished data] that the component traits of this disorder were caused by variants at any one of 9 distinct pairs of susceptibility loci and applied a phenocopy rate of 10%. Chromosomes assigned no susceptibility loci returned low LOD scores (<0.25) (representative results summarized in Table 1). At susceptibility loci, the 181-family datasets (300 simulations) generated mean LOD scores ~2-fold higher than those observed in the 60-family datasets. LOD scores were also generally higher in the chromosomal regions allocated two susceptibility variants (e.g., chromosome 1), and in the single region carrying 3 variants (chromosome 11), LOD scores were higher still. This heterogeneous region was the only one to consistently return values exceeding the classical thresholds (50) for declaring linkage.
We present the results from the simulations performed on real family structures to make a fundamental point: genuine FCHL susceptibility variants will exist in some of the chromosomal regions that have returned LOD scores just missing threshold levels for declaration of linkage (50), simply because obtaining significant/confirmed LOD scores is difficult in the face of genetic heterogeneity, genomic imprinting, heterogeneity across cohorts, and the multiple other factors highlighted by Pollex and Hegele (48). Thus, although there is a clear need to develop new statistical methods to overcome these obstacles, we suggest that the process of cutting through the Gordian knot of biological and genetic complexity that is FCHL can now be accelerated by incorporating the wealth of information emerging from the application of the new genomic and proteomic technologies (5)(6)(7)(9)(10)(11) with both existing and improved linkage data.
Table 1 legend: Simulations are based on the observed distribution of dichotomized trait status in FCHL pedigree structures, recruited as described (15). The model assumed that affection status was caused by a specific allele at one of nine pairs of susceptibility loci [i.e., 18 individual loci across 11 chromosomes (Chr)]. Each family was randomly assigned one such pair or phenocopy status. Alleles at unassigned locus pairs and loci on completely unlinked chromosomes (UC) were allocated independently of affection status. Marker characteristics were taken from observed data and 300 replicates performed on the datasets. Multipoint LOD scores are mean maximum ± SD. For clarity, only one result is shown per chromosome. Threshold LOD scores for suggestive (2.2), significant (3.6), and highly significant (5.4) are from (50).
a Chromosome assigned two closely spaced susceptibility loci.
b Chromosome assigned three closely spaced susceptibility loci and a distant locus.
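The effect of genetic heterogeneity on linkage signals can be illustrated with a deliberately simplified calculation. The sketch below uses the textbook two-point LOD score for phase-known, fully informative meioses, which is not the multipoint method used in the simulations above; the recombinant counts are invented for illustration, and only the threshold values (2.2, 3.6, 5.4) come from the Table 1 legend.

```python
from math import log10

# Thresholds quoted in the Table 1 legend, taken from (50).
THRESHOLDS = [(5.4, "highly significant"), (3.6, "significant"), (2.2, "suggestive")]


def lod(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score at recombination fraction theta (0 < theta <= 0.5)
    for phase-known, fully informative meioses (a simplifying assumption)."""
    nonrecombinants = meioses - recombinants
    return (recombinants * log10(theta)
            + nonrecombinants * log10(1.0 - theta)
            - meioses * log10(0.5))


def max_lod(recombinants: int, meioses: int, steps: int = 500) -> float:
    """Maximum LOD over a grid of theta values from 0.001 to 0.5."""
    grid = [i / (2.0 * steps) for i in range(1, steps + 1)]
    return max(lod(recombinants, meioses, theta) for theta in grid)


def classify(score: float) -> str:
    """Label a LOD score using the thresholds above."""
    for cutoff, label in THRESHOLDS:
        if score >= cutoff:
            return label
    return "below suggestive"


# Hypothetical data: 20 meioses in families where the locus is truly linked
# (2 recombinants), then pooled with 20 meioses from families whose trait is
# caused elsewhere (10 recombinants, as expected for an unlinked locus).
linked_only = max_lod(2, 20)
pooled = max_lod(2 + 10, 20 + 20)
print(round(linked_only, 2), classify(linked_only))
print(round(pooled, 2), classify(pooled))
```

In this toy setting the linked families alone give a suggestive signal, whereas pooling them with an equal number of unlinked meioses pulls the maximum LOD below the suggestive threshold. This is the qualitative point made above: a genuine susceptibility locus can fall short of the linkage thresholds when only a subset of families segregates it.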