In this issue of the Journal of Human Genetics, Lynn Bekris et al. report on the effect of putative cis-regulatory haplotypes on in vitro expression driven by the TOMM40 and APOE promoters.1 The authors use a labor-intensive reporter-gene approach to begin teasing apart the regulatory interactions that govern expression in a relatively small, gene-rich haplotype block that encompasses these two genes. For almost two decades, the APOE gene has been studied for its undeniable involvement in Alzheimer's disease. More recently, its neighboring gene, TOMM40, has also gained attention as a potential player.

Importantly, the results from Bekris et al. support the notion that genetic variation at the TOMM40 locus may be associated with late-onset Alzheimer's disease (LOAD) independently of APOE. Genome-wide association studies of LOAD have consistently found the strongest evidence of association with single-nucleotide polymorphisms (SNPs) within the TOMM40 gene. Ever since the initial observation, the argument has centered on whether this effect is independent of APOE, because the two genes lie close together within a locus that shows strong linkage disequilibrium. Thus, it is conceivable that the TOMM40 polymorphism is behaving as a surrogate for the well-established Alzheimer's disease (AD) risk allele, APOE ɛ4. An article and subsequent review published in 2010 by Allen Roses et al.2, 3 suggested that longer repeats in a variable-length polymorphism (poly T) in TOMM40, unlinked to the ɛ4 allele of APOE, are associated with an earlier age of onset for LOAD. Since these publications, at least three groups have attempted to replicate this finding. One supported the notion that the TOMM40 long-repeat allele increases risk when ɛ4 is absent,4 another concluded that repeat length was not associated with age of onset in LOAD,5 and the third found that the polymorphism was associated with the risk of LOAD in APOE ɛ3 homozygotes, but in the opposite direction.6 Clearly, the implications of this repeat polymorphism for AD pathogenesis remain to be elucidated.

The article by Bekris et al. demonstrates that attempting to understand the regulation of even an intensively studied gene rapidly becomes extremely complex. There is a clear opportunity here for improved bioinformatic approaches to aid not only data interpretation but also experimental design. In this regard, it is interesting to note that Bekris et al. show that the TOMM40 IVS6 region is functional despite not fulfilling the criteria currently used by ENCODE to define such regions, thereby highlighting the limitations and hazards of predicting functional sequence with current software. In addition, this article highlights a number of interesting aspects of this type of functional variant screen. It is clear that this approach is ‘construct’ intensive: this study alone provides data generated from screening more than 80 constructs. What we desperately need is a high-throughput approach that will allow us to achieve these aims, that is, a rapid and robust technique for evaluating the function of individual SNPs and their haplotypic combinations. Furthermore, this study underscores that regulation of gene expression is ‘context’ specific. Therefore, such studies must be conducted in the relevant cell types and under the appropriate conditions. It is likely, for example, that responses would differ if the cells were subjected to an inflammatory stimulus or oxidative stress. Also, the cell lines traditionally used are immortalized, and this in itself may affect the results. Perhaps the use of primary human cells or lineages derived from patient stem cells should be encouraged, although these are more technically demanding to manipulate.

It is evident that a better understanding of the regulatory mechanisms affecting these genes will help resolve which genes in this locus are important in Alzheimer's disease, or whether we should indeed be considering combinatorial effects. The results from this paper make a significant contribution to the argument that the TOMM40 effect is independent of the risk conferred by the APOE ɛ4 allele. The extended LD block in the region of these genes (perhaps as much as 200 kb) merits investigation using a next-generation sequencing approach, because rare variants within each of the genes (APOE and TOMM40) that are not in linkage disequilibrium but are associated with disease should demonstrate a dichotomy of effect, which would support independent effects.