Abstract
Haploinsufficiency is a major mechanism of genetic risk in developmental disorders. Accurate prediction of haploinsufficient genes is essential for prioritizing and interpreting deleterious variants in genetic studies. Current methods based on mutation intolerance in population data suffer from inadequate power for genes with short transcripts. Here we showed that haploinsufficiency is strongly associated with epigenomic patterns, and then developed a new computational method (Episcore) that uses machine learning to predict haploinsufficiency from epigenomic data across a broad range of tissue and cell types. Based on data from recent exome sequencing studies of developmental disorders, Episcore achieved better performance than current methods in prioritizing de novo loss-of-function variants. We further showed that Episcore was less biased by gene size and was complementary to mutation intolerance metrics for prioritizing loss-of-function variants. Our approach enables new applications of epigenomic data and facilitates the discovery and interpretation of novel risk variants in studies of developmental disorders.