Mouse genome-wide association and systems genetics identifies Lhfp as a regulator of bone mass

Bone mineral density (BMD) is a strong predictor of osteoporotic fracture. It is also one of the most heritable disease-associated quantitative traits. As a result, there has been considerable effort focused on dissecting its genetic basis. Here, we performed a genome-wide association study (GWAS) in a panel of inbred strains to identify associations influencing BMD. This analysis identified a significant (P = 3.1 x 10−12) BMD locus on Chromosome 3@52.5 Mbp that replicated in two separate inbred strain panels and overlapped a BMD quantitative trait locus (QTL) previously identified in a F2 intercross. The association mapped to a 300 Kbp region containing four genes; Gm2447, Gm20750, Cog6, and Lhfp. Further analysis found that Lipoma HMGIC Fusion Partner (Lhfp) was highly expressed in bone and osteoblasts. Furthermore, its expression was regulated by a local expression QTL (eQTL), which overlapped the BMD association. A co-expression network analysis revealed that Lhfp was strongly connected to genes involved in osteoblast differentiation. To directly evaluate its role in bone, Lhfp deficient mice (Lhfp-/-) were created using CRISPR/Cas9. Consistent with genetic and network predictions, bone marrow stromal cells (BMSCs) from Lhfp-/- mice displayed increased osteogenic differentiation. Lhfp-/- mice also had elevated BMD due to increased cortical bone mass. Lastly, we identified SNPs in human LHFP that were associated (P = 1.2 x 10−5) with heel BMD. In conclusion, we used GWAS and systems genetics to identify Lhfp as a regulator of osteoblast activity and bone mass.

a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 in the same pathway or are functionally related [29]. Therefore, by using co-expression networks, which cluster genes based on patterns of co-expression across a series of perturbations [30], it is possible to develop hypotheses as to the function of a novel gene. When a locus has been resolved down to a small number of genes using genetic methods, unknown or poorly characterized genes can be ranked as the most likely candidate based on their function predicted from a co-expression network generated in a disease relevant tissue or cell-type [12,31].
Here, we used GWAS in an inbred strain panel to identify two chromosomal regions harboring variants influencing BMD. One of the associations, located on Chromosome (Chr.) 3, affected BMD in both sexes and was replicated in two separate inbred strain panels and an F2 intercross. This locus mapped to a 300 Kbp interval (NCBI37/mm9; Chr3:52.5-52.8 Mbp) encompassing four genes, Gm2447, Gm20750, Cog6, and Lhfp. An eQTL analysis and examination of a bone co-expression network suggested that Lhfp was a causal gene at this locus. The analysis of BMD, and other bone parameters, in Lhfp mutant mice supported this hypothesis. Additionally, SNPs within human LHFP were associated with heel BMD. Thus, we have used GWAS and systems genetics to identify Lhfp as a novel regulator of bone mass.

Identification of genome-wide associations for BMD
We performed a GWAS for total body BMD in 26 classical (non wild-derived) inbred strains at 12 months of age fed a chow diet. Genome scans were performed separately for each sex using the Efficient Mixed Model Algorithm (EMMA) to account for population stratification (S1 and S2 Files). In female mice, a significant (permutation determined threshold of -log10 (P)>6) association was identified on Chromosome (Chr.) 3 and, in males, significant (permutation determined threshold of -log10(P)>5.9) loci were identified on Chrs. 2 and 3 (Fig 1).

Replication of the Chr. 3 association
Given our goal of identifying novel genes influencing BMD, we selected the Chr. 3 locus for further investigation. This locus was chosen because it was the most significant and the only one identified in both sexes (Fig 1). However, upon closer inspection, Chr. 3 harbored two associations, with peaks at 52.5 and 63.3 Mbp. In males, the 52.5 Mbp peak was the most significant (-log10(P) = 11.5), whereas in females the 63.3 Mbp peak was the most significant (-log10(P) = 5.9). The lead SNPs at both peaks were in moderate linkage disequilibrium (r 2 = 0.46), making it unclear if they represented independent loci. We performed conditional analyses in males and in both cases each peak still exceeded chromosome-wise significance (-log10 (P)>2.9) after controlling for the other, suggesting they represent independent loci.

The Chr. 3 association implicates a 300 Kbp interval encompassing four transcripts
The set of SNPs that were the most significantly associated with BMD spanned a 300 Kbp interval from 52.5 to 52.8 Mbp (Fig 3A and 3B). This region contained four RefSeq transcripts: Gm2447, Gm20750, Cog6, and the 5' end of Lhfp. Gm2447 and Gm20750 were listed as "predicted" RefSeq transcripts and annotated as long non-coding RNAs (lncRNAs). The evidence for these transcripts was based on prediction models and a small number of expressed sequence tag (EST) sequences. Neither of these transcripts have homologs in humans, rats, or any other mammalian species. To determine if Gm2447 and Gm20750 were expressed in mouse bone or bone cells, we performed total RNA-seq (poly A+ and poly A-) on three bone and three marrow-derived osteoblasts samples. Gm2447 and Gm20750 were not expressed, whereas the other two transcripts, Cog6 and Lhfp, which are well-annotated protein-coding sequences, were highly expressed in both bone (Fig 3C) and osteoblasts (Fig 3D). We also analyzed the expression profiles of Cog6 and Lhfp in 96 mouse tissues and cell lines using data available from BioGPS (http://biogps.org/) [33]. Cog6 was highly expressed in all tissues profiled (Fig 3E). Lhfp showed a more restrictive expression profile (Fig 3E). Importantly, Lhfp expression in primary calvarial osteoblasts was among the highest of any of the 96 samples surveyed (Fig 3E). Cog6 is part of the conserved oligomeric Golgi complex required for maintaining normal structure and activity of the Golgi apparatus [34]. Lhfp is a member of the lipoma  HMGIC fusion partner (LHFP) gene family with no known function [35]. All other transcripts on either side of the region were >200 Kbp away.

Coding polymorphisms in Cog6 and Lhfp
We cannot exclude Gm2447 and Gm20750 (or for that matter other genes flanking the association); however, based on the data above we focused on interrogating Cog6 or Lhfp as potential causal genes. First, we evaluated Cog6 and Lhfp for coding polymorphisms among inbred strains. Based on whole genome-sequence data from C57BL6/J and C3H/HeJ (which carry alternative alleles at the association) there are no coding variants between the strains for Lhfp [36]. In contrast, there were three non-synonymous SNPs in Cog6 between B6 and C3H. These SNPs resulted in (rs30302002) I461V, (rs30323949) V620I and (rs30323946) S643N amino acid substitutions. However, using PolyPhen2, SIFT, and PROVEAN all three substitutions were predicted to be benign/tolerated and not impact Cog6 function [37][38][39].

Lhfp is regulated by a local eQTL in liver
We next determined if the same SNPs associated with BMD regulated the expression of Lhfp or Cog6 (or any gene ± 1Mbp of the association). We searched for local expression quantitative trait loci (eQTL) using expression data in liver, brain, adipose and muscle tissues in the BXH F2 intercross. Although expression data on bone or bone cells would have been ideal for this analysis, these data were not available. This did, however, allow us to identify local eQTL that might also be operative in bone. We observed a highly significant local eQTL for Lhfp (LOD = 19.9) in liver (Fig 4A). Cog6 and all other genes in proximity of the region were not regulated by a local eQTL in any tissue (max cis eQTL LOD = 1.8 across all four tissues). The lead Lhfp eQTL SNP (rs3665395) was located in the first intron of Lhfp and B6 alleles of rs3665395 were associated with increased expression of Lhfp relative to C3H alleles (Fig 4B). In liver, we observed a negative correlation (r = -0.29, P = 1.5 x 10 −4 ) between Lhfp and BMD, as would be expected given that B6 alleles of rs3665395 were associated with decreased BMD and increased Lhfp (Fig 4C). We also searched for local eQTL using expression data from bone in the Hybrid Mouse Diversity Panel (HMDP) panel, but did not identify local eQTL for either Cog6 or Lhfp.

Network analysis predicts a role for Lhfp in the regulation of osteoblast activity
Our group and others [12,27,31,40] have shown that co-expression network analysis can identify interactions among genes and knowledge of these interactions can assist in predicting gene function and/or the cell type in which a gene is operative. Therefore, we next used a bone co-expression network to further evaluate Lhfp and Cog6. For the analysis we used a previously generated whole bone (femur with marrow removed) co-expression network from the Hybrid Mouse Diversity Panel (HMDP) that consisted of 13,759 genes partitioned into 21 co-expression modules [41,42]. In this network, Lhfp was a member of module 9 and Cog6 was a member of module 2. Module 2 was enriched in a large number of gene ontology terms including "mitochondrion", "oxidative phosphorylation" and "actin cytoskeleton"; all of which are important to bone. However, module 2 did not have a signature of a particular bone cell-type, nor was it enriched for genes known to influence BMD. In contrast, we have previously demonstrated that module 9 is enriched for genes 1) directly involved in osteoblast differentiation, 2) implicated by BMD GWAS, and 3) when knocked-out in mice impact BMD [31,42].
To investigate specific network connections for Lhfp and Cog6, we identified the 150 genesmost strongly connected to each gene in their respective module (S4 and S5 Files). The genes with the strongest connections to Cog6 were enriched for genes involved in "muscle structure development" (FDR = 2.9 x 10 −10 ), "muscle cell development" (FDR = 4.4 x 10 −10 ), among many other similar muscle-related categories (S6 File). In contrast the genes with the strongest connections to Lhfp were, similar to module 9, enriched for genes involved in "ossification" (FDR = 1.5 x 10 −7 ), "osteoblast differentiation" (FDR = 8.0 x 10 −4 ), "skeletal system development" (FDR = 6.4 x 10 −3 ), "bone development" (FDR = 3.3 x 10 −2 ), among many other related bone-related functional categories (S7 File and Fig 5A). The Lhfp-centric network contained a number of genes with key roles in osteoblast differentiation and activity, including Sp7, Pthr1, Akp2, Tmem119, and Bmp3 (Fig 5B). Together, these data suggest that Lhfp is involved in the activity of osteoblasts, a process of direct relevance to the regulation of bone mass.

Lhfp regulates the number and osteogenic differentiation of bone marrow stromal cells
Bone marrow stromal cells (BMSCs) are adherent marrow cells that contain the mesenchymal progenitors of osteoblasts [43]. To test the role of Lhfp in osteoblast function, we quantified the number of BMSCs and their ability to form osteoblasts from mice lacking Lhfp. Using CRISPR/Cas9, we created five small deletions (ranging from 4-16 bps) in exon 2 (ATG start codon is in exon 2) of Lhfp ( Table 1). All five were frameshift mutations resulting in a truncated LHFP protein (S1 Fig). As expected, we observed significantly decreased Lhfp transcript levels in heterozygotes (Lhfp +/-) and mutants (Lhfp -/-) from all five lines (Fig 6A). Since all five mutations impacted Lhfp expression in the same manner, we grouped littermate mice by genotype from all lines for all downstream experiments.
Next, we performed colony-forming unit-fibroblast (CFU-F) assays, a direct measure of BMSCs, in 16 week-old Lhfp -/and littermate Lhfp +/+ mice. We observed similar trends in both sexes; therefore, all data were combined and adjusted for the effects of sex to increase power. In Lhfp -/mice, we observed a significant (P = 0.02) increase in CFU-F number (Fig 6B). We next evaluated the ability of BMSCs from Lhfp +/+ and Lhfp -/mice to differentiate into mineralizing osteoblasts. Consistent with network predictions, Lhfp -/-BMSCs exhibited increased mineralization as measured by bound alizarin red (P = 0.02; Fig 6C).

Lhfp regulates cortical bone mass
We next determined if bone mass was altered in Lhfp mutant lines. To replicate the conditions of the Ackert inbred strain panel and the BXH F2, we generated two cohorts of mice. The first  was fed a chow diet for 12 months, while the second was fed a high-fat diet from 8 to 32 weeks of age. Based on the negative correlation between BMD and Lhfp expression, the direction of the genetic effects on expression in liver, and increased osteoblast activity observed above, we  Table 1). Data are presented as means ± 1. predicted increased BMD in Lhfp -/mice. In both cohorts, BMD was measured in mice of all three genotypes and cortical and trabecular microarchitecture was measured by microCT only in Lhfp +/+ and Lhfp -/mice. At 32 weeks of age in mice on a high-fat diet we observed significantly (P = 2.6 x 10 −3 ) increased femoral BMD as a function of mutant Lhfp alleles in females, but not males (Fig 6D).
BMD is an inherently noisy phenotype; therefore, to generate a more detailed understanding of the effects of Lhfp in bone we used microCT to investigate the amount of bone in both the femoral trabecular and cortical compartments. We did not observe effects on trabecular bone mass at the distal femur in either male or female mice. However, Lhfp -/mice of both sexes had significantly (P<0.05) increased femoral cortical bone area fraction (BA/TA) and cortical thickness (Ct.Th) as compared to Lhfp +/+ littermates (Fig 6E and 6F). We also observed a significant (P = 0.03) increase in tissue mineral density (TMD) in male Lhfp -/mice (Fig 6G). In general, we observed the same trends of increased cortical bone mass in Lhfp -/mice at 52 weeks of age; however, only Ct.Th in females was significant (P = 0.03) (Fig 6H-6K). These data indicate that Lhfp is a negative regulator of cortical bone mass in both male and female mice. They are also consistent with Lhfp underlying, at least in part, the BMD association on Chr. 3@52.5 Mbp.

Variants in human LHFP are associated with BMD
We next determined if the human region syntenic with the mouse BMD locus harbored SNPs associated with variation in BMD. For this analysis we utilized data from the largest GWAS performed to date (N~426K) for heel BMD [9]. Heel BMD has been demonstrated to be highly genetically correlated with BMD at more clinically relevant sites such as the spine and femoral neck [8,9]. The human region syntenic with the mouse Chr. 3@52.5 Mbp spanned from 39.9 to 40.6 Mbp on Chr. 13. This region harbored 4055 SNPs. A set of 14 SNPs were significantly associated (P = 1.2 x 10 −5 ) after adjusting for the total number of SNPs in the region (P<1.23 x 10 −5 ) (Fig 7). These SNPs were located in intron 3 of LHFP. We queried eQTL for the Gene Tissue Expression (GTEx) project [44], but there were no eQTL for any genes ± 1 Mbp of the association that colocalized with the heel BMD association. Though these data do not directly implicate LHFP, they do support its potential involvement in the regulation of human BMD.

Discussion
In this study, we used GWAS in a mouse inbred strain panel and a multifaceted systems genetics approach to identify and validate a high-resolution association for BMD on Chr. 3. The association directly implicated four genes: Gm2447, Gm20750, Cog6 and Lhfp. Of these, Lhfp expression was regulated by a local eQTL in liver and was predicted, based on a bone gene coexpression network, to be involved in osteoblast-mediated bone formation. We demonstrated that mice deficient in Lhfp displayed increased BMSC number and increased BMSC osteogenic differentiation. Furthermore, Lhfp -/had increased BMD due to increased cortical bone mass. Together these data strongly suggest that Lhfp is responsible, at least in part, for the BMD association we identified on Chr. 3@52.5 Mbp. This work defines Lhfp as a negative regulator of the pool of osteoprogenitor cells, osteoblast activity, and cortical bone mass.
GWAS in mice has proven to be a powerful approach for the identification of genomic regions harboring trait-associated genetic variation [11]. The earliest applications of GWAS in mice used panels of readily accessible inbred strains [18][19][20]. However, such approaches were plagued by false positives due to population stratification [22]. Aware of this limitation, we first performed GWAS for BMD after correcting for population structure in inbred strains and then replicated the analysis in two separate strain panels (containing many of the same strains, but representing independent measures of BMD in different environments) and an F2 intercross. Of the multiple loci identified, the association on Chr. 3 at 52.5 Mbp was identified in all datasets, strongly suggesting it represents a bona-fide genetic association.
The Chr. 3 locus, as defined by the interval harboring the most significant SNPs, contained four genes; Gm2447, Gm20750, Cog6 and Lhfp. Gm2447 and Gm20750 were both predicted lncRNAs. This prediction is based on limited data and the fact that we did not observe their https://doi.org/10.1371/journal.pgen.1008123.g007 expression in bone tissue or osteoblasts (though we only measured their expression in one inbred strain), suggest they are not likely causal for the locus; though, this alone is not enough to definitely exclude their involvement. For Cog6 and Lhfp we used eQTL data and a bone coexpression network to assist in evaluating their potential causality. Both analyses supported a role for Lhfp. Using eQTL data from liver tissue in the BXH F2 intercross, we observed that variants associated with decreased BMD were associated with increased expression of Lhfp. We did not observe an association between the BMD-associated variants and Cog6 expression. Furthermore, Lhfp was a member of a well-studied module of co-expressed genes in mouse bone. This module is highly enriched for genes that play a role in osteoblast function, which provides a direct explanation as to how Lhfp may be impacting BMD. In contrast, Cog6 was a member of a module enriched for genes involved in a wide range of "energy-generating" functions. Importantly, all of our experimental results confirmed that Lhfp is a negative regulator of osteoblast activity and BMD. While these data support a role for Lhfp in the effects of the Chr. 3 locus, they do not exclude any of the other genes in, as well as flanking the locus.
In all four genetic populations used to identify the association on Chr. 3@52.5 Mbp, the strength of the association differed by sex. For example, in the "Ackert" population the association was stronger in males relative to females. In the "Naggert" strain set the strength of the association was similar in both sexes, albeit both were lower than seen in the other three populations. Similar to the "Ackert" strains, the association was stronger in males than females in the "Tordoff" strain set. In the BXH F2, the Chr. 3 QTL was male-specific, with little to no signal in females. The increase in cortical bone mass in Lhfp -/mice was also sexually dimorphic. Although Lhfp deficiency increased cortical bone mass in both sexes in general, the effects were slightly more pronounced in females than males. This discrepancy could be the result of inaccuracies in estimating genetic effect sizes in the relatively small strain sets, the extent of linkage in the F2 confounding the sex effects, or influences from the different genetic backgrounds of the populations studied (strains sets vs. F2 vs. knockout).
Little is known regarding the molecular function of Lhfp. Lhfp is a member of the Lhfp-like gene family, which is a subset of the superfamily of tetraspan transmembrane protein encoding genes. It was first identified as a translocation partner with the HMGIC gene in benign lipomas [35]. The human LHFP/COG6 locus was also identified by GWAS as harboring variants associated with hippocampal volume [45]. However, prior to this study Lhfp had not been connected to the regulation of osteoblast function or BMD. Based on our experimental results, we hypothesize that Lhfp regulates bone mass through a role in cells of the osteoblast lineage. This does fit with prior work implicating Lhfp in the mesenchymal differentiation of gliosarcoma [46]. It is possible that Lhfp serves as a "brake" regulating the number of osteogenic precursor cells in the bone marrow microenvironment as well as their differentiation potential. However, further work will be required to elucidate its precise molecular role in osteoblasts and bone.
In summary, we have used GWAS in a set of inbred strains to identify an association impacting femoral BMD on Chr. 3 at 52.5 Mbp. We show using a variety of approaches that Lhfp is likely responsible for most, if not all, of the effects of this locus. Our results identify Lhfp as a novel negative regulator of osteoblast function and BMD and increase our understanding of the genetics of BMD.

Ethics statement
The animal protocol for the generation and characterization of Lhfp mutant mice was approved by the Institutional Care and Use Committee (IACUC) at the University of Virginia.

Association analysis
The "Ackert" strain set contained BMD data on 32 inbred strains at three time points (6, 12 and 18 months). These data were collected by The Jackson Laboratory Nathan Shock Center of Excellence in the Basic Biology of Aging. Cohorts of males and female mice of 32 inbred strains of mice were aged to 6, 12 and 18 months of age. The number of mice per sex and per strain for each age point ranged from 1 mouse to 9 mice per group, with the majority of groups containing 6-7 mice. For the 12 month data set, the focus of this paper, the group size ranged from 3 to 9 mice per strain, per sex. At 6, 12 and 18 months of age, a varity of phenotypes were measured using a cross sectional study design with the hopes of capturing the main definers of Healthspan. This study, and the phenotypes available, is described in detail elsewhere [47]. Whole body BMD, sans the head, was measured by Dual X-ray Absorptiometry as previously described [48]. The complete dataset is available from the Mouse Phenome Database (MPD) (https://phenome.jax.org/projects/Ackert1). After removing wild-derived strains, and C57BLKS/J (due to inclusion of this strain producing spurious results) we were left with data on 26 strains. To identify loci influencing BMD, we used the Efficient Mixed Model Association (EMMA) algorithm [23]. For the analysis BMD was rankZ transformed. SNPs were obtained from strains genotyped on the Mouse Diversity Array (http://churchill-lab.jax.org/ website/MDA) [49]. SNPs with a minor allele frequency < 0.05 were removed, leaving 228,085 SNPs. These SNPs were used to generate a kinship using the 'emma.kinship' R script available in the EMMA R package (available at http://mouse.cs.ucla.edu/emma/) [23]. The emma. REML.t function of EMMA was used to perform all mapping analyses. The significance of the maximum association peak was assessed by performing 1,000 permutations of the data. In each permutation, the minimum p-value was recorded to produce an empirical distribution of minimum permutation p-values. The quantiles of this distribution were used to assign adjusted p-values. P-values exceeding a genome-wide significant of P<0.05 were used as thresholds to identify associated loci. GWAS resulted were visualized using the "qqman" R package [50]. Replication of the association on Chr. 3@52.5 Mbp was performed using "Tordoff" and "Naggert" inbred strains sets. These data are available from MPD (https://phenome. jax.org/projects/Naggert1 and https://phenome.jax.org/projects/Tordoff3). Replication analyses were restricted to Chr. 3 and otherwise performed as described above.

Generating expression profiles for transcripts in the BMD locus
Femora were isolated from an inbred Collaborative Cross strain (CC016/GeniUnc; Jackson Lab Stock #024684) (N = 3 mice). Marrow was isolated and bone marrow stromal cells (BMSCs) were differentiated as described below. Total RNA was then isolated from bone and BMSC-derived osteoblasts using RNeasy Plus Mini Kit (Qiagen). RNA-Seq libraries were constructed using TruSeq RNA Library Prep Kit v2 sample prep kits (Illumina). Samples were sequenced to an average depth of 24.6 million 2 x 75 bp paired-end reads on an Illumina Next-Seq500 sequencer. Fastq files were aligned to the mouse reference (GRCm38) using HISAT2 v 2.0.5 (https://ccb.jhu.edu/software/hisat2/index.shtml) with a SNP aware reference index (gen-ome_snp) [51]. Expression levels in Fragments Per Kilobase of transcript per Million mapped reads (FPKM) were generated using Stringtie [51]. The data are available from GEO (GSE121887). Microarray profiles for Cog6 and Lhfp in 96 tissues/cell-types were downloaded from BioGPS (http://biogps.org).

EQTL analyses
The generation of microarray expression data and eQTL analyses on bone from the 96 strains of the Hybrid Mouse Diversity Panel (HMDP) has been previously described [41,42]. These data are available from NCBI Gene Expression Omnibus (GEO) (GSE27483). A t-test was used to test for differences in Cog6 and Lhfp expression in strains stratified by genotype at rs3691451 (of the 17 peak BMD SNPs). Liver, brain, muscle and adipose eQTL in the BXH F2 were identified using R/qtl [52]. The expression data is available from GEO (GSE11338, GSE11065, GSE12798, and GSE12795). The genotypes and expression data are also available from GeneNetwork ("BH/HB F2 UCLA", http://www.genenetwork.org/webqtl/main.py).

Network analysis
The generation of a bone co-expression network and characterization of the module 9 (M9) is described in [42]. We identified genes with the strongest connections to Cog6 and Lhfp based on Topological Overlap Measures (TOMs), calculated as described in [30]. Network depictions were constructed using Cytoscape [53]. Gene Ontology (GO) analysis was performed using the PANTHER database statistical overrepresentation test (http://www.pantherdb.org/) [54]. The analysis was restricted to the "GO biological process complete" annotation data set.

Generation of Lhfp mutant mice
The Lhfp knockout mice used in this study were generated using the CRISPR/Cas9 genome editing technique. Cas9 mRNA that was injected into C57BL6/N embryos was synthesized exactly as outlined in [55] while the guide RNA (sgRNA) was generated with some modifications. Briefly, the 20 nucleotide (nt) sequence that would be used to generate the sgRNA was chosen using the CRISPR design tool developed by the Zhang lab (crispr.mit.edu). The chosen sequence and its genome map position is homologous to a region in Exon 2 that is approximately 300 bp, 3' of the start codon (the 'ATG' is located in Exon 2 of Lhfp) (S1 Table). To generate the sgRNA that would be used for injections, oligonucleotides of the chosen sequence, as well as the reverse complement (S1 Table, primer 1 and 2, respectively), were synthesized such that an additional 4 nts (CTTC and AAAC) were added at the 5' ends of the oligonucleotides for cloning purposes. These oligonucleotides were annealed to each other by combining equal molar amounts, heating to 90˚C for 5 min. and allowing the mixture to passively cool to room temperature. The annealed oligonucleotides were combined with BbsI digested pX330 plasmid vector (provided by the Zhang lab through Addgene; https://www.addgene.org/) and T4 DNA ligase (NEB) and subsequently used to transform Stbl3 competent bacteria (Thermo Fisher) following the manufacturer's' protocols. Plasmid DNAs from selected clones were sequenced from primer 4 (S1 Table) and DNA that demonstrated accurate sequence and position of the guide were used for all downstream applications. The DNA template used in the synthesis of the sgRNA was the product of a PCR using the verified plasmid DNA and primers 3 and 5 (S1 Table). The sgRNA was synthesized via in vitro transcription (IVT) by way of the MAXIscript T7 kit (Thermo Fisher) following the manufacturer's protocol. sgRNAs were purified and concentrated using the RNeasy Plus Micro kit (Qiagen) following the manufacturer's protocol.
C57BL/6N female mice (Envigo) were super-ovulated and mated with C57BL/6N males. The females were sacrificed and the fertilized eggs were isolated from the oviducts. The fertilized eggs were co-injected with the purified Cas9 mRNA (100 ng/μl) and sgRNA (30 ng/μl) under a Leica inverted microscope equipped with Leitz micromanipulators (Leica Microsystems). Injected eggs were incubated overnight in KSOM-AA medium (Millipore Sigma). Twocell stage embryos were implanted on the following day into the oviducts of pseudo pregnant ICR female mice (Taconic or Envigo). Pups were screened by PCR of tail DNA using primers 6 and 7 with subsequent sequencing of the resultant product from primer 8 (S1 Table).
Two sets of injections (of~100 eggs each) were performed resulting in 2 mice possessing mutations from each set of injections (A,B and C,D, respectively; Table 1). All 4 mice possessed out of frame bi-allelic deletions ranging from 1-16 bp; progeny from only 3 of the founders (mice A, B, C, Table 1)) were used in this study. Note that an identical 11bp deletion was found in two mice from two separate injections. qPCR with primers 9 and 10 (S1 Table) was used to assess Lhfp expression as outlined in [31].

CFU-F and osteogenic differentiation assays
Isolation [56] and differentiation of mesenchymal stromal cells (MSC) from the bone marrow of mouse femurs was performed as described for osteoblasts [57] with minor modifications. Briefly, one or both femurs from a given mouse were aseptically isolated, denuded of soft tissue and the marrow extracted by removing the proximal end of each bone and centrifuging at 2000 xg for 30 s such that the marrow collects into 25 μl of fetal bovine serum (FBS). Exudates from a single femur were dispersed, via trituration, in 5ml complete media (MEM-alpha, 10% FBS, 100U penicillin/100ug Streptomycin per ml, 2 mM glutamine). Cells were manually counted where upon 4 million were used to seed a 10 cm dish for CFU-F determination and the remainder were applied to a 60mm dish. Media on the 10 cm dishes was changed on days 2, 4 and 8; on day 14, cells were fixed (NBF), stained (Coomassie, BioRad #161-0436) and the number of CFU-Fs determined via Image J analysis and the manual counting of colonies. Media on the 60 mm dishes was changed on day 2 with cells removed via trypsin/EDTA (Gibco) digestion on day 4. Detached cells were triturated in 5ml complete media, pelleted at 1000 xg for 5 min., re-suspended in 1 ml complete media and counted where upon 150,000 cells were used to seed a well of a 12 well dish. A minimum of 2 wells per sample were obtained for all samples reported here in. Osteoblast differentiation was initiated 3 days after plating (7 days after bone marrow isolation) by replacing the media with complete media supplemented with 50 μg/ml ascorbic acid, 10 mM beta-glycerophosphate and 10 nM dexamethasone. Media was changed every other day for 8 days at which time cells were either used as a source for RNA (mirVana, Thermo Fisher) or used to determine the amount of hydroxyapatite formed during differentiation [31]. Briefly, cells were washed with PBS, fixed with neutral buffer formalin (NBF) for 15 min. and subsequently stained with 40 mM Alizarin Red (AR), pH 5.6 for 20 min and washed extensively with H 2 0. The amount of AR bound to mineral was quantitated by Image J analysis of scanned images as well as the 5% Perchloric Acid eluate absorbance at 405 nm.

Measurement of BMD and microarchitecture (microCT)
Femoral BMD was measured ex vivo using a Lunar PIXImus II Mouse Densitometer (GE Medical Systems Model 51045; Madison, WI, USA). Morphologies of the trabecular bone of the distal femur and cortical bone of the femoral midshaft were measured using micro-focus X-ray computed tomography (vivaCT 40, Scanco Medical AG, Bassersdorf, Switzerland) following guidelines for assessment of bone microstructure [58]. Tomographic volumes were acquired at 55 kV and 145 μA, collecting 2000 projections per rotation at 300 millisecond integration time. Three-dimensional 16-bit grayscale images were reconstructed using Scanco Evaluation software, Version 6.5-3. Threshold values were adjusted to best match the silhouette of features of interest in the threshold-subtracted image compared to the grey-scale image. The resulting threshold for hydroxyapatite-equivalent density was 370 mg/cm 3 for compact or cortical bone and 270 mg/cm 3 for the trabecular bone region; these values were applied to subsequent samples. Volumetric analysis was confined to the trabecular region for the distal femur by manual exclusion of the cortical bone. A 1.03 mm high region of interest was analyzed beginning at 1 mm proximal to the growth plate. For the cortical bone, a 0.3 mm high region was analyzed at the mid-diaphysis. Measures analyzing the distal femur trabecular site included total volume, bone volume, trabecular bone volume fraction (BV/TV), thickness, number, connectivity density and structure model index (SMI). Cross-sectional measurements of the cortical bone included bone volume, total volume, marrow area and polar moment of inertia.

Statistical analysis of Lhfp mutant data
All statistical analyses were conducted using the R language and environment for statistical computing [59]. The Lhfp qPCR data was analyzed using a t-test. Data are presented as means ± 1.5 times the interquartile range. CFU-F and osteogenic differentiation data was analyzed using the "lsmeans" R package [60]. The data were fit to a linear model including the effects of genotype and sex. P-values were adjusted using the "tukey" method. Data are presented as lsmeans ± s.e.m. BMD and microarchitectural bone data were analyzed using ANOVA with a linear model including the effects of genotype, body weight, and any other phenotype-specific covariates. Data are presented as means ± 1.5 times the interquartile range.
Supporting information S1 Table. List of primer sequences.