Genetic Dissection of Femoral and Tibial Microarchitecture

ABSTRACT Our understanding of the genetic control of bone strength has relied mainly on estimates of bone mineral density. Here we have mapped genetic factors that influence femoral and tibial microarchitecture using high‐resolution x‐ray computed tomography (8‐μm isotropic voxels) across a family of 61 BXD strains of mice, roughly 10 isogenic cases per strain and balanced by sex. We computed heritabilities for 25 cortical and trabecular traits. Males and females have well‐matched heritabilities, ranging from 0.25 to 0.75. We mapped 16 genetic loci most of which were detected only in females. There is also a bias in favor of loci that control cortical rather than trabecular bone. To evaluate candidate genes, we combined well‐established gene ontologies with bone transcriptome data to compute bone‐enrichment scores for all protein‐coding genes. We aligned candidates with those of human genome‐wide association studies. A subset of 50 strong candidates fell into three categories: (1) experimentally validated genes already known to modulate bone function (Adamts4, Ddr2, Darc, Adam12, Fkbp10, E2f6, Adam17, Grem2, Ifi204); (2) candidates without any experimentally validated function in bone (eg, Greb1, Ifi202b), but linked to skeletal phenotypes in human cohorts; and (3) candidates that have high bone‐enrichment scores, but for which there is not yet any functional link to bone biology or skeletal system disease (including Ifi202b, Ly9, Ifi205, Mgmt, F2rl1, Iqgap2). Our results highlight contrasting genetic architecture between sexes and among major bone compartments. The alignment of murine and human data facilitates function analysis and should prove of value for preclinical testing of molecular control of bone structure. © 2019 The Authors. JBMR Plus published by Wiley Periodicals, Inc. on behalf of American Society for Bone and Mineral Research.

to define subsets of variants that modulate bone volumetric BMD (vBMD). (5,6) Although BMD accounts for about 70% of bone strength, this measurement does not provide the 3D structural precision of high-resolution μCT. (7)(8)(9)(10) Three-dimensional maps of structure and architecture generated by μCT have many advantages, including (1) minimal interference from intra-and extraosseous soft tissues, (2) high-content data acquisition, and (3) isotropic resolution as high as 6 μm. (11,12) Finally, the development of finite element analysis of μCT data makes it possible to model mechanical properties of bone. (13) Experimental rodent models can be used efficiently to evaluate candidate genes discovered in GWASs and convert variants to mechanisms and potentially even to treatments. (14) Although the use of knockouts and knockins of single mutations is a wellestablished approach to test gene function, an alternative unbiased approach is to map quantitative trait loci (QTL) that influence natural variation in bone structure using large families of genetically diverse animals. (15) In this way, the clinical range and complexity of bone disease can be captured, while retaining tight control over diet, environment, and genotypes. For example, the mouse diversity panel (16,17) and the BXD family have been used to define and confirm roles for Asxl2 in BMD and Alpl in hypophosphatasia. (18) In comparison to intercrosses, these large families of isogenic, but diverse strains can be used to systematically test gene-by-environmental interactions, to establish replicability, and to test new therapies and treatments. (19) Families of strains, such as the BXDs, are also advantageous because deep genomic, metabolic, metagenomic, and phenotypic data have already been generated. (20,21) It therefore becomes practical to compare variation in bone and the skeletal system with those of many other traits in many other systems, and at multiple levels of organization. (18,(22)(23)(24)(25) In this study, we have combined deep quantitative phenotyping of bone microstructure with a fine-grained genetic dissection to understand the control of bone architecture, focusing on femur and tibia. We have systematically evaluated cortical and trabecular compartments.
We have also developed and applied a method to rank essentially all protein-coding genes in mammals with respect to their potential roles in bone biology. We define a new set of genes that have a strong likelihood of being related to bone biology, but that currently lack any relevant literature. We refer to this set as a "bone ignorome. (26,27) We have merged information on the ignorome with our set of candidate genes to rank those most likely to contribute to variation in bone structure and function.

Animals
All experimental procedures were in accordance with the Guidelines for the Care and Use of Laboratory Animals published by the National Institutes of Health and were approved by the Animal Care and Use Committee at the University of Tennessee Health Science Center (UTHSC; Memphis, TN, USA).
The BXD family was housed in a single specific pathogen-free (SPF) facility at UTHSC, and maintained at approximately 22 C on a 14/10 hours light/dark cycle. Cases were provided Agway Prolab 3000 (5% fat; Agway, Syracuse, NY, USA) chow and Memphis aquifer tap water ad libitum. Sixty-one BXD strains and both parental strains, C57BL/6J (B6) and DBA/2 J (D2), were sacrificed for tissue harvest. Cases ranged in age from 50 to 375 days with an average of 100 days. A total of 597 animals were studied: 290 females and 307 males. A total of 576 femurs and 515 tibias were harvested. Differences between numbers of animals and bones reflect breakage or loss. Precise numbers of strains vary by trait, sex, and age, but all parameters were evaluated using 50 to 63 unique strains. For details on sample sizes, see Supplementary Data S1 and data in GeneNetwork.org. Many cases analyzed here had been previously studied by Zhang and colleagues, (28) although here we have integrated additional data, including 239 new cases and 15 additional strains. After dissection and removal of soft tissue, femurs and tibias were stored in 75% ethanol until measurement.
μCT measurements High-resolution x-ray computed tomography (μCT40, Scanco Medical, Basserdorf, Switzerland) was used to scan and measure morphometric parameters of femurs and tibias. Bones were placed in a 12.3-mm-diameter sample holder filled with 75% ethanol and immobilized with Styrofoam. Samples were scanned at 8-μm resolution (isotropic voxel size) using an energy level of 55 kVp, an integration time of 300 ms, and an intensity of 109 μA. Morphometric parameters were evaluated using a fixed Gaussian filter and a threshold of 220 for trabecular bone and 250 for both cortical bone and whole bone.
Each femur and tibia were measured separately using Scanco software (Scanco Medical AG, Brüttisellen, Switzerland) that generates more than 50 quantitative traits per bone (whole bone, cortical, and trabecular segments). We selected a subset of 25 of the more interesting and interpretable measurements for in-depth analysis (Table 1). Three whole-bone parameters were used: length, mineralized volume, and material bone mineral density (mBMD). For cortical bone, 100 transverse slices were acquired at the middle of the shaft: a total length of 0.8 mm. From these cross-sections, 11 cortical microtraits were generated, including cortical thickness, cortical volume, porosity, and polar and area moment of inertia. For trabecular bone analysis, 100 slices were acquired at the secondary spongiosa of the distal femur or proximal tibia. Eleven trabecular microtraits were generated, including bone volume fraction (BV/TV), trabecular thickness, trabecular number, trabecular separation, and the trabecular connectivity density (Fig. 1A).

Statistical analysis
Bones were harvested immediately after sacrifice from roughly equal numbers of males and females, ranging from 50 days to 375 days-of-age (100 AE 56 SD). This age range is equivalent to adolescent to middle age in humans. (30) Body weights ranged from 13.4 to 48.5 g (24.9 AE 5.1 SD). The relation between body weight and the logarithm of age fits a linear regression reasonably well in which weight (g) = −3.5 + 14.5 (log 10 of age), with an r = 0.45 and p < 0.0001.
All data and metadata on cases used in this study are provided in the Supplementary Data S1. To minimize effects of age as a confounder, we performed linear regression for each variable across 307 male and 290 female samples separately, using logarithm of age as a predictor. Log age-corrected values were computed by adding the residuals to means for male or female samples separately (Supplementary Data S1, Sheet Female_Ra-w_Res_Corr and Male_Raw_Res_Corr). Because the average age of males and females was approximately 100 days, the corrected values by case, strain, and sex should be considered as those that will typically be measured at this age. Both the original values and the corrected values are provided in Supplementary Data S1. Sex-averaged values were computed as above by fitting the cofactors sex and logarithm of age (without grouping by strain) across the entire data set. Means were added to the residuals, and these values were summarized to generate sex-and age-corrected strain means.
We also analyzed the relation between body weight and bone length/volume before and after correction for age. As expected, there is a strong positive correlation before age correction; body weight and bone volume covary with an r of 0.61 when both sexes are combined. After the log-age correction, there is still a significant association between body weight and bone parameters. For example, the correlation between femur volume and body weight is 0.35. We chose not to correct for this source of variation in subsequent analyses because body weight and size are also key variables of interest. But this does mean that bone data need to be considered in light of general variation in body size. To determine the body-size effects on mapping after eliminating age as a cofactor, we mapped using both data adjusted by logarithm of age only and data adjusted by logarithm of age plus body weight for all traits associated with robust QTL. Effects of sex and strain as predictors were estimated by ANOVA. After generating cleaned and adjusted values, we again searched for outliers at both the level of individual cases and strain means. A few outlier cases and strains (e.g., BXD13 and BXD78) were censored in some analyses as described in the Results section and the figure legends, either by complete removal or by winsorizing outliers. (31) Mean strain data entered into GeneNetwork (GN) have been reviewed, and when necessary, have also been winsorized. However, we do provide the original value in trait descriptions. Users can revert to the original data as needed. Heritability (h 2 ) was calculated for all key traits. The variance and bias of the estimate of h 2 was computed using a jackknife in JMP Pro 12 (SAS Institute, Inc., Cary, NC, USA). (32,33) This involved calculating heritabilities for subsamples of data, each missing all data for one strain (h (−i) ). The jackknife variance is where h Á ð Þ is the mean heritability using all strains. To test for sex-difference heritabilities, we computed the Z value: x 1 and x 2 are the averages of the female and male heritabilities of each bone trait; σ 1 2 and σ 2 2 are the jackknife variances of female and male heritabilities, n 1 = n 2 = 35. |z| ≥ 3.18 is considered significant with a Bonferroni-corrected p of <0.00143 (0.05/35 ffi 0.00143, two-tailed).

Correlation analysis
We studied correlations among bone phenotypes using strain averages. We selected three representative traits for each of three major categories: (1) whole bone (GN Record IDs 18130, 18131, 18132), (2) cortical bone (GN 18134, 18136, 18141), and (3) trabecular bone (GN 18146, 18148, 18149). Because the sample size is reasonably large (n = 63), we used Pearson productmoment correlations, and confirmed results were not sensitive to outliers. Finally, we computed correlations between bone microtraits and a large femur mRNA expression data set (UCLA GSE27483 BXD Bone Femur ILM Mouse WG-6 v1.1 (Jan13) RSN in GN (GN accession number: GN410). This expression data set was generated by Farber and colleagues (17,34) and includes data for 32 BXD strains and many other strains for which we have matched phenotypes. Because the overlap of sample size is modest, we used rank-order correlations to compare bone phenotypes with expression data.

QTL mapping
We initially carried out conventional interval mapping using Haley-Knott regression equations (35) as implemented in GN. To estimate genome-wide thresholds of significance, we permuted phenotypes 2000 times. CIs were defined as the chromosomal region within a 1.5 LOD drop from the linkage peak. We have taken several approaches to evaluate the consistency of QTL results, including the use of two genotype files, different methods to correct for age and sex, and different mapping algorithms that handle kinship relations (see below). In total, we mapped 50 traits for males, females, and for sex-averaged means. Initial analysis was corrected for age, but for comparison, we also remapped using young animals within a relatively narrow age range. All aspects of this analysis can be reviewed, replicated, and extended using the GN record IDs in Table 4 and Supplementary Data S1.

Classic and new genotype files
We used two genotype files for mapping. The first is a file that has been used by almost all investigators from 2002 through to late 2016. We refer to this as the "classic" file because it has been used in hundreds of studies. The second file was released in January 2017 and includes roughly twice as many markers. Both files are available at www.genenetwork.org/webqtl/main.py? FormID=sharinginfo&GN_AccessionId=600.
The two files are similar. The main difference is that several strains were not yet fully inbred during the earlier phase of genotyping, but are now fully inbred. Because the cases we have studied here were born between 2011 and 2013, it is useful to compare results using both files.

Age effects
We were concerned that cases older than 150 days may have had sufficiently different bone architecture that the statistical age adjustment would not fully compensate. For this reason, we compared results based on the complete data set for all 597 cases corrected for log age (GN 18130 to 18279) to a subset of 484 cases ranging from 65 to 116 days and processed without any age correction (GN 18986 to 19086). This age range is equivalent to young adults in humans. (30)  The seven most robust quantitative trait loci (QTL) are shown around the periphery with corresponding trait identifiers and QTL: First letter F or T (femur or tibia), followed by abbreviation of key bone phenotype, and f or m (female or male if a QTL was sex-specific). pmoi = polar moment of inertia; cv = cortical volume; ct = cortical thickness; tn = trabecular number; tt = trabecular thickness. The final number is the chromosome number. For each QTL map, the x axis is given in megabases, the left y axis is the likelihood ratio statistic (LRS) score. The red horizontal line provides the genome-wide significance level based on 2000 permutations. Orange hash along the x axis indicates SNP density. Heavy blue lines provide linkage statistics, whereas thin green and red lines provide an estimate of the additive genetic effect (right y axis). (29) (B) The QTL heat map provides whole-genome mapping results for all seven phenotypes in the form of color-coded horizontal bands. Bands of more-intense color correspond to QTL linkage peaks, and colors encode the additive effect of alleles (blue for B and red for D alleles). (C) Chromosomal ideograms for all 16 significant QTL (chromosomes 1, 2, 6, 7, 9, 10, 11, 12, 13, 17, and X) combined with mouse bone QTL (interval coverage in blue bars) on these chromosomes listed on Rat Genome Database (RGD) from GViewer (www.rgd.mcw.edu/rgdweb/search/qtls.html?term=bone&chr=ALL&start=&stop=&map=360&rs_term=&vt_term=&speciesType=2&obj=qtl&fmt=5). Fttf1b and Fpmoif10a are two QTL with similar phenotypes and map positions (labeled in black). Ttda6, FcvfXa, and FcvfXb are three novel loci that do not overlap those listed in the RGD (labeled in red). All of other 11 QTL overlap some known bone QTL (usually BMD), but are now linked to specific μCT bone traits and are therefore new or refined (labeled in green).

Mapping algorithms
Differences in algorithms and their sensitivities to trait distribution and kinship among strains will have effects on mapping results. We therefore remapped traits, in particular the 13 traits that gave strong initial results, using complementary algorithms. These methods include variants of R/ qtl, (36)(37)(38) and two algorithms that explicitly model kinship, pyLMM (39) and GEMMA. (40,41) All algorithms were run using implementations that are part of GeneNetwork 2 (gn2.genenetwork.org).

Composite interval mapping
For traits with multiple QTL, we wanted to ensure that the loci were not in statistical linkage. (42) To ensure independence, we selected background control markers close to one of the QTL peaks and remapped using composite interval mapping methods. Gene ontology scoring system of candidate genes The 16 QTL we have mapped each contain between 45 to 174 positional candidate genes within the 1.5 LOD CIs, and collectively include 1638 protein-coding genes. Based on our survey of the literature, only 2% (36) have been previously linked to bone biology. To evaluate the remaining 98% of candidates for possible association with bone biology, we developed an efficient, comprehensive, and quantitative method that generates an objective bone score using methods similar to those described in previous work by us and others. (26,27,43) We specifically defined a bone score that estimates the potential association of each gene to bone biology when compared against a reference set of 770 genes already well-known to be associated with 34 bone gene ontology (GO) terms. We exploited the same femur-expression data set (GN410) that includes probes that target essentially all protein-coding transcripts. For each of 46,621 probes, we first calculated the absolute Spearman correlation between the mRNA abundance of the gene and all other transcripts. We then selected the top 1000 probes with the highest correlations and performed a GO enrichment analysis (biological process) based on the hypergeometric test (44)(45)(46) : where M represents the total number of genes targeted by all probe IDs (n = 30,880); N represents the number of unique genes among the top 1000 covariates; m represents the number of genes listed in the GO term; and k represents the number of genes among the top 1000 covariates that are in the GO term. For example, the top 1000 covariates of Alpl (alkaline phosphatase, probe ID ILM2340168) include 369 genes associated with "ossification" (GO:0001503). Even after correction for multiple comparisons, the p value of this GO-term enrichment is 5 × 10 −23 . We computed enrichment p values for a set of 34 bone-associated GO terms (Supplementary Data S2). The average p value of 40 well-known bone genes such as Alpl was used as a reference standard against which we compared all other genes/probes (also see Supplementary Data S2). Many candidate genes had p values that were as good as or better than those of these 369 known bone-associated genes. We converted values to -log 10 (p) across all 34 terms and used the average value as a GO-associated bone score. Genes such as Alpl that are linked to several bone GO terms typically have scores above 1 (the peak score is 10.5 for Col15a1). Genes linked to 10 or more GO terms typically have scores above 2. Genes without known links to bone have averages well below 1. A large subset of genes was further defined as members of the bone ignorome: genes that have unusually high bone scores based on this analysis, but that have no known literature associated with the skeletal system.
Finally, we generated a summary candidate score on a scale of 1 to 10 points for all genes based on: 1 Average bone score (1 to 3). The gene is assigned 1 point if its average bone score across GOs is between 0 and 1; 2 points if between 1 and 3; and 3 points if >3. 2 Highest bone score (0 to 1). Some genes have high GO scores for one or two of the 34 GOs. We therefore assigned these genes 1 point, provided that its highest bone score was between 5 and 10. 3 Coding DNA variants (0 to 3). In a recent study we defined 35,068 coding SNPs in the BXD family, of which 11,979 are nonsynonymous (nsSNP) with high Grantham scores (complete list is given in Supplementary Data S4 of Wang et al. (20) A gene is assigned 1 point if the sum of Grantham values is <100, 2 if the sum is 100 to 300, and 3 if >300. We also  identified 173 SNPs associated with nonsense, frame shift, or splice site mutation (Supplementary Data S5 of Wang et al. (20) Genes with any of these types of variants were assigned 3 points. 4 Cis-regulation (0 to 3). Genes with strong evidence of cisacting control (a so-called cis eQTL) in bone (BXD Bone Femur ILM Mouse WG-6 v1, v1.1 (Jan 13) RSN, GeneNetwork Accession Number: GN410) (17,34) were assigned 3 points. Genes with cis eQTL in cartilage (UCLA BXD Cartilage, GN Accession Number: GN178) or muscle (EPFL/LISP BXD CD Muscle Affy Mouse Gene 1.0 ST (Nov12), GN Accession Number: GN397) were assigned 2 points. Genes with cis eQTL in any other tissue were assigned 1 point.
All candidate genes received a summary candidate score on a 0 to 10 point scale (Supplementary Data S3 and Table 5). We focused analysis on a subset of 212 candidates within 16 QTL with scores greater than 4. With the long-term goal of functional validating some of these candidates, we shortened this list further, and therefore restricted more detailed analysis to the strongest 50 selected across seven robust QTL.

Candidate gene analysis
In addition to NCBI PubMed, we evaluated candidates using data extracted from four major resources: 1. The Rat Genome Database (RGD, www.rgd.mcw.edu). As of our search, RGD provided a list of 855 genes linked to bone biology across human, rat, and mouse.
2. The Human GWAS Gene Compendium (www.ebi.ac.uk/ gwas), using "bone" and "skeletal" as key words at a p threshold ≤5 × 10 −4 . We downloaded a list of 1125 genes associated with diseases that impact bone and the skeletal system.
3. The International Mouse Phenotyping Consortium (www. mousephenotype.org), a collection of phenotype data on mouse knockout lines. We obtained a set of 699 genes associated with abnormal skeletal phenotype (mouse phenotype [MP]: 0005508).
4. Using the 2017 UK BioBank (UKBB) eBMD GWAS Summary Statistics data, (47) we defined bins between the upstream and downstream SNPs with a linkage disequilibrium r 2 of ≥0.7 with the lead SNP, and calculated from European populations in the 1000 Genomes Phase III data. (48) For each bin, we identified all overlapping genes. If no genes intersected a bin, the nearest upstream and downstream genes were included. This yielded 731 genes underlying GWAS loci for eBMD (Supplementary Data S3, Sheet 731_BMD_GWAS_genes).

Results
Variation among strains was moderate to high for almost all bone traits. This included traits such as length, mineralized volume, and material BMD, as well as cortical and trabecular traits, such as thickness, porosity, and polar moment at inertia ( Table 2; GN IDs 18130 to 18279, and Supplementary Data S1). For example, femoral length ranged from 12.2 AE 0.1 mm in BXD27 to 14.7 AE 0.13 mm in BXD55. In contrast, the range of variation of BMD was much less (only 1.7% range). Variation in strain averages was of course much more modest than variation among the entire pool of individuals (Fig. 2). Typical coefficients of variation (CVs) for traits at the individual level ranged from a high of 34% for femur trabecular connectivity density to a low of 2% for tibia material BMD. For example, the threefold

BXD strain effects and heritability
Under carefully controlled laboratory conditions approximately one-third of trait variance was accounted for by strain as a factor, even after controlling for age and sex. Heritabilities ranged from 0.29 to 0.78 across all traits ( Table 2). The range for females was from 0.36 to 0.69 (mean of 0.59 AE 0.02), for males it was as from 0.29 to 0.78 (mean of 0.57 AE 0.01). Given the large sample size (n = 63 strains) and the good control over environmental factors, heritabilities were reassuringly accurate, and had coefficients of error that averaged 1.8%. The highest coefficient of error was 4.3%.

Sex differences
Although heritabilities of male and female traits were closely matched, 0.57 AE 0.01 in males and 0.59 AE 0.02 in females, values for specific bone traits differed significantly, and correlations of heritabilities were surprisingly low (0.29) across 50 common traits. With few exceptions, body weight and absolute values of most traits were greater in males than females (Table 2). For example, body weight, bone length, and bone volume (Fig. 3A, B) were all higher in males. Most microtraits also shared this pattern ( Fig. 3C-F). The few traits that did not show a sex bias were ratio-based measurements, such as femur material BMD (GN 18182 versus 18232), cortical bone porosity (GN 18185 ver-sus18235), and trabecular degree of anisotropy (GN 18204 versus 18254).

Correlational statistics of bone traits
Cortical parameters, including cortical volume, thickness, and polar moment of inertia (pMOI), correlated well with each other ( Table 3). The correlation between cortical volume (GN 18134) and pMOI (GN 18141) was 0.90, whereas that between cortical thickness (GN 18136) and pMOI was 0.47. These traits also correlated with whole-bone volume and BMD. For example, the correlation between total femur volume and cortical volume was 0.63 (GN 18131 and 18134), between femur BMD and cortical thickness it was 0.41 (GN 18132 and 18136), and between total femur      volume and cortical pMOI it was 0.58 (GN 18131 and 18141). This is not a surprising finding because cortical bone comprises the largest fraction of mouse long bone. In contrast, trabecular bone parameters-bone fraction BV/TV (GN 18146), structure model index (SMI, GN 18148), and trabecular number (GN 18149)did not correlate well with whole-bone and cortical bone parameters, except for whole-bone volume (range from 0.32 to 0.50). Both femoral and tibial traits showed the same pattern. This site-specificity suggests that the cortical and trabecular components are differentially modulated by gene variants.

Genetic correlations of bone traits with gene expression
We used GN to extract the top 1000 transcripts with expression that correlate highly to bone phenotypes. Lists of these topranked transcripts (mean expression level >7.0) were exported to WebGestalt (Web-based GEne SeT AnaLysis Toolkit: www. webgestalt.org) for GO analysis. (44)(45)(46)49) We focused attention on transcripts with highly significant correlations and those that encode extracellular bone matrix, calcium-modulating molecules, receptors, second messengers, and relevant hormonal agents and cytokines (Supplementary Data S4). For example, femur trabecular fraction (BV/TV) covaries tightly with Bmp2 and Bmp7, as well as with Ifitm1 and Ifitm5. These genes are potential regulators of ossification and bone mineralization.

Mapping bone microtraits
We generated maps for all phenotypes (GN 18130 to 18279) and identified 16 loci on chromosomes (Chr) 1, 2, 6, 7, 9, 10, 11, 12, 13, 17, and X (Fig. 1C). There were only small differences in QTL peak locations (up to 6 Mb) and linkage statistics using the two genotype files. The average maximum LRS scores for these 16 QTL were 17.2 AE 3.2 using the classic file and 18.2 AE 4.4 using the newer 2017 file. We were surprised to detect no association between heritabilities of traits and the yield of QTL. We evaluated the impact of age difference on mapping results. We compared mapping results from the complete data set of 597 mice (GN 18130 to 18279, with a log-age correction) to those from a subset of 484 mice between 65 and 116 daysof-age without log-age correction (GN 18986 to 19086). The Pearson product-moment correlations between full data with age correction and the trimmed data set without age correction ranged from 0.67 to 0.98 (0.93 AE 0.06 SE, n = 25 male and 25 female traits).
We performed additional analyses using different mapping algorithms, composite interval mapping, and adjustments for body size (see Materials and Methods section). Seven QTL were highly consistent ( Fig. 1 and Supplementary Fig. S1) and associated with bone microtraits of biological interest. (11) The peak linkage scores for these traits ranged from 17 to 22, and all were genome-wide significant. These loci accounted for 25% to 35% of strain mean genetic variance (the r 2 between the best marker in Table 4 and the strain means). Five QTL were related to cortical traits, whereas two were related to trabecular traits, including Fttf1a on Chr 1 and Ttsf11 on Chr 11.

Sex differences in QTL mapping
We succeeded in mapping significant QTL for female traits, but surprisingly not for the corresponding traits in males. The seven robust QTL considered above were all detected in females, but most of the corresponding chromosomal regions in males did not even reach the suggestive criterion ( Table 4). The only exception was tibia cortical thickness in males (GN 18261), which reached a suggestive level in males (LRS approximately 11). Among the subset of seven most robust QTL, four were associated with sex-by-genotype effects. This sex bias in mapping success was unexpected, particularly because heritabilities were so closely matched (Supplementary Data S1). However, we note that prior mouse QTL mapping studies have detected high levels of sexual dimorphism. (50)(51)(52)(53)(54) In addition, male and female body weight covaried well (r = 0.69, n = 53, GN IDs 18547 and 18548), but QTL also differed, with suggestive peaks on Chr 1 in males (LRS of 15 at 120 Mb and a high D allele) and on Chr 8 in females (LRS of 14 at 80 Mb with a high D allele). Of importance, QTL for body weight did not match up with any of the bone trait loci. The converse was also true: In comparison with the only QTL detected in males (GN 18248, LRS 19.7), the corresponding female trait (GN 18198) had an LRS of <1.0 at the same location (approximately 35 Mb on Chr 9). Suggestive QTL also shared the strong female bias: 42 in females, 19 in males.

GO validates known bone-associated genes and defines new bone-associated genes
We extracted sets of genes linked to 34 bone-associated GO terms using femur gene expression data. (17) For example, the GO-term bone trabecula formation includes 10 well-accepted reference genes: Chad, Col1a1, Fbn2, Grem1, Mmp2, Msx2, Ppargc1b, Sfrp1, Thbs3, and Wnt10b. We computed the first three principal components and their eigengenes (GN18479, 18480, 18481) that summarize expression variance in this GO reference set. These eigengenes correlate well with femoral traits and with tibial trabecular traits (|r| between 0.50 and 0.70). But of much greater interest, these eigengenes can also be used to highlight novel genes that may also be linked to bone biology, more specifically, those that may fall within new QTL.
We extended this analysis to 34 major bone ontology terms, and then computed correlations between eigengenes and the entire transcriptome. We collected the -logP values of these correlation coefficients (Supplementary Data S2, see Bone Scores) for thousands of genes. Nearly 200 transcripts (more formally, probes) have average -logP bone scores >7.0 across all 34 GO terms. Surprisingly, among the top 20 with very high scores, >9, only three-Cd276, Bmp3, and Satb2-were already included in sets of bone GO genes. Another five-Col15a1, Unc5b, Fam78b, Dlx6, and Nkd-have been implicated in bone biology, but are not yet part of any bone-related GO. The remaining 12-P3h4, Unc5b, Srpx, C1orf198, Gxylt2, Clec11a, Sec31a, Bok, Kdelr3, Prss35, and Tmtc2-all with scores >9 have not, to the best of our knowledge, been linked with bone biology, and are therefore candidates worth more focused analysis and validation. These genes are examples of what we define as a bone ignorome.
We have used three gene data sets to define a bone ignorome (Fig. 4), with the goal of systematically highlighting positional candidates involved in the traits we have mapped. The average bone score of all protein-coding genes (blue line in Fig. 5) and of all 1344 positional candidate genes (yellow line) are generally low, and averages between these two sets are closely matched: 0.85 AE 0.01 and 0.88 AE 0.03, respectively. As expected, 770 genes already linked to bone GO terms had a much higher mean value of 1.74 AE 0.07 (green line). If a stringent threshold of 3 is chosen, then 2075 genes-the gray area between the blue and green lines (Fig. 5)-meet the criterion of being members of the bone ignorome. Not surprisingly, the percentages of genes with bone scores above 3 in these categories are 7.3%, 7.2%, and 25.2%, respectively (Fig. 5). At more stringent levels of 6 and 9, the numbers of genes above the criterion are 437 and 59, respectively.

Combined analysis of mouse candidate genes with QTL and GWAS hits in rat and human
At the level of candidate genes, only 12 out of 212 overlap those already curated in the RGD (Supplementary Data S3, Sheet 1638_sorted_212candidates_ > 4). Ten of these-Cadm1, E2f6, Fancc, Fcer1g, Fkbp10, Gja1, Hc, Ihh, Ptch1, and Smarcal1-are defined as having a role in bone biology in mice, one in rats (Cadm1), and two in humans (Grem2 and Ihh). RGD lists 295 bone QTL and 646 bone-associated candidate genes in mouse; 213 QTL and 269 genes in rat; and 70 QTL and 230 candidate genes in human. Collectively, in mouse roughly half of all chromosomes are covered by one or more of 295 QTL. We compared 16 of our prime QTL with those previously mapped in mouse. Fourteen are novel. Three define entirely new loci (labeled in red in Fig. 1C): Chr 6 (Ttda6) and Chr X (FcvfXa and FcvfXb). The other 11 overlap previously reported QTL, but all are μCT-derived structural traits not directly related to BMD; we therefore consider them novel (labeled in green in Fig. 1C), including Fttf1a, Tcv2, Fpmoif7, Ftsmoim9, Ttdaf9, Fpmoif10b, Ttsf11, Fcvf12, Tmoif13, Tct13, and Fmoif17. The only two QTL that have similar phenotypes and map positions, and that are therefore provisionally replicated are Fttf1b (55) and Fpmoif10a (56) (labeled in black in Fig. 1C). Finally, eight of our candidate genes are known to cause abnormal skeletal morphology when knocked out in mice: Greb1, Praf2, Timp1), Nr1d1, Thra, Tmem63a, Nsun2, and Sdhc had summary candidate scores >4.
Finally, we extended the analysis of candidates by aligning to human GWAS results. Eight of 12 candidates that we tested had relevant human hits (see Supplementary Data S3, Sheet 1638_sorted_212candidates_ > 4). Here we highlight just two.
(1) Grem2 is located on Chr 1 in both mouse and human and overlaps an association for BMD on Chr 1 spanning from 240.1 to 241.1 Mb. (5) Grem2 is a member of a family of bone morphogenic protein antagonists expressed in bone that has been linked to low BMD. (57,58) (2) Cadm1 is a cell adhesion molecule downregulated in many cancers. (59) Cadm1 in human overlaps an association for estimated BMD. In the context of osteosarcoma, Cadm1 is expressed on the osteoblast cell surface and is used as a marker of differentiation. (60) Additionally, Cadm1 plays a role in NFATc1 modulation of osteoclast activity. (61)

Synopsis
We used high-resolution μCT imaging to measure bone traits in both sexes of a large cohort of highly diverse strains of mice. We selected a subset of 25 traits from trabecular and cortical compartments of tibia and femur for genetic dissection in each sex across 50 to 61 members of the BXD family. These microstructural traits have heritabilities that range from 30% to 78% in both sexes. We successfully mapped 16 QTL-10 for femur and 6 for tibia-and we generated a list of 1638 candidate genes within 1.5 LOD CIs. Surprisingly, no QTL were shared between sexes, and we were far more successful in defining QTL for females than males, suggesting strong sex hormone and reproductive differences in bone genetic architecture (62) and in the modulation of bone microstructure. For these 16 QTL, we filtered and extracted the seven most consistent loci that we regard to be of highest interest. We nominated 50 genes with strong associations to skeletal homeostasis and that had high summary bone scores using a novel transcriptome-GO annotation strategy. Of this list, seven candidate genes have been linked with bone biology or abnormalities in human and animal models, including Adamts4, Ddr2, Darc, Adam12, Fkbp10, E2f6, and Adam17, whereas another four have been linked either in human (Grem2 and Greb1) or in vitro animal models (Ifi204 and Ifi202b). All are worth additional genetic and molecular studies to test their roles in bone biology and their expression patterns in osteoblasts and osteoclasts. Molecular and cellular functions of the remaining 39 genes are still largely unknown, particularly with respect to the skeletal system. Some have remarkably high bone scores and are therefore primary candidates, especially Ly9, Ifi205, Mgmt, F2rl1, and Iqgap2.
The significance of computing bone ignorome scores for candidate gene ranking We compared all 16 bone QTL with hundreds of rodent bone loci listed in the RGD. (63,64) Of these 16, 3 are completely novel: 1 on Chr 6 (Tcv2) and 2 loci on Chr X (FcvX and FcvfX). The other 13 overlapped known BMD loci, but only two actually had similar phenotypes and map positions: Fttf1b and Fpmoif10a. The other 10 were specifically linked to μCT bone traits and were therefore novel.
Given the large number of QTL and positional candidate genes we uncovered, we needed to develop efficient and objective methods to evaluate candidates and their potential role in bone biology. Of 1638 genes overlapping locations of QTL, only 36 (approximately 2%) have been linked to any major bone and skeletal system GO term (see Materials and Methods section). Major skeletal system GO terms currently only include approximately 360 of 24,495 genes in the mouse genome (approximately 1.5% of all coding and noncoding genes), and approximately 404 genes when the list is expanded to include rat and human genomes. RGD lists 652 genes associated with bone structure and function in mouse: roughly 2.7% of all genes. Even this higher value is likely to seriously underestimate the number of genes and the fraction of genome associated with the development, structure, function, and homeostasis of bone and the skeletal system.
An innovation of the present study is that we define a bone ignorome by computing a score using reference gene sets and patterns of gene coexpression. The mean score of well-known reference genes is 1.74. In comparison, many genes that currently have no known association with bone biology have much higher scores. We applied a reasonably stringent criterion and define genes with a score of 3.0 or higher as members of the bone ignorome. This approach generated a set of 2075 genes potentially associated with bone biology and the skeletal system (Fig. 5)-a value that we believe is more in line with the numbers of genes likely to have an important and possibly selective effect in bone biology. (2) Candidate gene ranking We reviewed 16 QTL to evaluate their replicability when using subsets of data with a narrow age range (65 to 116 days) without age correction (see traits GN 18986 to 19086). This reduced sample size by a third, but did not affect number of strains. As expected, mean linkage scores were reduced by the smaller sample size. Seven loci were insensitive to age as a confounder, and from these seven we nominated 50 candidate genes ( Table 5, selecting only those with high summary scores. Two of these are described below as examples of compelling candidates; however, this entire set is worth a further systematic analysis. Grem2 (gremlin 2, DAN family BMP antagonist) encodes a member of the BMP antagonist family and is a strong candidate for femur trabecular thickness at the Fttf1a locus on Chr 1. The effect of this gene on BMP signaling and osteoblast differentiation has been confirmed in in vitro studies. (65,66) In humans, genetic variants near Grem2 influence expression in osteoblasts and are associated with fracture risk. (5) Another study has reported that the minor allele of rs4454537 in Grem2 is associated with low BMD in the hips of a southern Chinese population. (57) Our findings suggest that the BXD family, females in particular, would be a good starting point to test genetic and molecular control of Grem2 and its possible modulation of trabecular thickness.
Greb1 is a robust candidate for cortical bone volume at the Fcvf12 locus on Chr 12. This locus also has a strong sex bias with an LRS of 19.5 in females, but only 1.1 in males. Greb1 is responsive to estrogen in breast tissue. (67) It is expressed in prostate, and its promoter contains potential androgen receptor-binding sites. (68,69) In humans, Greb1 is associated with BMD at two sites with high fracture rates: the femoral neck and lumbar spine. (70) However, the association of Greb1 with bone biology has not been reported previously in animal models. Because both estrogen and androgen are strong modulators of bone remodeling, it is plausible that Greb1 is a target for both further osteoporotic research and a target for the prevention and treatment of postmenopausal osteoporosis.
In addition to these two obviously strong candidates that already have been characterized in humans, the following genes that have both missense mutations and strong cis eQTL should be the focus of further molecular, experimental, and genetic studies: Mgmt, Mrps27, Fndc1, and Krt10. These genes are candidates for femur polar moment of inertia, tibia cortical thickness, femur moment of inertia around the longer axis, and tibia trabecular number, respectively.

Sex differences
Phenotypic differences of bone traits between sexes were large, but perhaps not surprising. Most values for males were greater than those for females. However, we were surprised by the marked sex imbalance in the yield of QTL. Although heritability of traits was roughly matched-means of 0.59 for females and 0.57 for males-the sex correlations across many traits and strains were at best modest, about 0.43 AE 0.02 SE. Males and females were often littermates, and all phases of phenotyping were carried out without batch processing by sex. Heritabilities did not predict whether a QTL was detected in one sex or the other, nor was the failure to detect QTL in males an artifact of the thresholds we used to declare mapping victory. For example, femur trabecular thickness in females was linked to two strong and independent QTL on Chr 1 with LRS scores of 16 and 20. This trait in males did not even reach a suggestive level anywhere in the genome, and had a peak LRS of merely 7 on Chr 1. We therefore believe that sex differences in mapping reflect underlying differences in genetic architecture and, of course, life history and reproductive roles. Traits in males may be controlled by larger numbers of loci with smaller effects or controlled to a much greater degree by undetected epistatic interactions. Traits in females are likely to be strongly coupled with reproduction and life history. Although sex-specific modulation of bone and many other traits in animal models and in human studies is accepted almost as a truism, (54,71,72) studies by Randall and colleagues (73) and Yang and colleagues (74) highlight the comparative rarity of sex-specific gene effects on stature and on BMI. In humans, only the waist:hip ratio is arguably dimorphic, and all seven replicable and significant loci listed by Randall and colleagues for this trait are detected strongly only in females (their Table 1 (73) ).

Site specificity
Over the past decade, more than 150 loci for bone-associated traits have been mapped in many mouse crosses. (75,76) Causal gene variants have been successfully defined for more than 10 of these, including Asxl1, Bbx, Cadm1, Cdh11, Fam73B, Prpsap2, Setdb1, Slc38a10, Spns2, Trim45, and Trpsl. (77)(78)(79) Most previous work 'over 120 of these loci' has involved only BMD, either of the whole body or bone compartments. In comparison to this simple composite measure, we have focused almost exclusively on μCT traits from deep phenotyping. But for reference to previous work, we have also included BMD measurements. Although BMD by DEXA is still the preferred way to screen and diagnose osteoporosis, it is an aggregate of both cortical and trabecular compartments. We measured three μCT-derived BMDs from whole bone, cortex, and trabecula in approximately 600 cases, but failed to map any associated loci. In contrast, we were able to detect loci for cortical and trabecular traits, particularly in females.
Remodeling and turnover of cortical and trabecular bones are differentially controlled and regulated. (80) Trabecular bone is composed of internal rods and plates, forming a lattice that is the primary repository of bone marrow. Because of its close proximity with marrow and marrow-derived cells, trabecular bone has a higher level of turnover than cortical bone. (81) Our genetic dissection of these two compartments confirms the distinction. We performed a correlation test, and representative trabecular parameters did not correlate with cortical bone or whole-bone parameters: bone fraction BV/TV, SMI, or trabecular number. However, individual trabecular and cortical sites did map well, suggesting that gene variants have relatively precise effects on specific regions and compartments.
There was no overlap among the six loci for trabecular bone and the 10 loci for cortical bone. Five of six trabecular loci had significant p values for sex-by-genotype interactions. In contrast, only two of the femur cortical loci had sex-by-genotype effects. Taken together, this confirms that trabecular and cortical microtraits are differentially and independently modulated. These findings underscore the importance of precision phenotyping in mapping and in experimental precision medicine. (82,83) Advantages and limitations This is one of the first genetic studies of bone microarchitecture in mouse using μCT. Our method provides a precise way to quantify and image microarchitecture in trabecular and cortical bone compartments with a resolution of 10 microns or less. One challenge of μCT is the large number of summary values generated per bone. We chose to evaluate and map 25 traits for each bone that are generally regarded of great biological interest. This was still a large number, and raised the issue of the study-wide falsepositive rates. As usual, all mapping had been corrected for genome-wide testing, but not corrected for numbers of traitsa total of 150 entered into GN. As a result, some loci were likely to be false discoveries. This was the main motivation for extracting only a core set of seven QTL that we regarded as robust in the sense that they are insensitive to variation in age, body weight, genotype file (old versus new), and mapping algorithm (Haley-Knott versus GEMMA), as well as the distribution and transformation of the phenotype (original data or winsorized data to minimize the impact of outliers). There is inevitably still some risk of false discoveries, but we regard these QTL to be strong enough to warrant independent validation using, for example, CRISPR-Cas9 engineering, pharmacological manipulation, or in-depth omics analyses. A straightforward alternative at this point would also be to extend the study using an independent panel of BXDs or their diallel progeny. There is an additional set of 80 BXD strains that have not been phenotyped at all. (21) One could also broaden the analysis to include other intercrosses (84) and Diversity Outbred stock. (85) Further limitations of this study should be highlighted. First, we used a relatively modest number of genomes (n = 50 to 63) and replicates within strain by sex (n = 5). We therefore cannot evaluate strain-specific sex effects. However, we were able to evaluate overall sex differences for all phenotypes and mapping results. Second, the sample size was still too small to detect epistatic interactions, but with the recent boost in the BXD family size to 150 strains, (21) we can now detect stronger two-locus interactions. (86) Third, by incorporating a global analysis of gene ontologies we were able to efficiently filter gene candidates. The GO terms that we selected are obviously not yet complete, and this leads to some false-negatives-many of which we hope we have highlighted using the bone ignorome scores. Fourth, many variants known to be involved in BMD in some human populations will not be detected in the BXD family, or for that matter, even in other human populations. This is principally because of differences in genetic architecture, frequencies of DNA variants, and key environmental factors. For example, a missense variant in LRP5 (rs3736228) that strongly influences BMD (9) has a significant lower minor allele frequency in African populations compared with other populations (https://gnomad. broadinstitute.org/). Is this well-known human variant influential in the BXD mouse family? To test its role in mouse we used a reverse genetic method. (20,87) The linkage of LRP5 to BXD bone phenotypes was modest (-logP of 2.46 with GN18279); for this reason, the murine Lrp5 gene did not enter our list of candidates. However, the phenome-wide association analysis (PheWAS) in mouse does illustrate how human candidates can be quickly reviewed using both www.genenetwork.org and a new PheWAS tool (www.systems-genetics.org).

Future directions
The long-term direction of this work is to transition from locus to gene to mechanism to potential preclinical therapy. The first step is to achieve high-quality quantitative measures relevant to bone strength and metabolism and to demonstrate a heritable control of variation. The second step is to demonstrate that single loci can be defined with sufficient precision to nominate strong candidate genes. Our work has reached the end of this second stage. What are the subsequent steps that will most efficiently validate candidates and test them as therapeutic targets? One approach would be to produce a multispecies meta-analysis of genes implicated in bone function. Those shared genes across species will be the most relevant candidates. Another possibility is to systematically study gene-by-environmental interactions; something far more efficiently handled using cohorts such as the BXDs in which identical genometypes in sex-matched pairs can be exposed to several environments or treatments. In addition, pharmacological and molecular methods are needed to validate candidates and mechanisms. Finally, we need to develop higher throughput ways to test therapeutic interventions starting at early stages and using rigorous quantitative methods: what we call experimental precision medicine. Again, cohorts of isogenic animals for which we have superb baseline data will be an essential resource to achieve this last goal, and to evaluate the impact of treatment as a function of genometype and environment.

Disclosure
The authors declare that they have no competing interests.