Exome Sequencing Reveals a Phenotype Modifying Variant in ZNF528 in Primary Osteoporosis With a COL1A2 Deletion

ABSTRACT We studied a family with severe primary osteoporosis carrying a heterozygous p.Arg8Phefs*14 deletion in COL1A2, leading to haploinsufficiency. Three affected individuals carried the mutation and presented nearly identical spinal fractures but lacked other typical features of either osteogenesis imperfecta or Ehlers‐Danlos syndrome. Although mutations leading to haploinsufficiency in COL1A2 are rare, mutations in COL1A1 that lead to less protein typically result in a milder phenotype. We hypothesized that other genetic factors may contribute to the severe phenotype in this family. We performed whole‐exome sequencing in five family members and identified in all three affected individuals a rare nonsense variant (c.1282C > T/p.Arg428*, rs150257846) in ZNF528. We studied the effect of the variant using qPCR and Western blot and its subcellular localization with immunofluorescence. Our results indicate production of a truncated ZNF528 protein that locates in the cell nucleus as per the wild‐type protein. ChIP and RNA sequencing analyses on ZNF528 and ZNF528‐c.1282C > T indicated that ZNF528 binding sites are linked to pathways and genes regulating bone morphology. Compared with the wild type, ZNF528‐c.1282C > T showed a global shift in genomic binding profile and pathway enrichment, possibly contributing to the pathophysiology of primary osteoporosis. We identified five putative target genes for ZNF528 and showed that the expression of these genes is altered in patient cells. In conclusion, the variant leads to expression of truncated ZNF528 and a global change of its genomic occupancy, which in turn may lead to altered expression of target genes. ZNF528 is a novel candidate gene for bone disorders and may function as a transcriptional regulator in pathways affecting bone morphology and contribute to the phenotype of primary osteoporosis in this family together with the COL1A2 deletion. © 2020 The Authors. Journal of Bone and Mineral Research published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research (ASBMR).


Supplemental methods
Whole exome data analysis DNA samples from five family members (Supplemental Figure S1) were studied using whole exome sequencing. DNA was fragmented and enriched at the BGI (http://www.genomics.cn/en/index) by using NimbleGen SeqCap EZ Human Exome v3.0 kit. The enriched DNA was subjected to Illumina HiSeq sequencing. Clean reads were then aligned with the reference genome (hg19) and quality controlled according to the BGI Bioinformatics pipeline.
Single nucleotide variants (SNV) or small insertions or deletions (ins/dels) were detected (at least 50x coverage by reads, base call with Phred quality scores greater than 20) and filtered using Chipster.
Data was analyzed using BEDTools software. Only private or rare (minor allele frequency (MAF) < 0.01) variants that were shared by the affected individuals and not present in the unaffected individuals or in the in-house exome set (N=71) were studied further. These remaining variants were functionally annotated using wANNOVAR to identify variants estimated to be pathogenic by SIFT, Polyphen-2 and Mutation Taster. Variants without annotations were not excluded. Variants were validated using capillary sequencing with ABI3500xL Genetic Analyzer (Applied Biosystems) from all family members to find alleles co-segregating with the phenotype. The frequencies for cosegregating variants in the general Finnish population were obtained from the The Exome Aggregation Consortium browser (ExAC, https://exac.broadinstitute.org/) and the Genome Aggregation Database (gnomAD, https://gnomad.broadinstitute.org/).

Expression of the ZNF528 in vitro
ZNF528 pcDNA3.1 constructs. To study the effect of the ZNF528 c. 1282C>T/p.Arg428* variant two expression constructs were made: a wild type ZFF528 (V5-ZNF528-WT) and a construct with the identified variant (V5-ZNF528-c.1282C>T). Briefly, the wild type ZNF528 cDNA was commercially acquired (GenScript, Piscataway, NJ, USA) and the c. 1282C>T/p.Arg428* variant was introduced using QuickChange site directed mutagenesis kit (Stratagene, La Jolla, CA, USA). The cDNA was amplified using PCR, cut with XbaI and BamHI restriction enzymes and ligated into pcDNA3.1. (-) expression vector. V5-tag was introduced to the N-terminus. The constructs were amplified by transformation into XL1Blue E.coli strain and culturing overnight at 37°C in 100 ml of LB broth (1g/100ml NaCl, 1g/100ml Tryptone, 0.5g/100ml yeast extract) with thiamine, tetracycline and ampicillin. The constructs were extracted using QIAfilter maxi kit (Qiagen, Chatsworth, CA, USA) and verified by capillary sequencing. Culturing primary skin fibroblasts. Primary skin fibroblasts were cultured to obtain total RNA for qPCR analysis to study the expression of ZNF528 target genes in patient cells. Primary skin fibroblasts from two patients, and three age and gender matched unrelated controls were cultured at Data was analyzed using the 2(-Delta Delta C(T)) -method (39) . Beta-actin (ACTB) and beta-2microtubulin (B2M) were used as reference genes for HEK293 cells and hypoxanthine-guanine phosphoribosyltransferase 1 (HPRT1) and succinate dehydrogenase complex flavoprotein subunit A (SDHA) for primary skin fibroblasts.
Chemiluminescence signal was imaged using a LAS-3000 Luminescent Image Analyzer (FujiFilm, Tokyo, Japan). The protein bands were normalized to β-actin and Image J software (National Institute of Mental Health, Bethesda, MD) were used to quantify immunoblots.

Cell viability and cytotoxicity assay
The cells of different groups were trypsinized and seeded into 96 well culture plates with 2 x 10 3 each well. Cell viability was determined by using Cell Proliferation kit II (11465015001, Roche). We collected the data at the indicated time by measuring the absorbance at 450 nm according to the manufacturer's instructions. The cytotoxicity of cells were tested by CellTox Green Cytotoxicity Assay kit (G8743, Promega). The cells from each group were resuspended as 4 x 10 3 each well. The fluorescence was read with 485nm excitation source and 520nm emission filter. Data was obtained from triplicate wells and statistical significance was calculated by the two-tailed Student's t test with equal variances.

Chromatin immunoprecipitation (ChIP) sequencing
Saos-2 cells were cross linked in formaldehyde (final concentration one percent) for 10 minutes, and 125 mM glycine was used to stop the reaction. A hypotonic lysis buffer (10 mM KCl, 10% glycerol, 20 mM Tris-Cl, pH 8.0, 2 mM DTT, and a cOmplete EDTA-free protease inhibitor cocktail (Sigma-Aldrich, St Louis, MO, USA) was used to suspend cell pellets followed by incubation at 4°C for 50 minutes. The nuclei were suspended using SDS lysis buffer (50 mM Tris-HCl, 10 mM EDTA, pH

ChIP sequencing data analysis
Motif analysis and peak annotation. FastQC was used to assess the quality of raw sequence reads and Trimmomatic version 0.35 was used to remove adaptors and trim reads. Processed reads were next aligned to the human genome (hg19) using BWA mapper. The creation of tag directories, peak calling and motif analysis were performed using HOMER. The peak regions across the genome hg19 were annotated and peak-associated genes and differentially expressed genes from RNA sequencing for V5-ZNF528-WT, V5-ZNF528-c.1282C>T were intersected.
Gene-Ontology enrichment analyses. Functional enrichment analysis was performed using HOMER.
The analysis of distribution of ChIP sequencing peaks across the genome was performed using R package "ChIPseeker" version v1.18.0.
A total of 2260 and 29190 unique peaks for the V5-ZNF528-WT and V5-ZNF528-c.1282C>T respectively, were obtained after excluding 19 common binding sites. Genomic Regions Enrichment of Annotations Tool (GREAT) was employed to predict functions of cis-regulatory regions for ZNF528-WT ChIP-Seq and ZNF528-c.1282C>T ChIP-Seq unique binding sites. Gene regulatory domain was defined with the parameter "single nearest gene" and background regions was selected as "whole genome". For large data sets, the option "significant By Region-based Binomial" that applies a binomial test over genomic regions was further selected.
In addition, re-analysis was performed for the ChIP sequencing data of wild type ZNF528 that is publicly available (http://zifrc. ccbr.utoronto.ca/) and reported earlier (34) . The downloaded data included three different motif data sets: 1) the ZNF528 motifs predicted by bacterial one-hybrid recognition code (B1H-RC motifs), containing 3367 motifs; 2) the de novo motifs identified by MEME, containing 11 841 motifs and 3) the de novo motifs most similar to the B1H-RC motifs trimmed based on their alignment with the B1H-RC motifs (De Novo Similar to B1H-RC Trimmed Motifs), containing 8 009 motifs.
To investigate whether the ChIP sequencing peaks are linked to genes with any functional annotations, ontology analysis was performed using the GREAT. An enrichment analysis was carried out for Gene ontology biological process (GO BP) terms and Mouse and Human phenotypes was performed. Amongst these pathways, genes that had the ZNF528 binding site and had a previously known connection to bone phenotypes were identified. These genes were considered as potential ZNF528 target genes for further analyses. The FASTQ data was prepared within BaseSpace (Illumina, San Diego, CA, USA).

Differential gene expression analyses
DESeq2 version 1.22.2 was applied to perform differential gene expression analysis in R (version 3.5.2). For calling differentially expressed genes the false discovery rate (FDR) cutoff was set to < 0.01. The read counts were normalized with shifted logarithm transformation for plotting heatmaps using R package pheatmap (version 1.0.12).
Functional enrichment analysis using WikiPathways database was performed through HOMER software. We performed the gene overlapping analysis for genes from ZNF528 WT and ZNF528c.1282C>T obtained from the overlap of RNA-Seq differentially expressed genes and ChIP-Seq peak annotated genes. Common overlapping genes were excluded and unique genes of ZNF528 WT and ZNF528-c.1282C>T were then converted to Entrez Gene IDs and followed by the submission to HOMER "findGO.pl" function for the assessment of the enrichment of ontology categories. The "bg" parameter of the "findGO.pl" was set as default.
We checked the common overlapping genes between ZNF528 WT or ZNF528-c.1282C>T with osteoporosis-related genes from OsteoporosAtlas (http://biokb.ncpsb.org/osteoporosis/), which is a manually curated database for human osteoporosis-related genes. We then checked the statistical enrichment of osteoporosis-related genes in the ZNF528 WT or ZNF528-c.1282C>T unique gene lists using Pearson's Chi-squared test in R (V 3.6.2).

Supplemental Figures
Supplemental Figure S1 Pedigree of the family with primary osteoporosis. Black symbols represent affected individuals. The individuals included in the whole exome sequencing are marked with asterisk.
16 Supplemental Figure S6 V5-ZNF528-WT and V5-ZNF528-c.1282C>T expression in nucleus and cytoplasm was observed in HEK293 cells using the Western blot method.
The uncropped Western blots are presented in Supplemental Figures S7-S8.

Supplemental Figure S9
Absorbance measurements of the cell viability assay. Absorbance was measured at 450 nm.

Empty vector
Results of the cell cytotoxicity assay. The fluorescence was read with 485nm excitation source and 520nm emission filter. Data was obtained from triplicate wells and statistical significance was calculated by the two-tailed Student's t test.
Mouse phenotypes' ontology category contains data of mouse genotype-phenotype associations, and the x-axis values (in logarithmic scale) corresponds to the binomial raw (uncorrected) p-values.

Supplemental Figure S12
Pathway enrichment analysis for ZNF528-WT unique target genes and statistical significance for osteoporosis genes. (A) Wikipathways enrichment analysis for ZNF528-WT targeted unique genes. Chi-square test.

Supplemental Tables
Supplemental Table S1 Summary of the clinical information.  Table S5 Genes identified in the pathway enrichment analysis for RNA-seq and ChIP-seq data (related to Supplemental Figure 12S B).