The human Y and inactive X chromosomes similarly modulate autosomal gene expression

Summary Somatic cells of human males and females have 45 chromosomes in common, including the “active” X chromosome. In males the 46th chromosome is a Y; in females it is an “inactive” X (Xi). Through linear modeling of autosomal gene expression in cells from individuals with zero to three Xi and zero to four Y chromosomes, we found that Xi and Y impact autosomal expression broadly and with remarkably similar effects. Studying sex chromosome structural anomalies, promoters of Xi- and Y-responsive genes, and CRISPR inhibition, we traced part of this shared effect to homologous transcription factors—ZFX and ZFY—encoded by Chr X and Y. This demonstrates sex-shared mechanisms by which Xi and Y modulate autosomal expression. Combined with earlier analyses of sex-linked gene expression, our studies show that 21% of all genes expressed in lymphoblastoid cells or fibroblasts change expression significantly in response to Xi or Y chromosomes.


In brief
Using human cells with variation in sex chromosome copy number, San Roman et al. find widespread transcriptomic impacts of the inactive X (Xi) and Y chromosome on autosomes.While Xi elicited larger gene-by-gene effects, Y's effects were largely concordant.They demonstrate that ZFX and ZFY, a pair of conserved transcription factors encoded by the X and Y chromosomes, mediate much of this shared genome-wide transcriptional program.Correlated response to Xi and Y copy number ) s t n u o c (

INTRODUCTION
The human X and Y chromosomes evolved from ordinary autosomes over the course of the last 200 million years. 1,2Today, these homologous chromosomes-typically two X chromosomes in females, and one X and one Y chromosome in males-comprise the oldest, most massive variation in the human genome.
Until recently, this chromosome-scale variation was understood to be of little direct consequence outside the reproductive tract, where the decisive roles of sex chromosome constitution are well established.The reasoning was straightforward: the second X chromosome in 46,XX cells is condensed and transcriptionally attenuated through the process of X chromosome inactivation (XCI), 3,4 and the Y chromosome carries relatively few genes, many of which are not expressed outside the testes. 5,63][14] Still, it is not known whether these differences in autosomal transcriptomes are driven by sex chromosome constitution, sex steroid hormones, or other biological, behavioral, or environmental factors.
Diploid human somatic cells invariably have one active X chromosome (Xa); most also have an inactive X (Xi) or a Y chromosome, and some have additional Xi or Y chromosomes.We previously modeled expression of X-or Y-linked genes as a function of Xi or Y copy number in cells cultured from individuals with various sex chromosome constitutions: one to four copies of chromosome (Chr) X (zero to three copies of Xi), and zero to four copies of Chr Y. 15 Expression of 38% of X-linked genes changed with additional copies of Xi, and expression of nearly all Y-linked genes increased with additional copies of Chr Y.These Xi-and Y-driven changes were remarkably modular, with successive Xi or Y chromosomes having quantitatively similar effects, facilitating linear modeling.By incorporating allele-specific analyses, we determined that the Chr X gene expression changes are due to a combination of (1) expression from Xi alleles and (2) modulation of Xa transcript levels by Xi, in trans.
Given that Xi modulates Xa genes in trans, the question arises whether Xi also modulates expression of autosomal genes, potentially via similar mechanisms.A related question is whether Chr Y modulates autosomal gene expression.Based on our recent success in linear modeling of sex chromosome gene expression, 15 we now adapt and expand that modeling to measure how variation in the number of X or Y chromosomes affects autosomal transcription in cultured human cells.We quantified the impact of variation in both Xi and Y chromosome copy number on autosomal transcripts, explored mechanisms responsible for those effects, and discovered that a conserved pair of transcription factors expressed from Xi and Y contribute to modulation of gene expression on autosomes and Xa.An unanticipated finding is that the autosomal impacts of Xi and Y chromosomes are remarkably similar.

Thousands of autosomal genes respond to Chr X and Chr Y copy number
We examined the effects of Chr X and Chr Y copy number on autosomal gene expression using our recently reported RNA-seq dataset from 106 lymphoblastoid cell lines (LCLs) and 99 primary dermal fibroblast cultures. 15These lines and cultures derive from 176 individuals with one to four X chromosomes and zero to four Y chromosomes (Figure 1A, Table S1).Previously, we reported that each Chr X or Y copy contributes an equal increment of expression for 38% of Chr X genes and nearly all Chr Y genes, which are well-fit with linear models. 15For each expressed autosomal gene, including protein-coding and long non-coding RNA genes with at least one transcript per million in either 46,XX or 46,XY samples, we fit a linear regression model in DESeq2 16 to estimate the log 2 fold change in expression per copy of Chr X or Y (STAR Methods).
This analysis revealed that Chr X and Chr Y copy number had widespread effects on autosomal expression.The magnitudes of individual gene expression changes per Chr X or Y were modest-with most changes less than 1.5-fold-but many genes were affected.There were 1,993 significantly (adjusted p value [P adj ]<0.05)X-responsive genes in LCLs-comprising 18% of expressed autosomal genes-and 606 in fibroblasts-5% of expressed autosomal genes (Figures 1B-1D; full results in Table S2).Chr Y copy number had a statistically significant effect on substantially fewer autosomal genes: 662 Chr Y-responsive genes in LCLs-6% of expressed autosomal genes-and 226 in fibroblasts-1.9% of expressed autosomal genes (Figures 1E-1G, Table S2).Gene ontology analyses of the Chr X-or Y-responsive genes yielded few significantly enriched functional categories (STAR Methods, Figure S1).We conclude that variation in Chr X and Y copy numbers significantly and broadly alters autosomal gene expression.
We considered and tested various interpretations of these findings.First, we asked whether X-or Y-responsive autosomal genes were statistical artifacts of enhanced sex-chromosomal expression altering the genome-wide distribution of sampled read counts.To this end, we renormalized the data and refit the same linear models using only autosomal genes, obtaining virtually identical results (STAR Methods, Figure S2).
Second, we tested whether there was any impact of modeling gene expression as a function of both variables-Chr X copy number and Chr Y copy number-together compared with models in which we used subsets of samples to vary either Chr X or Chr Y copy number individually.We found that the full model was highly correlated genome-wide with models of the individual Figure 1.Genome-wide response to Chr X or Y copy number is distinct in two cell types (A) RNA-seq data from LCLs and fibroblasts spanning a range of Chr X and Y copy numbers were analyzed using linear regression, modeling log 2 expression as a function of Chr X copy number, Chr Y copy number, and batch.Number of samples per karyotype in LCLs: 17   contributions of Chr X and Y copy number (Figure S3).Moreover, by conducting the Chr X analysis in samples with and without Y chromosomes, we found that the autosomal effects of Xi copy number are correlated in cells derived from anatomically male and female individuals (Figure S3).Third, we tested the impact of sample size on the number of significantly Chr X-or Y-responsive genes detected in our analysis to ask whether we have saturated the analysis with our current sample set.Using a bootstrapping analysis (STAR Methods), we found that sequencing cell lines from additional individuals would lead to discovery of more sex chromosomeresponsive genes, but they would likely have small effect sizes (Figure S4).
Finally, because individuals across our dataset have differing hormonal profiles that might be correlated with Chr X or Y copy number, we evaluated whether a memory of differential hormonal exposures in the body prior to tissue sampling might explain the results.First, we examined the expression levels of estrogen receptors alpha (ESR1) and beta (ESR2) and the androgen receptor (AR), finding low levels of expression in both LCLs and fibroblasts; only ESR2 in LCLs and AR in fibroblasts exceeded the 1 transcript per million (TPM) minimum expression cutoff, with median TPM of 1.5 and 2.3, respectively, across samples.Next, we compared autosomal genes that significantly respond to Chr X or Y copy number with direct target genes for estrogen or androgen receptors.Most X-or Y-responsive genes did not overlap with estrogen or androgen receptor direct target genes (STAR Methods, Table S3, Figure S5). 17,18There was a modest enrichment of androgen receptor targets among Y-responsive genes in fibroblasts, but their effects showed no evidence of correlation (Spearman r = 0.034, p = 0.9; Figure S5D).Thus, we did not find evidence that sex hormones explain the observed impacts of Chr X or Y copy number.In sum, thousands of autosomal genes respond significantly to Chr X or Y copy number, with more genes responding to X than to Y, and more in LCLs than in fibroblasts.

Response to Chr X or Chr Y copy number is mostly cell type specific
We investigated whether individual autosomal genes responded to Chr X or Chr Y copy number in LCLs only, in fibroblasts only, or in both cell types.The majority of significantly responsive autosomal genes were responsive in only one of the two cell types; some of these genes were ''cell-type-specific''-that is, they are only expressed in only one of the two cell types-and others were expressed in both cell types, but only responsive in one.Among the significantly Chr X-or Y-responsive genes in LCLs, 93% and 94%, respectively, were not significantly responsive in fibroblasts; among significantly Chr X-or Y-responsive genes in fibroblasts, 76% and 83% were not significantly responsive in LCLs (Figure 1H).Indeed, across autosomal genes expressed in both cell types, we observed only a weak correlation between responses to Chr X and to Chr Y (Figure S6).While most significantly responsive genes were responsive in only one of the two cell types, the overlap was greater than expected by chance (Figure 1H), and most of these overlapping genes responded with the same directionality-rising (or falling) in response to Chr X or Y copy number in both LCLs and fibroblasts (Figure 1I).Thus, a minority of the genome-wide response to sex chromosome copy number is shared between the two cell types; the majority is cell-type-specific.
Chr Xi and Y elicit similar responses across autosomes and Xa Next, we compared, within each cell type, the autosomal responses to Chr X and to Chr Y.Among autosomal genes with statistically significant responses to Chr X or Y copy number, we found that 22% of X-responsive genes and >60% of Y-responsive genes were shared-442 genes in LCLs and 136 in fibroblasts (Figure 2A).These co-regulated genes' responses to X and Y were highly concordant in polarity and magnitude (Pearson correlation in LCLs: 0.95 and fibroblasts: 0.96), suggesting that the mechanisms governing these responses might be shared (Figure 2B; see example genes in Figures 1B and  1E).Given the remarkable correlation of responses to Chr X and Y copy number for these shared genes, we further investigated the gene-by-gene effects of Chr X and Y on the ''Chr X-responsive-only'' autosomal genes.Interestingly, we observed a strong correlation of Chr X and Y responses for these genes (LCLs: 0.79, fibroblasts: 0.88; Figure S7).This highly correlated response to Chr X and Y could even be observed when assessing all expressed genes (LCLs: 0.67, fibroblasts: 0.6), with significantly larger effects in response to Chr X compared with Chr Y (Figure 2C).These results indicate that additional copies of Chr Xi and Y modulate a similar group of genes, with Xi exerting larger gene-by-gene effects than Y.
In a previous study, we showed that Xi copy number modulates gene expression from Xa in trans. 15In light of our current findings, we wondered whether Chr X and Chr Y copy number might have shared effects on Xa gene expression.0][21][22] Using these annotations, we found that slightly larger proportions of Xa-expressed genes were impacted by Chr X or Y copy number than on the autosomes: 57 (20%) and 27 (8.6%)Xa-expressed genes were significantly responsive to Chr X copy number in LCLs and fibroblasts, respectively, and 23 (8.2%) and 10 (3.2%) were responsive to Chr Y copy number (Table S4).Among these, 12 genes in LCLs and seven genes in fibroblasts significantly responded to both Chr X and Y copy numbers (Figure 2D).Like the autosomal genes, the responses of Xa-expressed genes to Chr X and Chr Y copy number were highly concordant in magnitude and polarity (Figures 2E and S7B).This suggests that Xi and Y modulate expression of a defined subset of genes on Xa in trans through shared mechanisms.
To identify the mechanisms driving the genome-wide shared response to Chr X and Chr Y copy numbers, we considered and tested three hypotheses.First, we asked whether the observed expression changes reflect a generic response to aneuploidy that is not specific to the sex chromosomes (and has been reported for autosomes 23 ).Second, we tested whether a ''heterochromatin sink'' created by additional sex chromosomes pulls heterochromatin factors away from and thereby de-represses (activates) autosomal genes.Third, we investigated the actions of regulatory genes shared between the X and Y chromosomes.We will consider each of these hypotheses in turn.
The shared response to Chr X and Chr Y copy number is not a generic response to aneuploidy To test the first hypothesis, we analyzed RNA-seq data from LCLs with three copies of Chr 21 (47,XX+21 and 47,XY+21) and compared these with our data from 46,XX and 46,XY LCLs (Figure 3A, Table S1). 15We employed linear modeling to estimate the log 2 fold change in expression per additional Chr 21, controlling for sex chromosome constitution (46,XX or 46,XY) (STAR Methods).Chr 21 had large effects across the genome: in addition to expression increases for Chr 21 genes (previously reported in San Roman et al. 15 ), we identified 980 genes on other chromosomes (including X and Y) that were significantly Chr 21-responsive (Figures 3B-3D, Table S5).
We then compared the autosomal responses to Chr 21 copy number with responses to Chr X or Chr Y copy number.Across the autosomes, Chr 21-responsive genes displayed significantly larger fold changes compared with Chr X-or Y-responsive genes (Figures 3E-3G).Importantly, most X-or Y-responsive autosomal genes did not overlap with the Chr 21-responsive gene set (Figure 3H).There was a modest overlap between Chr 21-responsive and Chr X-responsive genes, but the effects of Chr 21 and Chr X were weakly correlated (Figure 3I).Moreover, none of the Xa-expressed genes that responded to both Chr X and Y copy number in LCLs were among the Chr 21-responsive Chr X genes (Figure S8).We conclude that features unique to the sex chromosomes, rather than a generic aneuploidy response, explain autosomal (and Xa) gene expression changes in response to Chr X and Y copy number.
The response to Chr Y copy number is not due to heterochromatin sinks ''Heterochromatin sink'' effects are well documented in Drosophila, where Y chromosomes with varying amounts of heterochromatin affect expression of autosomal genes adjacent to heterochromatic regions-a phenomenon known as position-effect variegation. 24The human Chr Y has a large heterochromatic region on its long arm (Yq); in theory, the increase in heterochromatin with additional copies of Chr Y could pull heterochromatin factors away from autosomal genes, thereby increasing their expression.We studied two kinds of human variation to test whether Chr Y heterochromatin impacts autosomal gene  (legend continued on next page) expression: (1) common polymorphisms in Yq heterochromatin length, 25 and (2) rare structurally variant Y chromosomes.
To quantify Yq heterochromatin length, we mapped wholegenome sequencing data from 1,225 male samples from the 1000 Genomes Project 26 to DYZ1, a repetitive sequence specific to the Yq heterochromatic region. 6The average depth of coverage of the DYZ1 sequence, which reflects the number of repeats and size of the Yq heterochromatic region, ranged from 23.8 to 896.5 reads (after normalization to depth of coverage of a single-copy region of the Y chromosome), with a median of 202.5 (Figure S9; Table S6).For 194 samples for which RNAseq data were also available, 27 we modeled gene expression as a function of heterochromatin length as approximated by DYZ1 read depth, finding only three significantly differentially expressed genes (Figures 4A and 4B).We conclude that variation in Yq heterochromatin length does not affect genome-wide expression.
We next analyzed RNA-seq data from individuals with an intact X chromosome and a structurally variant Y ''isochromosome'' in which Yq heterochromatin is deleted and a portion of the chromosome, including widely expressed Chr Y genes on the short arm (Yp), is duplicated in ''mirror image'' orientation, resulting in higher Chr Y gene expression (Figures 4C, S10, and Table S1).If Yq heterochromatin contributes significantly to expression of shared X-and Y-responsive autosomal genes, we would expect cells missing this region to show expression like that of 45,X cells, which also lack Yq heterochromatin.Using principal-component analysis of the 442 autosomal genes responsive to both Chr X and Y copy number in LCLs, we found that samples with variant Y chromosomes did not cluster with 45,X samples but were instead more similar to 47,XYY samples (Figure 4D).Together, these experiments indicate that a heterochromatin sink does not explain the response of autosomal genes to Chr Y copy number, and therefore cannot explain the shared response to Chr X and Chr Y copy number.
NPX-NPY gene pairs drive shared autosomal response to X and Y copy number Two groups of genes common to Chr X and Y could drive the shared autosomal response (Figure 5A).First are genes in the pseudoautosomal region (PAR), a region of identity on the distal short arms of Chr X and Y; expression of these genes increases linearly with sex chromosome copy number. 15Second are homologous gene pairs in the non-pseudoautosomal regions of Chr X (NPX) and Y (NPY), which increase in expression with X or Y copy number, respectively.These NPX-NPY gene pairs are remnants of the $200-million-year evolutionary process that differentiated a pair of ordinary autosomes into Chr X, which retains 98% of the ancestral gene content, and Chr Y, which retains only 3%.The NPX-NPY pairs are involved in important cellular processes, including transcription, epigenetic regulation, and translation, 10 and have diverged in sequence to varying degrees, 6 possibly maintaining identical functions.Of the NPX-NPY pair genes, four are of particular interest because they encode proteins that affect genome-wide expression: the histone lysine demethylases KDM6A/UTY and KDM5C/KDM5D, RNA helicases DDX3X/DDX3Y, and transcription factors ZFX/ZFY.Since PAR and NPX-NPY pair gene copy numbers are usually correlated, we decoupled their effects using cell lines from individuals with structurally variant X or Y chromosomes in which PAR or NPX-NPY genes are deleted or duplicated.
We performed RNA-seq on four individuals who have, in addition to one structurally normal X chromosome, an X isochromosome in which some or all of the short arm (Xp) is deleted and the long arm (Xq) is duplicated in ''mirror image'' orientation (Table S1).Three different isochromosome structures with breakpoints in distal Xp (iXq-dist), or in proximal Xp (iXq-prox), or at the centromere (iXq-cen) were identified by fluorescent in situ hybridization (FISH) and confirmed by RNA-seq read counts (Figures S11-S13).The X isochromosomes are all missing PAR1, but they differ with respect to copy numbers of the four NPX-NPY genes of interest: iXq-dist retains them, iXq-prox is missing three (ZFX, DDX3X, and KDM6A), and iXq-cen is missing all four (Figures 5B and 5C; see Table S7 for copy numbers of PAR1 and NPX-NPY pair genes in cell lines with structural variants).
We also assessed gene expression in cells from individuals with one normal Chr X and an X-Y translocation product arising from recombination between highly similar genes in NPX and NPY, either PRKX and PRKY or ANOS1 (KAL1) and ANOS2P (KALP) (Figure 5D). 28We identified these translocations using a PCR-based screening strategy, and we confirmed that RNAseq read counts were consistent with the identified chromosomal breakpoints (Figures S12, S14, S15, and Table S7).Three individuals had PRKX-PRKY translocated chromosomes that retained PAR1 but lacked distal Yp, resulting in loss of ZFY.Four individuals had ANOS1-ANOS2P translocated chromosomes that lacked PAR1 but retained most NPX-NPY gene pairs (Table S7).
To determine if the shared autosomal response to X and Y copy number is due to PAR1 or NPX-NPY genes, we compared gene expression in the cell lines with structurally variant sex chromosomes to those with sex chromosome aneuploidy, using principal-component analysis of the shared Chr X-and Y-responsive autosomal genes in LCLs (442 genes) or fibroblasts (136 genes).If PAR1 genes drive the X-Y-shared autosomal response, we would expect all 46,X,i(Xq) and 46,X,t(X; Y)(ANOS1-ANOS2P) samples-which retain one copy of PAR1-to cluster with 45,X samples, and 46,X,t(X; Y)(PRKX-PRKY) samples-which retain two copies of PAR1-to cluster with 46,XX and 46,XY samples.Instead, we found that the clustering of 46,X,i(Xq) samples depended on the copy number of NPX genes on Xp.For example, the 46,X,i(Xq-cen) and 46,X,i(Xq-prox) samples-with one copy of most or all NPX-NPY pair genes on Xp-clustered with the 45,X samples, while the 46,X,i(Xq-dist) sample-with three copies of most NPX-NPY pairs-clustered with the 46,XX samples (one might expect this sample to cluster with 47,XXX samples, but, due to mosaicism for a 45,X cell line [STAR Methods], expression averages to 46,XX levels; Figures 5E and 5F).Similarly, 46,X,t(X; Y)(ANOS1-ANOS2P) samples-with two copies of most NPX-NPY pairs-clustered with 46,XX samples, while 46,X,t(X; Y)(PRKX-PRKY) samples-with reduced copy number of only two NPX-NPY pairs: ZFX/ZFY and RPS4X/RPS4Y-clustered with 45,X samples (Figure 5G).In all, these results indicate that NPX-NPY pairs, not PAR1 genes, drive the shared autosomal response to X and Y copy number in LCLs and fibroblasts.Among NPX-NPY pairs, only one pair maps to Xp and has one less copy in 46,X,t(X; Y)(PRKX-PRKY) samples: ZFX and ZFY.

ZFX and ZFY activate a shared transcriptional program genome-wide
To determine whether ZFX and ZFY mediate the genome-wide response to X or Y copy number, we searched for DNA motifs in promoters of X-or Y-responsive genes (STAR Methods).Motifs matching previously characterized ZFX DNA-binding signatures were enriched in genes that were positively X-or Y-responsive but not among genes that were negatively X-or Y-responsive; the highest enrichment was among genes positively responsive to both X and Y in LCLs (Table S8).][31] Next, we assessed functional evidence that ZFX, and possibly ZFY, act at X-or Y-responsive genes.Reanalysis of ENCODE data revealed that ZFX protein binding is enriched at promoters of genes positively responsive to X and Y in LCLs and fibroblasts (Figure S16, Table S9, STAR Methods).Moreover, published target genes activated by ZFX in three cell lines (MCF-7 breast cancer cells, C4-2B prostate cancer cells, and human embryonic kidney [HEK 293T] cells) were significantly enriched among genes positively, but not negatively, responsive to X and/or Y dosage in LCLs and fibroblasts (Figure S17; Table S10, STAR Methods). 31,32This aligns well with studies showing that ZFX is a transcriptional activator [32][33][34][35] and strengthens previous evidence of ZFX motif enrichment in genes positively regulated by Chr X copy number identified using microarray data. 36,37Since positively Y-responsive genes were enriched for ZFX targets, we hypothesized that ZFX's gene-activating functions were retained by ZFY during Y chromosome evolution.
To directly assess the functions of ZFX and ZFY genome-wide, we knocked down ZFX in 46,XX fibroblasts (from three individuals) and ZFX and/or ZFY in 46,XY fibroblasts (from three individuals) using CRISPR interference (CRISPRi 38 ; Figure 6A, STAR Methods), followed by RNA-sequencing.To identify differentially  expressed genes, we modeled gene expression as a function of the CRISPRi target gene (ZFX and/or ZFY versus control; STAR Methods).These CRISPRi knockdowns were effective: in single knockdowns we observed an $80% reduction of ZFX and >90% reduction of ZFY levels (Figure 6B).The knockdowns were specific: ZFX expression was not affected by ZFY knockdown, and vice versa, which suggests that ZFX and ZFY are not strong transcriptional regulators of each other.In a separate experiment, double knockdowns were performed resulting in more modest but significant reductions of both ZFX ($40%) and ZFY ($55%).
The breadth of the genome-wide effects varied among the knockdowns: in 46,XX cells, 4,638 genes were differentially expressed upon ZFX knockdown, while in 46,XY cells, 943 genes were differentially expressed upon ZFX knockdown, 206 genes upon ZFY knockdown, and 1,698 genes upon ZFX/ZFY double knockdown; in most cases, gene expression levels changed less than 2-fold (Figures 6C-6F, Table S11).The knockdowns confirm that ZFX and ZFY are transcriptional activators: genes that decrease in expression with ZFX and/or ZFY knockdown in 46,XX and 46,XY cells were enriched for ZFX DNA-binding motifs and protein binding at their promoters, compared with genes that were not significantly affected; conversely, genes that increased in expression were depleted for ZFX binding (Figure S18, Table S12).
Several lines of evidence indicate that ZFX and ZFY act in a mutually and cumulatively dose-dependent fashion.First, the magnitudes of the genome-wide effects in 46,XX and 46,XY cells were inversely related to the cumulative level of ZFX+ZFY expression remaining after knockdown (Figure 6G).Second, although there were fewer genes that significantly responded to ZFY knockdown compared with ZFX knockdown in 46,XY cells, 135 (69% of) ZFY-responsive genes also responded to ZFX knockdown, and these responses were nearly perfectly correlated, with larger responses to ZFX than to ZFY knockdown (p = 7.2 3 10 À8 , paired t test; Figures 6H and 6I).Third, even among genes that were significantly responsive to either ZFX or ZFY knockdown (ZFX ''only'' or ZFY ''only''), gene-by-gene responses to ZFX and ZFY knockdowns were highly correlated (Pearson correlations for ZFX ''only'': 0.87; ZFY ''only'': 0.92; Figure S19).Finally, all Xa-expressed genes that significantly responded to ZFY knockdown also responded to ZFX knockdown in 46,XX and 46,XY cells; even the Xa-expressed genes that were only significantly responsive to ZFX knockdown displayed correlated responses to ZFY knockdown (Figure S20).We conclude that ZFX and ZFY activate the same genes, with ZFX being more potent than ZFY, and that the total levels of ZFX+ZFY impact thousands of genes across the genome.
ZFX and ZFY are required for shared response to Chr X and Chr Y copy number Finally, we asked whether ZFX or ZFY activity is required for the genome-wide response to Chr X and Chr Y copy number.Of the 136 autosomal genes responsive to both Chr X and Y copy number in fibroblasts, 109 (81%) responded to ZFX and/or ZFY knockdown, a significant enrichment (Figure 7A).Among genes that responded significantly ''only'' to Chr X or ''only'' to Chr Y, 70% and 57%, respectively, were ZFX or ZFY responsive, again suggesting that ZFX and ZFY have similar genome-wide targets (Figure S21A).On a gene-by-gene basis, the effects of ZFX or ZFY knockdown were negatively correlated with the response to Chr X or Chr Y copy number, consistent with ZFX and ZFY's functions as transcriptional activators (Figures 7B and S21B).In all, the transcriptional changes upon ZFX or ZFY knockdown explained 19%-40% of the effects for autosomal genes responsive to either Chr X or Y copy number, and 46%-57% of the effects for autosomal genes responsive to both Chr X and Y copy number (Figures 7B and S21).
We obtained similar results for genes that are only expressed from Xa.Of the seven Xa-expressed genes significantly responsive to Chr X or Y number in fibroblasts, all were significantly responsive to ZFX and/or ZFY knockdown (Figure 7C).For these seven Xa-expressed genes, the effects of ZFX and/or ZFY knockdown explain 78%-92% of the effects of Chr X and Y copy number (Figure 7D).In sum, the actions of ZFX and ZFY explain most of the shared response to Chr X and Chr Y copy number on Xa as well as on autosomes.

DISCUSSION
Traditionally, the ''gene-poor'' Y and ''inactive'' X (Xi) chromosomes have not been considered prominent players in regulating the transcriptome of human somatic cells.Studies of these ''sex chromosomes'' have focused instead on their capacity to differentiate the reproductive cells and organs of males and females.Our study produced two surprising findings that challenge common understandings of these chromosomes.First, the Xi and Y chromosomes modulate expression of thousands of autosomal genes in somatic cells, as we demonstrated in transformed (lymphoblastoid) cells and primary cultures (skin fibroblasts).Second, the genome-wide impact of these two chromosomes is strikingly similar: many autosomal genes respond to Xi and Y by changing expression in the same direction and with correlated effect sizes.These shared effects do not reflect a generic aneuploidy response, hormonal responses, or a response to a heterochromatin sink, and they are not driven by pseudoautosomal genes.Instead, a pair of NPX-and NPY-encoded transcriptional activators, ZFX and ZFY, account for about half of the shared effects.
Twenty-one percent of autosomal genes expressed in LCLs or fibroblasts respond to Chr X or Y copy number We began this study by quantitatively modeling autosomal gene expression in cells of two types cultured from 176 individuals (D) Schematic showing recombination between X and Y Chrs at PRKX and PRKY, or at ANOS1 and ANOS2P, resulting in X-Y translocation products.(E and F) Principal-component analysis (PCA) of autosomal X-and Y-responsive genes in LCL (E) or fibroblast (F) samples with one to three copies of Chr X or 46,X,i(Xq) structural variants.(G) PCA of autosomal X-and Y-responsive genes in LCL samples with one to five total sex chromosomes or with X-Y translocated chromosomes.Ellipses represent 95% confidence intervals around the centroid of each karyotype group with at least three samples.with a wide range of sex chromosome constitutions: one to four X chromosomes (zero to three Xi chromosomes), and zero to four Y chromosomes (e.g., 45,X to 49,XXXXY and 49,XYYYY).[39][40][41][42][43][44] Our linear model provided the power required to detect and quantify increases or decreases in expression of individual autosomal genes as a function of Chr X copy number, and as a function of Chr Y copy number.Eighteen percent of LCL-expressed autosomal genes were significantly responsive to Chr X copy number, and 6% of such genes were responsive to Chr Y copy number (Figure 1).In fibroblasts, 5% of expressed autosomal genes were significantly X-responsive, and 2% were Y-responsive.Merging these observations, we find that, of 13,126 autosomal genes expressed in LCLs and/or fibroblasts, 2,722 (20.7%) are significantly responsive to Chr X and/or Chr Y copy number in one or both cell types (Table S13).Additionally, of 335 Xa-expressed genes in LCLs and/or fibroblasts, 88 (26.3%) are significantly responsive to Chr X and/or Chr Y copy number in one or both cell types (Table S13).As a group, these Xi-or Y-responsive genes-2,722 autosomal genes and 88 Xa-expressed genes-are widely dispersed across functional cate-gories, with only a weak enrichment for functional categories related to metabolism (Figure S22).Importantly, we observed these effects of Xi and Y copy number even when minimizing the differentiating influence of sex steroid hormones by culturing cells under identical conditions.Moreover, we found that the effects of Xi copy number on autosomal gene expression are correlated in cells derived from anatomically male and female individuals (Figures S3E  and S3F).We conclude that the Xi effects on autosomal gene expression observed here are independent of the presence of the Y chromosome or the gonadal sex of the individual donor.

Shared effects on the transcriptome reflect the common autosomal ancestry of X and Y chromosomes
The Y and X chromosomes play starkly different roles in the differentiation and function of the human reproductive tract. 45In cultured somatic cells, however, we observed that the Y and Xi chromosomes have broadly similar and highly correlated effects on the transcriptome of all other chromosomes, including the Xa chromosome found in somatic cells of all individuals, regardless of their biological sex.
We had not anticipated these similar effects of Xi and Y on the global transcriptome, but they can be explained in light of the evolution of the human X and Y chromosomes from a pair of autosomes over the past 200 million years. 1,2The human X and Y chromosomes originated not as ''sex chromosomes'' but as ordinary autosomes, identical in sequence and, in all likelihood, largely identical in their functions in male and female somatic cells.The genes of these ancestral autosomes included global mediators and regulators of gene expression unrelated to sex differentiation-mediators and regulators that survived the differentiation of the X and Y chromosomes and remain expressed today on both Y and Xi. 10,46ndeed, as shown here, one of these global regulators-surviving today as ZFX on the human X chromosome and its homolog ZFY on the human Y chromosome-accounts for about half of the shared effect of Y and Xi on global gene expression.Combining the results reported here with prior discoveries, we propose the following mechanistic model: The single Xa chromosome found in all somatic cells (regardless of sex) contributes a relatively invariant level of ZFX protein; each Xi chromosome in the cell adds smaller amount of ZFX protein, and each Y chromosome adds a similar quantity of ZFY protein.In trans, ZFX and ZFY proteins act additively, and function interchangeably, in modulating transcript levels for thousands of autosomal genes (and Xa genes).
The details and generalizability of this model should be tested in diverse somatic cell types.However, our findings are sufficient to state that the shared autosomal ancestry of the Xi and Y chromosomes continues to shape the functional roles of both chromosomes.Recognizing that ZFX (and ZFY) account for about half of the Xi and Y chromosomes' effects on the autosomal transcriptome in LCLs and skin fibroblasts, we encourage investigators to explore the possibility that other ancestral NPX-NPY gene pairs contribute mechanistically to sex-shared functions of the Xi and Y chromosomes.
The challenge: Distinguishing between the shared and differentiating roles of Xi and Y in human somatic cells Our findings demonstrate that human Xi and Y chromosomes perform shared roles in modulating the autosomal (and Xa) transcriptome in somatic cells.39,40 More research is required to determine the degree to which these transcriptomic differences represent sex differences in cellular composition, sex steroid hormones, or cell-autonomous roles of the Xi and Y chromosomes.The task of searching for and molecularly substantiating sex-differentiating, cell-autonomous roles of individual Xi-or Y-expressed genes in somatic tissues, at the molecular level, will be complicated by the breadth and scale of the sexshared, cell-autonomous roles described here.

Limitations of the study
We found a genome-wide response to Chr X and Y copy number in both LCLs and fibroblasts, but most of the autosomal genes that responded were X-and/or Y-responsive in only one of the two cell types.Thus, these autosomal ''responders'' will likely differ based on the epigenetic context of the cell or tissue type of interest, and this should be taken into consideration when comparing the gene lists here with metrics derived in other tissues.All experiments in this paper have been performed in cultured cells and should be validated in vivo.The genes directly targeted by androgen or estrogen receptors may differ from one cell type to another, so the absence of an overlap with X-and Y-responsive genes could reflect different targets in cancer cell lines versus LCLs and fibroblasts.The heterochromatin sink hypothesis was tested by studying naturally occurring variation in the Y chromosome; this hypothesis was not directly tested for the X chromosome, where we could not genetically or experimentally separate the possible effects of Xi heterochromatin from the effects of Xi gene expression.

Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, David C. Page (dcpage@wi.mit.edu).

Materials availability
Cell lines generated in this study are available upon request to the lead contact.
Data and code availability d Raw, de-identified RNA-sequencing data from human cell cultures has been deposited to dbGaP under accession number phs002481.v3.p1, and processed data has been deposited at Zenodo under accession number https://doi.org/10.5281/zenodo.10042162.

Human subjects
As part of an IRB-approved study at the NIH Clinical Center (12-HG-0181) and Whitehead Institute/MIT (Protocol # 1706013503), we recruited human subjects through the NIH Clinical Center, the Focus Foundation, and Mass General Hospital.We included individuals with a previous karyotype showing non-mosaic sex chromosome aneuploidy or a structural variant of the sex chromosomes, as well as euploid ''healthy volunteers''.We obtained informed consent from all study participants.For derivation of cell cultures and analysis, we collected blood samples and/or skin biopsies and shipped them to the Page lab.We performed karyotyping of peripheral blood and fibroblast cell cultures at the National Human Genome Research Institute Cytogenetics and Microscopy Core.We obtained additional cell lines from the Colorado Children's Hospital and Coriell Research Institute, and cultured them for at least two passages prior to RNA-sequencing.The analysis presented represents the combined analysis of samples published for the first time here, and a set of previously-published samples collected during the course of the same study. 15We provide information about all samples analyzed in this study in Table S1.

Cell culture
We derived and cultured lymphoblastoid cell lines (LCLs) and skin fibroblasts as previously described. 15Briefly, we collected blood in BD Vacutainer ACT blood collection tubes and placed two 4mm skin punch biopsies from the upper arm in transport media (DMEM/F12, 20% FBS, and 100 IU/mL Penicillin-Streptomycin [all ThermoFisher]) and shipped these to the Page lab 1-3 days after collection.We subjected blood to density gradient centrifugation to obtain lymphocytes, which we then transformed by culturing in complete RPMI (RPMI 1640 [Gibco], 25mM HEPES, 15% FBS [Hyclone], 1.25ug/ml Fungizone [Gibco], 3.33ug/ml Gentamycin [Gibco], 100 IU/mL Penicillin-Streptomycin [Lonza], pH 7.2) supplemented with EBV and 66.6 mg/ml cyclosporine until we observed cell clumping and proliferation.We minced skin biopsies and placed them on gelatin-coated plates with growth media (High Glucose DMEM [Gibco], 20% FBS [Hyclone], 2mM L-Glutamine [MP Biomedicals], 1X MEM Non-Essential Amino Acids [Gibco], 100 IU/mL Penicillin-Streptomycin [Lonza]), leaving undisturbed until fibroblasts grew out of the biopsies and could be passaged.For each sample, we collected and froze one million cells in RNAprotect Cell Reagent (Qiagen) at À80 C until RNA extraction.We confirmed that cell cultures were negative for mycoplasma contamination periodically using either the MycoAlert Kit (Lonza) or PCR, as described previously. 15A extraction and RNA-seq library preparation We extracted RNA using the RNeasy Protect Cell Mini Kit (Qiagen), per the manufacturer's instructions, using the following modifications: we added 10 mL b-mercaptoethanol per mL of buffer RLT for cell lysis, and 10 mL of a 1:100 dilution of ERCC RNA Spike-In Mix (Invitrogen) per million cells.We homogenized lysates using QIAshredder columns (Qiagen), removed genomic DNA using gDNA eliminator columns, performed all optional spin steps, and eluted RNA in 30 mL RNase-free water.We quantified RNA using a Qubit 2.0 or 4.0 fluorometer and the Qubit RNA HS Assay Kit (ThermoFisher).RNA was consistently high-quality with RNA integrity numbers (RIN) near 10 as measured on a Bioanalyzer 2100 instrument (Agilent).We prepared RNA sequencing libraries using the TruSeq RNA Library Preparation Kit v2 (Illumina) with modifications as detailed in Naqvi et al. 13 or using the KAPA mRNA HyperPrep Kit (Roche).In both cases, we size-selected libraries using the PippinHT system (Sage Science) and 2% agarose gels with a capture window of 300-600bp.We performed paired-end 100x100 bp sequencing on a HiSeq 2500 or NovaSeq 6000 (Illumina).We indicate the library preparation kit and sequencing platform for each sample in Table S1.

RNA-seq data processing
We performed all analyses using human genome build hg38, and a custom version of the comprehensive GENCODE v24 transcriptome annotation as in Godfrey et al. 47 This annotation represents the union of the ''GENCODE Basic'' annotation and transcripts recognized by the Consensus Coding Sequence project. 58To analyze samples in which ERCC spike-ins were added, we merged our custom transcript annotation with the ERCC Control annotation.
To pseudoalign reads to the transcriptome annotation and estimate expression levels of each transcript, we used kallisto software (v0.42.5). 52We included the -bias flag to correct for sequence bias.We imported the resulting count data (abundance.tsvfile) into R for analysis in DESeq2 (v1.26.0) 16 using the tximport package (v1.14.0). 59Although we pseudoaligned and performed DESeq2 on the entire custom transcriptome annotation, we restricted our downstream analysis to protein-coding and long non-coding RNA genes, as annotated in ensembl v107.We filtered the final analyses for genes with median expression level in XX or XY samples of at least 1 transcript per million (TPM) in a given cell type.
Linear modeling to identify autosomal X-or Y-responsive genes We performed linear modeling in DESeq2 with three covariates: Chr X copy number, Chr Y copy number, and library preparation batch.The ''log 2 fold change'' values output from DESeq2 in these analyses refer to the log 2 fold change in expression per copy of Chr X or Y.We considered genes with P adj < 0.05 as significantly Chr X-responsive or Chr Y-responsive.
To control for the possibility that bulk differences in sex chromosome gene expression may lead to global effects during normalization, we performed an identical analysis, removing Chr X and Y genes prior to DESeq2 normalization and linear modeling.In this analysis, we refit the same model on the renormalized data.This had little effect on the overall results.
We compared significantly X-responsive or Y-responsive genes between LCLs and fibroblasts by first restricting our analysis to autosomal genes expressed in both LCLs and fibroblasts, and then intersecting the significant gene lists.To compute p values of the overlaps, we used hypergeometric tests, with p < 0.05 considered significant.We calculated Pearson correlations between log 2 fold change per X or Y Chr in LCLs and fibroblasts on the overlapping sets of genes.

Saturation analysis of autosomal X-and Y-responsive genes
For LCLs and fibroblasts separately, we randomly sampled without replacement size-n subsets of available RNA-seq libraries.We ran DESeq2 on these subsets as in the full analysis, and recorded the number of significantly Chr X-and Y-responsive expressed autosomal genes (P adj < 0.05).We performed 100 down-samplings for each sample size, n.
Comparing X-or Y-responsive genes to estrogen or androgen receptor target genes We defined direct target genes of the estrogen and androgen receptors as genes that change in expression and are bound by estrogen receptor alpha (ERa) or androgen receptor (AR) within 30 kb of the transcription start site upon addition of 17-b estradiol (E2) or dihydrotestosterone (DHT), respectively.For ERa target genes, we reanalyzed RNA-seq and ChIP-seq data in MCF7 breast cancer cells from Guan et al., 17 and for AR target genes we reanalyzed RNA-seq and ChIP-seq data in LNCaP prostate cancer cells from Cato et al. 18 For RNA-seq data, we used the same processing pipeline as above-pseudoalignment and quantification of transcript abundance with kallisto and differential expression analysis (treatment versus vehicle) in DESeq2-with genes with P adj < 0.05 considered significantly responsive to E2 or DHT.
For AR and ER ChIP-seq data, we aligned the reads to the human genome (hg38) using bowtie2 (v2.3.4.1) 53 with one mismatch allowed (-N 1) and called peaks using the MACS2 (v2.2.7.1) 54 callpeak function with default parameters.We called peaks in hormone treated data and vehicle control, using the matched input samples as the background control.We then selected peaks exclusively found in the treated condition compared to the vehicle control.We associated peaks with genes in our custom GENCODE v24 transcriptome annotation using the bedtools (v2.26.0) closest function with default parameters.We considered genes with a peak within 30 kb of their transcription start site ''bound'' by a given factor.

Functional annotation
We performed functional gene category analysis on autosomal genes significantly responsive to Chr X or Y copy number using the Molecular Signatures Database (MSigDB) Gene Ontology Biological Process categories, version 2022.1.Hs. 48,49To test for enrichment of gene categories, we performed hypergeometric tests using all expressed autosomal genes in each cell type as the background set.We corrected the resulting p values for multiple hypothesis testing using the Benjamini-Hochberg method, with categories below the FDR threshold of 0.05 considered significantly enriched.

Chromosome 21 copy number analysis
To assess and compare the response to aneuploidy for an autosome with our sex chromosome aneuploidy model, we obtained six lymphoblastoid cell lines with Trisomy 21 from the Coriell Institute for Medical Research (AG09802, AG10316, AG10317, AG17485, AG13455, and AG16945; see Table S1 for more details).We cultured these LCLs in the laboratory for two passages and performed RNA-sequencing in parallel with the sex chromosome aneuploidy lines.We processed raw RNA-sequencing data as above.We used data from 46,XX, 46,XY, 47,XX+21, and 47,XY+21 to evaluate the effects of three versus two copies of Chr 21.We performed this analysis in DESeq2 using three covariates in a linear model: sex chromosome constitution (XX vs. XY), Chr 21 copy number (two vs. three), and library preparation batch.We previously analyzed effects of Chr 21 copy number on the expression of genes encoded on Chr 21, finding that $3/4 significantly increased in expression. 15Here, we compare the effects of Chr 21 copy number on the other autosomes to those of Chr X or Chr Y copy number.

Meta-analysis of published Xi expression datasets to define ''Xa-expressed'' genes
][21] The first dataset (Additional file 7 in Cotton et al.) 19 was derived from paired genomic and cDNA SNP-chips in skewed LCL and fibroblast cell cultures.We used the provided mean AR values for 416 genes informative in at least 5 samples, their standard deviations, and the number of informative samples, to test whether the AR values were significantly greater than zero with one-sample, one-sided t-tests.We corrected the resulting p values for multiple comparisons with the p.adjust function in R using the Benjamini-Hochberg method.Genes with adjusted p values (Padj) below 0.05 were considered expressed from Xi (''escape'').
The second dataset was derived from bulk or single-cell RNA-seq of LCLs (Tables S5 and S8 from Tukiainen et al.). 8The bulk RNAseq was from one individual in the GTEx dataset with 100% skewed X inactivation across the body, and the single-cell RNA-seq in LCLs was from three individuals (we excluded data from one dendritic cell sample).We included 82 genes with data from at least two individuals in the single-cell dataset, or one individual in the single-cell dataset and data from the bulk RNA-seq dataset.Using the ratio of read counts from the more lowly and highly expressed alleles in each sample, we calculated AR values and used the provided adjusted p values to identify genes with significant Xi expression (Padj <0.05).We called a gene as ''escape'' if one or both of the datasets showed evidence of Xi expression.
The third dataset (Dataset 3 from Garieri et al.) 20 was derived from single-cell allelic expression in fibroblasts. 20We included 203 genes that had data from at least two of the five samples in the dataset.We converted the reported values (Xa reads/total reads) to AR values using the following formula: AR = 1 Xa reads total reads À 1.We used the provided AR threshold to consider a gene significantly expressed from Xi in each sample (AR>0.0526).We called a gene as ''escape'' if it had at least one sample with significant expression from Xi.
The fourth dataset (Tables S4 and S5 from Sauteraud et al.) 21 was derived from allele-specific bulk RNA-seq performed on 136 LCLs with skewed X inactivation.We analyzed 215 genes that had data from at least 10 samples.We calculated an AR value for each gene in each sample using the read counts from the more lowly and highly expressed alleles in each sample, adjusting for the level of skewing in each sample.To identify genes that were significantly expressed from Xi across samples, we performed paired, two-sample, one-sided t-tests, asking whether the raw (pre-adjusted for skewing) AR values were greater than the baseline AR given the level of skewing in each sample (baseline AR = 1 À skewing coefficient skewing coefficient ); we corrected the resulting p values for multiple comparisons with the p.adjust function in R using the Benjamini-Hochberg method.We considered genes with Padj <0.01 as significantly expressed from Xi (''escape'').
The fifth dataset (Table S6 from San Roman et al.) 15 was derived from our prior analysis of allelic ratios in bulk RNA-sequencing of skewed LCLs and fibroblasts with two copies of the X chromosome.We calculated AR values for 152 genes in LCLs and 120 in fibroblasts.We used the Benjamini-Hochberg adjusted p values from one-sided, one-sample t-tests to determine whether AR values were significantly greater than zero, considering genes with Padj <0.05 as expressed from Xi (''escape'').
Next, we synthesized the calls from these five datasets.We assigned a gene as having evidence for expression from Xi (''escape'') if 1) most (>50% of) studies indicated ''escape'' or 2) 50% or fewer (but more than 0) studies indicated ''escape'' and either i) there was more than one study with evidence of escape or ii) the average AR across all studies was R0.1.We assigned genes as not expressed from Xi (''subject'' to X chromosome inactivation) if: 1) all studies indicated ''subject'' or 2) most (>50%) studies indicated ''subject'' and the average AR across all studies was <0.1.For genes with no data from any of the five studies, we investigated whether there was evidence of expression from Xi using human-rodent hybrid cell lines carrying a human Xi from Carrel et al. 7 as compiled in Balaton et al. 22 If a gene was expressed in at least 22% of Xi hybrid cell lines (per Carrel et al.), we considered the gene to be expressed from Xi (''escape'').Our final annotations of ''Xa-expressed'' genes are listed in Table S4.In total, we found 107 genes with prior evidence of expression from Xi (''escape''), 460 genes without prior evidence of expression from Xi (''subject''-considered here as ''Xaexpressed''), and 398 genes without data to make a call (''no call'').

Analysis of Yq heterochromatin length
To assess Yq heterochromatin length, we analyzed whole genome sequencing data of 1,225 males from the 1000 Genomes Project. 26We aligned FASTQ files with paired-end reads to the human reference genome (hg38) using bowtie2. 53From each male in the sample, we calculated the average depth of coverage of the DYZ1 region.In addition, we calculated the average depth of coverage of a 1-megabase single-copy region of the Y chromosome (from bases 14,500,000-15,500,000).We adjusted read depths to account for GC bias in sequencing depth. 60,61In cases of individuals with multiple sequenced samples, we summed the read depths following GC-bias correction.For each male, we normalized mean DYZ1 read depth by mean single-copy region read depth to account for differences in sequencing depth between different sequencing runs.After these steps, a higher depth of coverage implies higher DYZ1 copy number and an increase in the size of the Yq heterochromatic region.
To analyze gene expression as a function of DYZ1 read depth, we obtained RNA-seq data from The Geuvadis Consortium and restricted our analysis to samples marked as male, ''UseThisDuplicate = = 1'', RNA integrity number (RIN) R 8, and with estimates for DYZ1 read depth.We excluded the following outlier or incorrectly annotated samples as flagged in t'Hoen et al. 62 : NA12546.1,HG00329.5, NA18861.4,NA19144.4,NA19225.6,HG00237.4,NA12399.7, and NA07000.1, resulting in a total of 194 male samples.We downloaded the raw RNA-seq data and pseudoaligned the reads with kallisto as above.Using DESeq2, we modeled the log 2 read counts as a function of log 2 DYZ1 depth of coverage, controlling for population and the lab that sequenced the libraries.

Analysis of gene expression in individuals with structural variants of the sex chromosomes
Naturally occurring structural variants of the sex chromosomes allowed us to investigate the consequences of deleting or duplicating specific genes on Chr X or Chr Y.The Page lab collected these samples over several decades during the course of studying infertility, disorders of sex development, and sex chromosome aneuploidy.We cultured these cells and performed RNA-sequencing in parallel with the sex chromosome aneuploidy cell lines.

Y isochromosomes
The Page lab previously identified three unrelated individuals with Y isochromosomes lacking the Yq heterochromatic region among males with spermatogenic failure. 63,64One of these isochromosomes resulted from recombination between the IR4 repeats on Yp and Yq (WHT5557, referred to as a ''pseudoisochromosome'') and two resulted from recombination more distally, within the P1 palindrome.Both recombination events resulted in the duplication of protein-coding NPY genes and all PAR1 genes expressed in somatic cell types.For the 442 X-and Y-responsive autosomal genes in LCLs, we used PCA plots to visualize the clustering of the Y isochromosome samples within the Chr Y chromosome copy number series, controlling for a single X chromosome (0Y: 45,X; 1Y: 46,XY; 2Y: 47,XYY; 4Y: 49,XYYYY).For the PCA plots, we performed a variance stabilizing transformation on normalized counts from DESeq2 and used the removeBatchEffect function in the limma R package (v3.42.2) 65 to remove any effects of library preparation batch.Ellipses representing the 95% confidence intervals around the centroid of each group (with at least 3 samples) were computed using ClustVis software. 57

X isochromosomes
Through karyotyping of individuals with Turner syndrome, we identified four unrelated individuals with X isochromosomes in which the entirety of Xq-and in some cases proximal portions of Xp-is duplicated.We mapped the breakpoints of these isochromosomes by fluorescent in situ hybridization (FISH) using bacterial artificial chromosomes (BACs) with known locations across the X chromosome as DNA probes (Figure S11; see below for FISH methods).In two individuals, WHT7183 and WHT7399, the breakpoints mapped to the centromere.In these two cases, probe 343B12, proximal to all genes on Xp, was absent on the isochromosome, and probe 168B01, proximal to all genes on Xq, was present in two copies.A third individual, WHT7182, had a breakpoint on Xp $5 Mb from the centromere.This breakpoint mapped within the 206-kb region (chrX:52,607,059-52,813,686) covered by FISH probe 34G18.This region contains a palindrome (P5), the likely site of homologous recombination.A fourth individual, WHT7207, had a breakpoint on distal Xp near PAR1.We mapped the breakpoint to a 361-kb region (chrX:9,438,772-9,842,745) from TBL1X to SHROOM2.TBL1X is expressed at a level consistent with it being present on this individual's isochromosome, indicating that the breakpoint is proximal to TBL1X.
In analyzing gene expression from the X isochromosome samples, we discovered that the isochromosome was not always stable in the cell cultures.Samples with recombination in the centromere had stable karyotypes in both LCLs and fibroblasts.However, for the two samples with more distal Xp breakpoints, LCLs mostly or completely lost the isochromosome and thus could not be used in this analysis.Fibroblasts retained the isochromosome in enough cells that we could analyze their gene expression (WHT7182: 47% 46,X,i(Xq) and 53% 45,X, 49 interphase nuclei; WHT7207: 64% 46,X,i(Xq) and 36% 45,X, 100 interphase nuclei), however the resulting expression levels should be considered an average between these populations.For the X-and Y-responsive autosomal genes in LCLs or fibroblasts, we used PCA plots to visualize the clustering of the X isochromosome samples with the Chr X copy number series, without Y chromosomes (1X: 45,X; 2X: 46,XX; 3X: 47,XXX; 4X: 48,XXXX).

X-Y translocations
Among individuals with 46,XY disorders of sexual development or Turner syndrome, we identified three females (WHT1297, WHT1489, WHT1615) with X-Y translocations between the PRKX and PRKY genes [46,X,t(X; Y)(PRKX-PRKY)].Recombination between PRKX and PRKY is common among 46,XY females and is facilitated by a polymorphic Yp inversion. 25,66We used a PCR-based screening approach to determine and sequence the breakpoints (Table S7).These females are missing five NPX-NPY genes: SRY, RPS4Y1, ZFY, AMELY, and TBL1Y.Samples WHT1297 and WHT1489 are missing PRKX but retain PRKY, while sample WHT1615 is missing PRKY but retains PRKX.
Also among individuals with 46,XY disorders of sexual development, we ascertained four females (WHT1785, WHT1798, WHT1799, WHT2315) with X-Y translocations between the ANOS1 (KAL1) and ANOS2P (KALP) genes [46,X,t(X; Y)(ANOS1-ANOS2P)].We identified breakpoints using a PCR-based screening approach (Table S7).In three of the cases (WHT1785, WHT1798, and WHT1799, ANOS1 is retained, while ANOS2P is deleted.In WHT2315, the recombination is within ANOS1 and ANOS2P, resulting in the partial deletion of each.Based on RNA-seq data of the translocated samples, some or all of the of the genes on Yq are not expressed.This is likely due to spreading of X chromosome inactivation onto the translocated portion of the Y.In WHT1799, no Y genes are expressed; in WHT1785 and WHT1798 only EIF1AY is expressed; while in WHT2315 TXLNGY and KDM5D are expressed at low levels and EIF1AY is expressed (Table S7).

Fluorescent in situ hybridization
To assess copy number and variations in X and Y chromosome structure, we performed DNA fluorescent in situ hybridization using BACs with known locations as DNA probes, as previously described. 63We labeled FISH probes by nick translation with biotin-dUTP (Sigma) and Cy3-dUTP (Amersham).To detect Biotin-labelled probes, we used Avidin-FITC probes (Life Technologies, 1:250 dilution), and counterstained DNA with DAPI.We captured images on a DeltaVision deconvolution microscope.We provide information on probes for FISH experiments in Table S7.For copy number analysis, we performed FISH on interphase cells using Whole X and Whole Y probes (Applied Spectral Imaging).

Motif analysis
We analyzed DNA motif enrichment using Homer software v4.11.1. 56The findMotifs function identified de novo motifs enriched in promoters of a gene set of interest (e.g., Chr X-responsive autosomal genes in LCLs) relative to expressed genes as a ''background'' set (e.g., autosomal genes expressed in LCLs).We defined promoters as +/À 1 kb from the transcription start site.We used the ''-fdr'' argument with 100 randomizations to calculate an empirical FDR for the motif enrichments.We then compared the enriched motifs to motif databases to identify the best match among known DNA-binding proteins.

Analysis of ZFX ChIP-seq data
To detect ZFX binding at gene promoters, we analyzed four ZFX ChIP-seq experiments in different cell lines from ENCODE4 v1.6.1 (HCT 116, C4-2B, MCF-7, and HEK-293T).Using the BEDTools closest function, we mapped ZFX peaks to the nearest annotated transcription start site in the GENCODE v24 comprehensive annotation. 55For gene expression analysis in each cell line, we processed fastq files from ENCODE or GEO 31,32 using kallisto as above or, when available, we downloaded pre-processed RNA-seq data.We restricted our analysis of ZFX peaks in each cell line to genes with median TPMR1.
To evaluate ZFX binding in promoters of genes, we investigated whether there was a ChIP-seq peak 1 kb up-or downstream of the transcription start site.To get a sense for the most robust binding sites across cell types, we calculated the proportion of cell lines bound at the promoter of each gene by dividing the number of cell lines with binding by the number of cell lines in which that gene is expressed.

Identification of ZFX direct target genes from published data
To determine the direct targets of ZFX, we combined ZFX ChIP-seq results (analyzed as described above) and gene expression analysis of ZFX knockdown or knockout from Rhie et al. and Ni et al. in three cell lines: MCF7, C4-2B, and HEK293T. 31,32Direct target genes in each cell line were defined as genes where ZFX bound within 1 kb of the transcription start site and that significantly change in expression with ZFX knockdown or knockout.We obtained the raw RNA-sequencing data from GEO and processed the RNA-seq data as above: pseudoaligning and quantifying transcript abundances with kallisto and performing differential expression analysis (ZFX knockdown or knockout versus control) using DESeq2.We considered genes with Padj<0.05 to be significantly ZFXresponsive.

CRISPR interference
To knock down endogenous ZFX or ZFY levels in 46,XX and 46,XY fibroblasts, we used CRISPR interference to target a nucleasedead Cas9 fused with a repressive KRAB domain (dCas9-KRAB) to the ZFX or ZFY promoter and thereby repress ZFX or ZFY transcription.
To clone guides into the sgOpti vector, we digested using FastDigest BsmBI (ThermoFisher) and dephosphorylated the ends with FastAP (ThermoFisher) for 30 min at 37 C. We gel-purified the digested plasmid using the QIAquick Gel Extraction Kit (Qiagen).Prior to ligation, we phosphorylated and annealed each pair of oligos using T4 Polynucleotide kinase (New England BioLabs).We then ligated each insert into the backbone using Quick Ligase (New England BioLabs) for 10 min at room temperature, and transformed into NEB Stable Cells for amplification (New England BioLabs).We confirmed gRNA sequences by sequencing.
LentiX-293T cells (Takara) were cultured in DMEM with 2mM L-Glutamine, 100U/mL Penicillin-Streptomycin, and 10% FBS on plates coated with collagen (5 mg/cm 2 ) to produce virus.For gRNA constructs, we plated 5x10 6 LentiX-293T cells in one 10 cm plate the night before transfection.The next day, we co-transfected 4.5 mg of sgOpti, 4 mg of pCMV-VSV-G, and 6.5 mg of psPAX2 using TransIT-LT1 reagent (Mirus).We harvested virus-containing media once at 48 h and once at 72 h, pooled, and tested for successful viral production using Lenti-X GoStiX (Takara).We concentrated gRNA virus 10X using Lenti-X concentrator (Takara).For the dCas9-KRAB, we co-transfected five plates of LentiX-293T cells with 7 mg of pLX_311, 3 mg of pCMV-VSV-G, and 5 mg of psPAX2.We added Viral Boost reagent (Takara) 24 h post transfection, and again with replacement media after harvesting at 48 h.We pooled virus-containing media across all plates and concentrated 100X.

Generation of dCas9-KRAB fibroblast cultures
For CRISPRi we transduced 46,XX and 46,XY fibroblasts from three individuals each.We performed transductions using the ''spinfection'' method: we added 1 mL fibroblast media with 5-20 mL of 100X concentrated pLX_311 lentiviral vector and 8 mg/mL polybrene (hexamethrine bromide; Sigma) to 7.5x10 4 cells per well in 12-well plates or 1.875x10 5 cells per well in 6-well plates, spun at 1000 rpm for 1 h and incubated overnight.24 h post infection, we trypsinized each and expanded to 6-well or 10-cm plates with fibroblast media containing 5 mg/mL blasticidin (Life Technologies), which we maintained throughout the experiment for continuous selection.

ZFX and ZFY knockdown and analysis
We transduced control (intragenic) and ZFX-or ZFY-targeting gRNAs into the stably-expressing dCas9-KRAB fibroblasts as above, and selected cells using 2 mg/mL puromycin (Sigma) beginning 24 h post infection.We collected cells 72 h post infection.We isolated RNA and prepared libraries for RNA-seq as described above and sequenced using 150 bp paired-end reads on a NovaSeq 6000.We trimmed reads to 100 bp, pseudoaligned with kallisto, and imported the count data into R using tximport, as described above.We used DESeq2 to perform differential expression analysis including sgRNA target (ZFX, ZFY, or intergenic control) and cell line as covariates.

QUANTIFICATION AND STATISTICAL ANALYSIS
We used various statistical tests to calculate p values as indicated in the STAR Methods section, figure legend, or text, where appropriate.To calculate all statistics and generate plots, we used R software, version 3.6.3. 70We considered individual results statistically significant when p < 0.05 or, when using multiple hypothesis correction, P adj < 0.05 or FDR<0.05.In cases where conclusions were based on multiple statistical analyses, we corrected the p values using the Bonferroni method.The minimum reportable number in R software is 2.2eÀ308, so p values close to zero were reported as less than this value.Unless otherwise stated, we plotted data summaries as median and interquartile range.We used Deming regressions to compare variables that are measured with error (e.g., log 2 fold change per chrX vs. log 2 fold change per chrY).We calculated Deming regressions using the R package ''deming'' v1.4.For weighted Deming regressions, we included error values using the ''xstd'' and ''ystd'' arguments.
Chr Y 21% of expressed autosomal genes change with Xi or Y chromosome copy number mediate half of shared genome-wide response to Xi and Y chromosome copy number (B and E) Examples of individual autosomal genes that significantly respond to Chr X (B) or Chr Y (E) copy number in LCLs.Each point represents the expression level in an individual sample with the indicated number of X or Y chromosomes.The regression lines and confidence intervals, log 2 fold change per copy of Chr X or Y, and adjusted p values (P adj ) from linear regressions are indicated.(C, D, F, G) Log 2 fold change per copy of Chr X or Y for 11,034 expressed autosomal genes in LCLs and 12,002 expressed autosomal genes in fibroblasts, plotted by chromosomal location.Genes in gray do not significantly change in response to X or Y copy number, while genes in dark or light colors (odd-or evennumbered chromosomes, respectively) represent significantly Chr X-(C, D) or Chr Y-responsive (F, G) genes.(H) Venn diagrams showing the overlap of genes that significantly respond to X or Y copy number (P adj < 0.05) in LCLs and fibroblasts.Genes expressed in both cell types were included in the Venn diagrams, and genes with cell-type-specific expression are noted below.Bonferroni-adjusted p values, hypergeometric test.(I) Scatterplots of genes that are Chr X-or Y-responsive in both LCLs and fibroblasts show mostly similar log 2 fold change (log 2 FC) values across cell types.Black line and gray shading, weighted Deming regression and 95% confidence interval; blue dashed line, identity (X = Y) line.Pearson correlation coefficients and Bonferroni-adjusted p values are indicated.

Figure 2 .
Figure 2. Chr X and Y copy number have similar genome-wide effects (A and D) Venn diagrams of significantly (P adj < 0.05) X-and Y-responsive autosomal (A) or Xa-expressed (D) genes reveal significant overlap.Bonferroni-adjusted p values, hypergeometric test.(B and E) Scatterplots of log 2 fold change (log 2 FC) per Chr X vs. Chr Y for significantly Chr X-and Y-responsive autosomal (B) or Xa-expressed (E) genes show correlated responses.(C) Scatterplots of all expressed autosomal genes in LCLs and fibroblasts show globally correlated response to Chr X and Chr Y copy number.In (B), (C), and (E): black line and gray shading depict weighted Deming regression and 95% confidence interval; blue dashed line, identity (X = Y) line; Pearson correlation coefficients and Bonferroni-adjusted p values are indicated.

Figure 3 .
Figure 3.The shared response to Chr X and Y copy number is not a general response to aneuploidy (A) RNA-seq data from LCLs with two or three copies of Chr 21 were analyzed using linear regression.(B and C) Scatterplots and regression lines with confidence intervals of individual autosomal genes showing expression changes in samples with two or three copies of Chr 21.The log 2 fold change per Chr 21 and P adj from linear regressions are indicated.

(
D) The log 2 fold changes per Chr 21 across all chromosomes excluding (excl.)Chr 21 are plotted by chromosomal location.Genes in gray are expressed, but do not significantly change in response to Chr 21 copy number, while genes in dark or light pink (colored by every other chromosome for clarity) represent significantly Chr 21-responsive genes.(E and F) Scatterplot of all expressed autosomal genes, excluding Chr 21, comparing Chr 21 response to Chr X (E) or Chr Y (F) response.Blue dashed line, Chr 21 = Chr X (E) or Chr Y (F).(G) Violin plot with median and interquartile range of the absolute values of the log 2 fold changes for significantly Chr 21, X, or Y-responsive autosomal genes excluding Chr 21.Numbers of genes included in the plot: 1,030 for Chr 21, 1,968 for Chr X, and 653 for Chr Y. Bonferroni-adjusted p values, Wilcoxon rank-sum test.(H) Venn diagram of Chr X-and Y-and Chr 21-responsive genes shows little overlap (p values, hypergeometric test).(I) Scatterplot of autosomal genes (excluding Chr 21) regulated by Chr X and Y shows little correlation with responses to Chr 21.All scatterplots: black line and gray shading, Deming regression and 95% confidence interval; blue dashed line, identity (X = Y) line; Pearson correlation coefficients are indicated.

Figure 4 .
Figure 4. Heterochromatin sink model does not explain genes upregulated in response to Chr Y copy number (A) Histogram of DYZ1 depth of coverage for the 194 46,XY samples from the 1000 Genomes Project that have paired RNA-sequencing data.(B) Each point represents the magnitude and significance of expression change in response to DYZ1 depth of coverage for expressed autosomal genes in LCLs.Only three genes are below the significance cutoff of P adj = 0.05.(C) Schematic of the normal Y chromosome and two types of variant Y chromosomes that have recombined in repeated DNA regions to generate chromosomes missing the large heterochromatic region on the long arm and have two copies of many or all Chr Y genes expressed in somatic cells (purple horizontal lines).One Y pseudoisochromosome (piYp) resulted from recombination in an inverted repeat (IR4) on Yp and Yq, and a different Y isochromosome (iYp) resulted from recombination in a palindrome (P1) on Yq. (D) Principal-component analysis (PCA) plot showing separation of samples based on expression of the 442 autosomal genes that are X-and Y-responsive in LCLs.Ellipses represent 95% confidence intervals around the centroid of each karyotype group with at least three samples.The 46,X,i(Yp) and 46,X,pi(Yp) samples cluster away from 45,X samples and near those with two Y chromosomes (47,XYY).

Figure 5 .
Figure 5. NPX-NPY pair genes on Xp (the short arm), but not PAR1 genes, drive the shared genome-wide response to X and Y chromosome copy number (A) Anatomy of the sex chromosomes.Locations of FISH probes used in (B) are indicated.(B) DNA FISH on four 46,X,i(Xq) samples refined the sites of recombination.Fluorescent images of the normal X and iXq from the same cell are shown side-byside, with probes across the chromosome (green) and the X centromere (red).Full FISH results are found in Figure S11.(C) Schematic of a normal Chr X and three types of X isochromosomes.Locations of NPX-NPY pair genes indicated by dark orange lines.The sample IDs for individuals carrying each type of X isochromosome are provided.We obtained matching LCLs and fibroblasts for one WHT7183, while we only had LCLs from WHT7399 and only fibroblasts for WHT7182 and WHT7207.(legend continued on next page) Cell Genomics 4, 100462, January 10, 2024 9

Figure 6 .
Figure 6.ZFX and ZFY activate a common set of genes (A) Schematic of CRISPRi knockdown experiments.46,XX and 46,XY fibroblasts from three individuals were transduced with dCas9-KRAB and sgRNAs directed against a control intergenic region on Chr 2, the promoter of ZFX, or the promoter of ZFY to block transcription.(B) Violin plots with median and interquartile range showing relative expression of ZFX or ZFY in cells transduced with ZFX and/or ZFY sgRNAs, compared with control sgRNAs.Log 2 FC and P adj values from DESeq2 results.(C-F) Volcano plots showing effect size and significance of all expressed genes in ZFX and/or ZFY knockdowns vs. controls.Numbers of significantly differentially expressed genes are indicated.Note that the Y axis limits are not the same across these plots.(G) Stacked bar plots showing median and interquartile ranges (whiskers) of cumulative expression in transcripts per million (TPM) of ZFX and ZFY in CRISPRi experiments.Bonferroni-adjusted p values, Wilcoxon rank-sum test.(H) Venn diagram of significantly differentially expressed autosomal genes upon knockdown of ZFX in XX or XY cells or upon knockdown of ZFY in XY cells.Bonferroni-adjusted p values, hypergeometric test.(I) Scatterplot of significantly ZFX-and ZFY-responsive autosomal genes in 46,XY cells comparing effects of ZFX vs. ZFY knockdowns.Black line and gray shading, weighted Deming regression and 95% confidence interval; blue dashed line, identity (X = Y) line; Pearson correlation and p value are indicated.

Figure 7 .
Figure 7. ZFX and ZFY are required for shared Chr X and Chr Y-responsive transcriptional program (A and C) Venn diagram of X-and Y-responsive autosomal (A) or Xa-expressed (C) genes in fibroblasts with union of all ZFX-and ZFY-responsive genes across the four CRISPRi experiments.Bonferroni-adjusted p values, hypergeometric test.(B and D) Scatterplots comparing response to Chr X copy number and response to ZFX or ZFY knockdown for 136 autosomal genes (B) or seven Xa-expressed genes (D) that are significantly X-and Y-responsive in fibroblasts.Black line and gray shading, weighted Deming regression and 95% confidence interval; blue dashed line, identity (X = Y) line; coefficients of determination and Bonferroni-adjusted p values are indicated.
This paper analyzes existing, publicly available data.Accession numbers for these datasets are listed in the key resources table.d Original code has been deposited at Zenodo under accession number https://doi.org/10.5281/zenodo.10042162and is publicly available as of the date of publication.d Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request. d