Concentration dependent chromatin states induced by the bicoid morphogen gradient

In Drosophila, graded expression of the maternal transcription factor Bicoid (Bcd) provides positional information to activate target genes at different positions along the anterior-posterior axis. We have measured the genome-wide binding profile of Bcd using ChIP-seq in embryos expressing single, uniform levels of Bcd protein, and grouped Bcd-bound targets into four classes based on occupancy at different concentrations. By measuring the biochemical affinity of target enhancers in these classes in vitro and genome-wide chromatin accessibility by ATAC-seq, we found that the occupancy of target sequences by Bcd is not primarily determined by Bcd binding sites, but by chromatin context. Bcd drives an open chromatin state at a subset of its targets. Our data support a model where Bcd influences chromatin structure to gain access to concentration-sensitive targets at high concentrations, while concentration-insensitive targets are found in more accessible chromatin and are bound at low concentrations. This may be a common property of developmental transcription factors that must gain early access to their target enhancers while the chromatin state of the genome is being remodeled during large-scale transitions in the gene regulatory landscape.


INTRODUCTION 21
During embryonic development, multicellular organisms must generate the 22 patterned tissues of an adult organism from a single undifferentiated cell. This process 23 2 requires highly regulated control of gene expression, both in developmental time and at 24 reproducible positions in an embryo. These complex gene regulatory networks are 25 controlled by systems of transcription factors, which bind to DNA and control the 26 expression of genes required for development (Levine and Davidson, 2005). In early 27 Drosophila melanogaster embryos, the Bicoid transcription factor forms an anterior-to-28 posterior protein gradient the embryo (Driever and Nüsslein-Volhard, 1988b). Bcd 29 functions as transcriptional activator to pattern the embryo, binding to target gene 30 enhancers and activating gene expression at distinct positions along the AP axis, 31 corresponding to different concentrations of Bcd protein 32 1988a;; Struhl et al., 1989). This difference likely reflects the larger sample size used in our study, as well as our 255 method for classifying Bcd bound peaks. The Concentration-Sensitive III class did not 256 contain an enrichment of any site over the total peak set. The Concentration-Insensitive 257 class, however, showed a higher prevalence of the Zld binding site relative to the total 258 peak set than any other class. 259 These results indicate that, in contrast to a binding site affinity model for Bcd suggests that in vivo chromatin structure also plays a role in the sensitivity of a given 275 target to transcription factor concentration in the context of the developing embryo. 276 Taken together, these findings suggest that the chromatin context of an enhancer may 277 play a greater role in its overall affinity for a transcription factor in vivo than the 278 sequences of the binding sites that it contains. We therefore set out to test the 279 hypothesis that sensitivity classes are distinguished at the level of chromatin structure.  Table  S5) 294 show reduced accessibility in zld mutants, indicating that Bcd bound regions are more 295 likely to be dependent on Zld for their accessibility than the genome as a whole. 296 However, the Zld-dependent peaks are distributed across each sensitivity class 297 determined by ChIP, with no particular class being significantly more  This contrasts with the distribution of binding sites in the peak classes, which revealed 299 that the Concentration-Insensitive peaks were more likely to contain Zld binding sites. 300 These results suggest that while Zld contributes to the accessibility of a subset of Bcd 301 target gene enhancers, it is unlikely to determine the differential concentration sensitivity 302 of Bcd peaks as a whole. 303 Given the enrichment for both strong and weak Bcd binding sites in the 304 Concentration-Sensitive I and II classes, we next examined the impact of Bcd protein 305 itself on chromatin accessibility by ATAC seq ( Figure  3A). In bcd mutants, 326 (2.4%) of 306 the 13,266 open regions in wild-type embryos show significantly reduced accessibility 307 accompanied by increased nucleosome occupancy in those same regions ( Figure  3A  308 and B). These regions are therefore either directly or indirectly dependent on Bcd for 309 their accessibility. More strikingly, 132 (12.9%) of the 1,027 Bcd ChIP-seq peaks show 310 reduced accessibility in the absence of Bcd and likely represent regions where Bcd's 311 impact is direct. These regions dependent on Bcd for accessibility are significantly 312 enriched for peaks in the Concentration-Sensitive I and II classes (32.9% and 31.9% of 313 each class, with Fisher's exact test P-values of 4.29 x10 -12 and 1.37x10 -10 , respectively). 314 In contrast, the Concentration-Sensitive II and Concentration-Insensitive classes are 315 both significantly underrepresented (6.07% and 0.7% and P-values = 7.88 x10 -13 and 316 2.29 x10 -8 ) ( Figure  3B). This suggests that Bcd binding influences chromatin 317 accessibility preferentially at a subset of highly concentration-sensitive enhancers. 318   15    Because  the  Concentration-Sensitive  I  and  II  classes  are  bound  primarily  at  high  319 Bcd concentrations, Bcd's effects on chromatin accessibility at these targets likely 320 occurs only in anterior regions of the embryo. In support of this, we find that chromatin 321 accessibility at Bcd-dependent, concentration-sensitive targets is responsive to Bcd 322 concentration. Expressing uniform Bcd confers accessibility to peaks that are not 323 accessible in bcd mutant embryos ( Figure  3B). The degree of chromatin accessibility 324 conferred by Bcd correlates positively with the concentration of uniform Bcd expressed 325 ( Figure  3B). This observation, along with the overrepresentation of the Concentration-326 Sensitive I and II classes in the Bcd-dependent peaks, suggests that Bcd influences the 327 chromatin state of these targets primarily at the high concentrations found in the anterior 328 of the embryo.

329
A second feature that distinguishes the chromatin structure of Bcd binding sites 330 is the presence of DNA sequences favorable for nucleosome occupancy (Segal et al., 331 2006). Bcd bound regions in wild type embryos are generally depleted of nucleosomes 332 ( Figure  4A). However, predicting nucleosome positioning sequences using the NuPoP 333 algorithm (Xi et al., 2010) suggests that the Concentration-Sensitive I and II Bcd 334 enhancer classes are more likely to bind nucleosomes than the Concentration-Sensitive 335 III and Concentration-Insensitive classes. ( Figure  4B). The contrast between predicted 336 occupancy and observed depletion suggests that these regions are actively restructured 337 for Bcd and other transcription factors to bind. The increased nucleosome preference of 338 the more Concentration-Sensitive peaks, combined with the observation that these sites 339 become occupied by nucleosomes in bcd mutants suggests a model where Bcd, either 340 directly or in combination with cofactors is able to direct chromatin remodeling events, 341 16 which may play a significant role in distinguishing concentration-sensitive and -342 insensitive targets. Additionally, we find that Bcd-bound regions that are dependent on 343 either Zld or Bcd for their accessibility are more likely to have a higher nucleosome 344 preference than regions that are independent of both factors ( Figure  4C). This further 345 suggests that Bcd is able to overcome a high nucleosome barrier in a manner similar to 346 Zld (Sun et al., 2015) at a subset of its target enhancers. 347 These effects of chromatin accessibility impact the availability of sequence motifs 348 for binding Bcd. In wild type embryos, there is a gradual increase in average motif 349 accessibility from high to low sensitivity, and this difference becomes more pronounced 350 in bcd mutant embryos ( Figure  4D), consistent with a role for Bcd in driving changes in 351 accessibility at more sensitive sites in a concentration dependent manner. This is also 352 evident at the level of nucleosome organization. Calculating the fraction of motifs that 353 overlap nucleosomes in either wild type or bcd mutant chromatin conformations, we find 354 that whereas on average across all sensitivity classes 55 ± 2% of Bcd motifs are in 355 nucleosome-free tracts in wild-type embryos, in bcd mutant embryos motifs have lower 356 overall accessibility and a graded association with nucleosomes that correlates with the 357 sensitivity classes (41%, 46%, 50%, and 53% of motifs are accessible from high to low 358 sensitivity). These results indicate that the mechanistic determinants of concentration-359 dependent Bcd action likely involve a complex interaction between Bcd, DNA, and 360 chromatin structure. 361 A truncated Bcd protein shows reduced binding specifically at concentration-362 sensitive target enhancers 363 described. We hypothesize that Bcd renders its target sites accessible either by 365 competing with nucleosomes to access its binding sites and bind to DNA at high 366 concentrations or by recruiting chromatin-remodeling enzymes to accessible motifs and 367 subsequently driving local nucleosome remodeling to render more sites accessible. We 368 reasoned that if Bcd can displace nucleosomes simply by competing with them for 369 access to its binding sites, it should be possible for the Bcd DNA-binding homeodomain 370 to compete. However, if Bcd instead drives remodeling via recruitment of cofactors, it is 371 likely that these interactions or activities are carried out through regions of the protein 372 outside of the DNA binding domain. To distinguish between these two possibilities, we 373 designed a transgenic GFP-Bcd construct that is truncated downstream of the 374 homeodomain. We modeled the truncated Bcd protein after the bcd 085 allele, which was 375 originally classified as an "intermediate allele" of bcd (Frohnhöfer and Nüsslein-Volhard, 376 1986) and reported to have weak transcriptional activating activity (Struhl et al., 1989). 377 The truncation occurs 28 amino acids downstream of the homeodomain ( Figure  5A), 378 and the GFP-tagged protein was therefore expected to bind DNA but lack functions 379 requiring its C-terminus. The truncated protein (known as GFP-Bcd 085 ) forms a gradient 380 from the anterior of the embryo, and is expressed at a similar level as a full-length GFP-381 Bcd ( Figure  5B). 382 By ChIP-seq, we found that compared to wild-type, GFP-Bcd 085 binding in the 383 Concentration-Sensitive I and II enhancer classes was significantly reduced relative to 384 the Concentration-Sensitive II and Concentraiton-Insensitive classes (p-value < 0.0001 385 in permutation test with n = 10,000 trials) ( Figure  5C). Our ATAC-seq experiments 386 revealed that these classes have reduced chromatin accessibility in bcd mutants 387 ( Figures  3B  and  4D). Taken together, these results suggest that Bcd's ability to access 388 its concentration-sensitive targets is dependent upon activities carried out by domains in 389 the C-terminus of the protein, likely via recruitment of a cofactor, and that its DNA 390 binding activity alone is insufficient to drive chromatin accessibility. The results presented here demonstrate that the positional information in the Bcd 418 gradient is read out as differential binding between Bcd and the cis-regulatory regions of 419 its target genes. The overrepresentation of enhancers for anteriorly expressed target In this way, expression of these concentration-sensitive target genes is restricted to 432 20 anterior regions of the embryo. The higher density of Bcd binding sites in highly 433 concentration-sensitive target enhancers (shown in Figure  2B) suggests that these 434 enhancers may require a larger number of Bcd molecules to be bound at a given time to 435 keep the enhancers free of nucleosomes and accessible to the additional regulatory 436 factors. We therefore provide a model for morphogen function in which the 437 concentration thresholds in the gradient are read out molecularly at the level of 438 chromatin accessibility, rather than through the strength of binding sites in the target 439

enhancers. 440
It is important to note that the discrete sensitivity classes described here were 441 generated by Bcd binding data, and this binding occurs prior to the activation of target 442 genes and refinement of their expression domains. In our model Bcd establishes these 443 initial patterns not by competing with its own target genes, but with default nucleosome 444 positions in the early embryo. We predict that this initial interaction with chromatin is an

Relationship of Bicoid and Zelda at Bicoid-bound enhancers 453
The prominence of the Zld binding motif in the Bcd-bound ChIP peaks and 454 ATAC-seq in zld mutants reveals that Zld also contributes to the accessibility of Bcd 455 21 targets in the genome, in part at those targets that are not dependent on Bcd for their 456 accessibility (61/1,027 peaks are dependent on both Bcd and Zld for accessibility). Zld 457 is therefore likely to be one component that influences the accessibility and therefore 458 the apparent in vivo affinity of the enhancers that are bound by Bcd but insensitive to its 459 local concentration (Figure  7). Previous work has suggested that Zld contributes to The reduced binding by a truncated Bcd protein at the most concentration-480 sensitive targets indicates that Bcd does not displace nucleosomes by simply by 481 competing for binding to genomic targets, but rather that the C-terminus of the Bcd 482 protein is required for accessing its nucleosome-associated DNA targets. or are inherently more likely to be nucleosome-free based on their underlying sequence. 497 It has previously been suggested that transcription factors can compete with 498 nucleosomes for access to their DNA binding sites (Mirny, 2010;; Wang et al., 2011). 499 This could occur through cooperative binding to nucleosome-associated enhancers: if 500 23 one Bcd molecule could gain access to a binding site that was protected by a 501 nucleosome, it could recruit additional Bcd protein molecules to bind to nearby sites and 502 occlude nucleosome binding. This cooperativity would require a high concentration of 503 Bcd protein, fitting with our observation that Bcd influences accessibility more strongly 504 at high concentrations. However, our experiments with a truncated Bcd protein reveal 505 that Bcd cannot bind to its most sensitive targets without its C-terminal domains. As 506 many of the residues that have been implicated in cooperative binding reside in the Bcd 507 homeodomain (Burz and Hanes, 2001), we would expect this truncated Bcd to bind 508 cooperatively. This finding therefore supports a model in which Bcd is actively 509 remodeling chromatin, either directly or more likely by interacting with chromatin 510 remodeling factors through its C-terminus. 511 As a maternally supplied factor, Bcd provides one of the first cues to the break 512 the symmetry of the embryonic body plan. Our results suggest that this symmetry 513 breaking occurs first at the level of chromatin accessibility, as Bcd drives the opening of 514 its most concentration-sensitive target enhancers in anterior nuclei. Another maternal 515 factor, Zld, is proposed to act as a pioneer factor at early embryonic enhancers with a 516 high intrinsic nucleosome barrier. By binding to these enhancers, Zld depletes them of 517 nucleosomes and allows patterning transcription factors to bind and activate gene 518 expression (Sun et al., 2015). We have demonstrated here that Bcd influences the 519 accessibility primarily of its concentration-sensitive targets, which also exhibit a high 520 predicted nucleosome barrier ( Figure  4B). This raises the possibility that Bcd may be 521 exhibiting pioneer-like activity at high concentrations, driving accessibility of these sites 522 prior to transcriptional activation. It is unlikely that Bcd is unique in its ability to influence 523 24 the local chromatin accessibility of its targets. Recent work in mouse embryos has 524 shown that another homeodomain transcription factor, Cdx2, influences the chromatin 525 accessibility of its targets during posterior axial elongation (Amin et al., 2016). This may 526 be a common property of developmental transcription factors that must gain early 527 access their target enhancers while the chromatin state of the genome is being 528 remodeled during large-scale transitions in the gene regulatory landscape. 529 530

EXPERIMENTAL PROCEDURES 531
Fly stocks and Genetics 532 bcd mutants refers to embryos derived from bcd E1 homozygous mothers. The bcd E1 and 533 bcd E1 nos L7 tsl 4 stocks were from the Wieschaus/Schüpbach stock collection maintained 534 at Princeton University. zld mutants are embryos derived from zelda 294 germline clones. 535 Zelda mutant embryos were generated from the zld 294 allele (kind gift of Christine 536 Rushlow) as germline clones as described previously (Blythe and Wieschaus, 2015).

Transgenic Constructs 563
The uniform Bcd constructs were generated using a pBABR plasmid containing an N-564 terminal GFP-tagged bcd cDNA in which the bcd 3'UTR was replaced by the sqh 3'UTR 565

EMSAs and K d Calculations 639
EMSAs were performed using purified Bicoid homeodomain and biotin-labeled DNA 640 probes were designed to span ~200 bp in the maximal peak region of Bcd-bound peaks 641 identified by ChIP and corresponding to previously characterized enhancers. Effective 642 K d values for each enhancer probe were calculated using the ratio of total shifted probe 643 to free probe. 644

Primers
Sequence (   higher ATAC-seq counts (i.e., regions that were more accessible) than the ATAC-seq 675 counts in the Bcd ChIP-peaks. We determined that at a ratio of 5.4 ATAC-seq/ChIP-seq 676 counts, 95% of the ChIP peaks (permutation test p value = 0.05) were no more open 677 than a random selection of open regions. We filtered out the remaining ChIP peaks with 678 ATAC/ChIP ratios above 5.4, as these peaks are more likely to correspond to highly 679 transcribed open regions where most false positive signals can be found. We then 680 chose the peaks that were common to wild-type and uniform Bcd embryos, which 681 resulted in a list of 1,027 Bcd-bound peak regions. The number of peaks at each step of 682 this filtering is shown in Table  S3.

Comparing Binding Between Uniform Bcd Levels 686
Mapped BAM files were imported into R as GenomicRanges objects (Lawrence et al.,687 2013), filtering out reads with map quality scores below 30. Significant differences 688 between the uBcd levels were assessed on a pairwise basis using edgeR (Robinson et 689 al., 2010) in the 1,027 pre-defined peak plus 50,000 additional non-peak noise regions 690 selected from the dm3 genome. 691

Data Normalization and Display 692
Sequencing data was z-score normalized for display in heatmaps. Sequencing read 693 count coverage was calculated for 10 base pair windows across the genome, and the 694 mean counts per million reads were determined in each ChIP peak, as well as the 695 additional noise peaks. Z-scores were computed for each peak using 696 = − 697 where = mean CPM in noise peaks and = standard deviation of CPM in noise 698 reporters. The reporters expressed at stage 4-6 that overlapped with more than one Bcd 702 peak were excluded from the plot in Figure  2B. 703 704

Sample Collection 706
Live embryos expressing a histone (H2Av)-GFP or RFP construct were individually 707 staged on an epifluorescence microscope in halocarbon oil. After the onset of nuclear 708 cycle 14, single embryos were dechorionated in bleach and macerated in cold lysis 709 buffer at t = 12 minutes into NC14. Samples were pelleted in lysis buffer at 4°C (3000 710 rpm for 10 minutes), buffer was removed, and the embryo pellet was flash frozen on dry 711 for comments on the manuscript, and W. Wang    (A) ChIP-seq data in Bcd-bound peaks. Data is displayed as a heatmap of z-score normalized ChIp-seq reads, in a 2 kb region centered around each peak. Peaks in each class are arranged in order of decreasing z-scores in wild-type embryos. One peak (peak 549, see Table S6)  (C) Top DNA motifs discovered by RSAT peak-motifs. The e-value for is a p-value computed from a binomial distribution for a given motif in the dataset, corrected for multiple testing. See Figure S2B for de novo motif discovery in each sensitivity class.
(D) Heatmap displaying the enrichment of a given motif in each sensitivity class, relative to the peak list as a whole. P-values were generated from permutation tests (n = 10,000 tests).