Genome-wide analysis of the H3K27me3 epigenome and transcriptome in Brassica rapa

Abstract Background Genome-wide maps of histone modifications have been obtained for several plant species. However, most studies focus on model systems and do not enforce FAIR data management principles. Here we study the H3K27me3 epigenome and associated transcriptome of Brassica rapa, an important vegetable cultivated worldwide. Findings We performed H3K27me3 chromatin immunoprecipitation followed by high-throughput sequencing and transcriptomic analysis by 3′-end RNA sequencing from B. rapa leaves and inflorescences. To analyze these data we developed a Reproducible Epigenomic Analysis pipeline using Galaxy and Jupyter, packaged into Docker images to facilitate transparency and reuse. We found that H3K27me3 covers roughly one-third of all B. rapa protein-coding genes and its presence correlates with low transcript levels. The comparative analysis between leaves and inflorescences suggested that the expression of various floral regulatory genes during development depends on H3K27me3. To demonstrate the importance of H3K27me3 for B. rapa development, we characterized a mutant line deficient in the H3K27 methyltransferase activity. We found that braA.clf mutant plants presented pleiotropic alterations, e.g., curly leaves due to increased expression and reduced H3K27me3 levels at AGAMOUS-like loci. Conclusions We characterized the epigenetic mark H3K27me3 at genome-wide levels and provide genetic evidence for its relevance in B. rapa development. Our work reveals the epigenomic landscape of H3K27me3 in B. rapa and provides novel genomics datasets and bioinformatics analytical resources. We anticipate that this work will lead the way to further epigenomic studies in the complex genome of Brassica crops.


Background
Genome-wide maps of histone modifications have been obtained for several plant species. However, most studies focus on model systems and do not enforce FAIR data management principles. Here we study the histone H3 lysine 27 trimethylation (H3K27me3) epigenome and associated transcriptome of Brassica rapa, an important vegetable cultivated world-wide.

Findings
We performed H3K27me3 chromatin immunoprecipitation followed by high-throughput sequencing and a transcriptomic analysis by 3'-end RNA sequencing from B. rapa leaves and inflorescences. To analyze these data we developed a reproducible epigenomic analysis pipeline using Galaxy and Jupyter, wrapped into Docker images. We found that H3K27 methylation covers about a third of all B. rapa protein-coding genes and its presence correlates with low transcript levels. The comparative analysis between leaves and inflorescences suggested that the expression of various floral regulatory genes during development depends on H3K27 methylation. To demonstrate the importance of H3K27me3 for B. rapa development, we characterized a mutant line deficient in the H3K27 methyltransferase activity. We found that braA.clf mutant plants presented pleiotropic alterations, e. g. curly leaves due to increased expression and reduced H3K27me3 levels at AGAMOUS-like loci.

Conclusions
We characterized the epigenetic mark H3K27me3 at genome-wide levels and provide genetic evidence for its relevance in B. rapa development. Our work reveals the epigenomic landscape of H3K27me3 in B. rapa and provide novel genomics datasets and bioinformatics analytical resources. We anticipate that this work will lead the way to further epigenomic studies in the complex genome of Brassica crops.

Abstract Background
Genome-wide maps of histone modifications have been obtained for several plant species. However, most studies focus on model systems and do not enforce FAIR data management principles. Here we study the histone H3 lysine 27 trimethylation (H3K27me3) epigenome and associated transcriptome of Brassica rapa, an important vegetable cultivated world-wide.

Findings
We performed H3K27me3 chromatin immunoprecipitation followed by high-throughput sequencing and a transcriptomic analysis by 3'-end RNA sequencing from B. rapa leaves and inflorescences. To analyze these data we developed a reproducible epigenomic analysis pipeline using Galaxy and Jupyter, wrapped into Docker images.
We found that H3K27 methylation covers about a third of all B. rapa protein-coding genes and its presence correlates with low transcript levels. The comparative analysis between leaves and inflorescences suggested that the expression of various floral regulatory genes during development depends on H3K27 methylation. To demonstrate the importance of H3K27me3 for B. rapa development, we characterized a mutant line deficient in the H3K27 methyltransferase activity. We found that braA.clf mutant plants presented pleiotropic alterations, e. g. curly leaves due to increased expression and reduced H3K27me3 levels at AGAMOUS-like loci.

Conclusions
We characterized the epigenetic mark H3K27me3 at genome-wide levels and provide genetic evidence for its relevance in B. rapa development. Our work reveals the epigenomic landscape of H3K27me3 in B. rapa and provide novel genomics datasets and bioinformatics analytical resources. We anticipate that this work will lead the way to further epigenomic studies in the complex genome of Brassica crops.

Background
Epigenetic regulation, defined as altered heritable gene expression in the absence of DNA sequence changes, is essential for development and differentiation [1]. The epigenome comprises alternative chromatin states that can impact gene activity. These include DNA methylation, the incorporation of histone variants and the post-transcriptional modification of histoneslike acetylation or methylation on residues in the histone tails which can ultimately modify the interaction with DNA.
Epigenetic marks accumulate in response to internal and environmental cues and persist through mitosis during the life span of the organism. A remarkable finding is the extent of the evolutionary conservation in key regulators and mechanisms across the plant and animal kingdoms suggesting that a very ancient mechanism underlies this epigenetic regulation [2].
The trimethylation of histone H3 lysine 27 (H3K27me3) [2][3][4] is one of the best examples of epigenetic regulation of the gene expression programs. H3K27me3 generally anticorrelates with gene repression and marks the so-called facultative heterochromatin, a fraction of the genome where gene expression is repressed but can be activated in response to developmental or environmental signals [5]. This epigenetic mark is deposited at target genes by specific histone methyltransferases as part of the Polycomb repressive complex 2 (PRC2). The PRC2 complex, which is conserved from animal to plants, comprises a set of core components and several accessory subunits [5,6]. In the model plant Arabidopsis thaliana (hereinafter referred to as Arabidopsis) the core PRC2 subunits are well conserved and H3K27 methylation is crucial for plant development [3]. During plant growth, different sets of genes are expressed at different organs or tissues and the PRC2 complex is needed to maintain these gene expression patterns [5]. The exact mechanism through which H3K27 methylation represses gene expression is not fully understood [5,6], but H3K27me3 is considered a hallmark of gene repression because it is tightly associated with gene silencing.
In plants, H3K27 methylation is crucial for developmental transitions like gametophyte formation, seed germination and floral initiation [3,4]. In Arabidopsis, at least 20% of protein-coding genes are covered by H3K27me3 in a given organ [7,8]; and similar results have been obtained in rice, maize and the model cereal Brachypodium distachyon, suggesting a conserved role of this epigenetic mark in plant development [9][10][11]. The importance of this histone modification is highlighted by recent reports showing that up to 60% of protein-coding genes are silenced by H3K27me3 at some specific plant cell types [12].
The Brassicaceae or Cruciferae family includes Arabidopsis and several important crops. The Brassica genus includes a number of condiments and vegetables as well as economically important oilseed crops. Brassica crops have complex genomes that underwent a whole genome triplication with subsequent genome rearrangements and chromosome reduction after the divergence from a common ancestor with Arabidopsis about 15 million years ago [13,14]. Therefore, the mesohexaploid Brassica genomes like Brassica rapa (turnip, field mustard; genome AA), Brassica nigra (black mustard; genome BB) and Brassica oleracea (cabbage; genome CC) are predicted to encode up to three orthologs of each Arabidopsis gene.
Within the Brassica genus, the diploid B. rapa is considered a model for genomic studies because it has a small genome size that makes up half of the genomes of the allotetraploids Brassica juncea (indian mustard; AABB) and Brassica napus (rapeseed; AACC), which are relevant oilseed crops worldwide. B. rapa displays an extreme morphological diversity and includes economically important vegetables and oilseed crops [15,16]. In addition, the B. rapa (Chinese cabbage, Chiifu-401) genome has been fully sequenced and annotated and hundreds of accessions have been re-sequenced [14,16,17].

Data Description
In this work we obtained the genome-wide profile of the epigenetic mark H3K27me3 in B. rapa. We performed chromatin immunoprecipitation followed by highthroughput sequencing (ChIP-seq) from leaves and inflorescences. To correlate histone methylation and messenger RNA (mRNA) levels, we extracted RNA from the same plant samples and 3'-end RNA sequencing (3'RNA-seq) was performed. To uncover the relevance of H3K27me3 in B. rapa development, we characterized a tilling mutant line deficient in the H3K27 methyltransferase BraA.CLF. ChIP-seq data (225 million paired-end fragments) and 3'RNA-seq data (75 million single-end reads) were archived at NCBI Sequence Read Archive under the accession number PRJNA542357.

Analyses A Reproducible Epigenomic Analysis pipeline
To enhance compliance with the FAIR principles (findability, accessibility, interoperability, and reusability) for scholarly digital objects [18], we designed a Reproducible Epigenomic Analysis (REA) pipeline for ChIP-seq and RNA-seq using Galaxy [19], an open web-based platform where each analytical step is formally documented and can be shared and reproduced. We generated Galaxy workflows for the analysis of both ChIP-seq data and RNA-seq data. These workflows were executed on a locally administered Galaxy server via a Docker container image [20]. The Docker technology allowed us to bundle all Galaxy components and tools into a distributable package, which is publicly available for download and execution. Analytical steps that could not be integrated within a Galaxy workflow were captured and documented in Jupyter notebooks [21], an open-source interactive computing environment that allows sharing of code, documentation, and results. These notebooks are also available within a Docker image that runs Jupyter. The REA pipeline is available in the GitHub repository https://github.com/wilkinsonlab/epigenomics_pipeline and in the associated Zenodo release (doi:10.5281/zenodo.3298028).
In Fig. 1, a schematic representation of our REA pipeline is shown. We used well-established tools including Bowtie2 [22] for short-read sequence alignment, HTSeq [23] for feature mapping quantification, epic2 [24] for ChIP-seq peak calling, MAnorm [25] for quantitative comparison of ChIP-seq data, and DESeq2 [26] for differential gene expression analysis. A detailed description of these workflows can be found in the Methods section.

Genome-wide identification of H3K27me3 regions in B. rapa
To study the epigenetic landscape of histone H3K27 trimethylation in B. rapa we performed ChIP experiments from leaf samples. Immunoprecipitated chromatin and Input (chromatin extracts not subjected to immunoprecipitation) DNA libraries were sequenced by Illumina technology at 125 bp paired-end reads; more than 30 million fragments were obtained for each sample (Table S1) and sequencing data were analyzed using the REA pipeline. Raw paired-end reads were trimmed to remove low quality bases and short reads, with over 99% of ~123 bp-long reads being maintained.
Two of the most commonly used aligners for ChIP-seq analysis are BWA [27] and Bowtie2, which carry out fast mapping of DNA sequences using the Burrows-Wheeler transform method. We tested which of these aligners best suited our samples using the latest B. rapa v3.0 Chiifu-401 genome as reference [14,17]. Although BWA yielded comparable results, we obtained a small increase on paired-end mapping efficiency using Bowtie2 (Table S2), and this algorithm was therefore used for the remainder of our genomic analyses in B. rapa. Mapping with Bowtie2 against B. rapa v3.0 genome [17] yielded an average 82% mapping rate, where 42-61% of the reads mapped to multiple locations, likely reflecting the mesopolyploid nature of the B. rapa genome or the abundance of repeated DNA elements. After mapping, duplicated reads were removed and ChIP-seq signal distribution over B. rapa genes was visualized and inspected using the Integrative Genomics Viewer (IGV) [28].
Then, we determined the overall distribution patterns of H3K27 methylation on B. rapa using the REA pipeline. A metagene plot of H3K27me3 ChIP-seq signal showed that, as described in other plant species, H3K27 methylation is not enriched at promoter regions but rather covers B. rapa coding regions, showing little preference for the 5' versus 3' ends of genes ( Fig. 2A). A heat map representation of the genomewide ChIP-seq signal showed that H3K27me3 accumulation displays a gradual variation between genes: with some genes showing high histone methylation density, and others relatively low H3K27me3 levels (Fig. 2B). To precisely determine the location of H3K27me3-marked regions across the genome, we performed a ChIP-seq peak calling analysis adjusting software settings for detection of broad range peaks.
We compared two widely used but different peak-calling algorithms: MACS2 [29], an algorithm initially designed to identify sharp peaks but extended to detect broad peaks such as those arising from this analysis; and epic2, a highly performant implementation of SICER [30], an algorithm designed for noisy and diffuse ChIP-seq data such as histone methylation. Both peak finding algorithms detected a comparable number of peaks (MACS2 20,000 and epic2 15,000 peak regions). However epic2 was able to detect wider histone methylated regions than MACS2 (Fig. S1), with a mean peak size of 2,991 bp for epic2 compared to 1,985 bp for MACS2, and was the preferred tool for studying diffuse epigenetic marks in our study.
Our REA pipeline identified 15,136 H3K27me3 marked regions in B. rapa leaves using epic2 (Tables S3 and S4), where 68% overlapped with one or more protein-coding genes ( Fig. 2C and D, Table S3). We found 12,480 genes marked with H3K27me3, which represents 27% of all B. rapa protein-coding genes. In animals H3K27 methylated regions cover hundreds of kilobases and usually span several genes [6,7]. We found that the average size of B. rapa H3K27me3-marked regions was about 3 kb, and that 55% of these peak-regions were associated with single genes ( Fig. 2C and D, Table S3). This is similar to what has been described for Arabidopsis [7]. However, consistent with the complexity of the B. rapa genome, we also found about 2,000 multigenic regions including some large H3K27 methylated regions spanning more than 10 kb (peak regions list is provided in Table S4). In summary, our analysis indicates that H3K27 methylation marks a great portion of the B. rapa genome but it is mainly associated with single genes.

H3K27me3 correlates with low transcript levels in B. rapa
To identify the correlation between H3K27me3 and gene expression, we performed 3'RNA-seq on the same leaf samples used for ChIP-seq. Profiling RNA levels by 3'RNA-seq and other tag-based methods usually gave higher dynamic detection range and improved mRNA quantification than full transcriptome sequencing [31]. Following the REA pipeline, quantitative mRNA data from B. rapa leaves was obtained (Table S5). After normalization of expression data, genes were sorted into four categories according to their mRNA levels and used to separately represent their average ChIP-seq H3K27me3 signal on a metagene plot. As shown in Fig. 3A, genes with high levels of H3K27me3 usually exhibit low or no expression, indicating that, as in most eukaryotes, H3K27 methylation in B. rapa is associated with gene silencing.
To select for genes potentially regulated by H3K27me3 in B. rapa leaves, we selected high quality peak regions where the ChIP signal determined by epic2 was larger than one-fold increase over the background (FC>1). This analysis identified 8,510 genes that were used as input to a Singular Enrichment Analysis (SEA) of Gene Ontology (GO) terms using agriGO [32]. The resulting GO term list was summarized and reduced in complexity using REVIGO [33]. The results in Fig. 3B show that, in B. rapa, H3K27me3-marked genes are enriched in biological process categories related to metabolism (GO:0006807, GO:0019222, GO:0044237), cell differentiation (GO:0030154) and regulation of gene expression (GO0010468). In addition, H3K27me3-marked genes were also enriched in response to stimulus (GO:0009991) and response to stress (GO:0006950) categories, suggesting that this epigenetic mark plays a role coordinating genomic responses to external cues. In several eukaryotic organisms, H3K27me3-mediated silencing is related to the transition between developmental programs [2,3]. Remarkably, we found that a significant number of genes repressed by H3K27me3 in B. rapa leaves were related to developmental processes (GO:0032502), and more specifically to flower development (GO:0009908) categories. These data suggest that H3K27 methylation contributes to the silencing of floral developmental genes in B. rapa leaves.

Dynamic changes in H3K27me3 are associated with the floral transition in B. rapa
The transition from vegetative to reproductive development is a crucial step in the plant-life cycle [34]. This process is highly regulated and involves a genome-wide transcriptional reprogramming where different developmental programs are activated or repressed [8,12]. To determine the role of H3K27me3 deposition in B. rapa floral transition, we performed H3K27me3 ChIP-seq and RNA-seq analysis on B. rapa inflorescences and analyzed the sequencing data using our REA pipeline (Tables S5   and S6). We identified 7,436 high-confidence epic2 peaks (FC>1) in B. rapa inflorescences that showed a great overlap with leaf peaks (Fig. S2). To compare H3K27 methylated genes between leaves and inflorescences, we selected the highconfidence peaks from these two organs and performed a quantitative ChIP-seq signal comparison using MAnorm. Setting a |M|>0.5 cut-off (M = log2 fold-change of the normalized read densities from leaf vs inflorescences), we found 5,986 differentially H3K27me3-marked regions between leaves and inflorescences including 4,729 differentially marked genes ( Fig. 4A and Table S7). Thus, about 55% of all 10,726 highconfidence peaks (merged data from both leaves and inflorescences, Fig. S2) showed a significant histone methylation change between these two plant organs.
H3K27 methylation changes are not always correlated with gene expression changes [2,6]; e.g., a reduction of H3K27me3 levels may not be enough to promote transcription in the absence of a specific transcription factor. To examine the extent to which the observed H3K27me3 differences in B. rapa were correlated with the gene expression changes required for the formation of flowers, differentially expressed genes (DEG) between leaves and inflorescences were determined with our RNA-seq data analysis workflow. Using DESeq2 under our REA pipeline, we found 13,377 DEG (Tables S5 and Fig. 4B; |log2FC|>0.5) between B. rapa leaves and inflorescences, indicating that about 29% of B. rapa protein-coding genes are differentially expressed between leaves, a vegetative tissue, and inflorescences, the reproductive organs of the plant. To study the correlation between differential H3K27me3 deposition and differential mRNA levels between leaves and inflorescences, we represented the differences in H3K27me3 signal (MAnorm M value) against the differential expression levels (DESeq2 log2FC) between leaves and inflorescences (Fig. 4C). This revealed 1,724 genes (4% of all gene models) that significantly changed in both their H3K27me3-mark and their mRNA levels. Of these, 729 loci showed higher H3K27me3 signal in leaves relative to flowers, as well as lower gene expression in leaves relative to flowers (Table S8) [35]. The list of H3K27 regulated genes in Table S8 includes a number of B. rapa MADS-box genes with homology to floral homeotic genes like AGAMOUS, APETALA1, CAULIFLOWER, PISTILLATA and SEPALLATA genes, or involved in the floral transition like AGL19, FRUITFUL and SOC1 [36]. All these data suggest that H3K27 methylation is important for the expression of floral regulatory genes and, eventually, the proper development of the reproductive structures in B. rapa.

H3K27me3 deposition is crucial for B. rapa development
In Arabidopsis, there are three H3K27 methyltransferases: the main one is CURLY LEAF (CLF) [3,37], which is partially redundant with SWINGER [38], and MEDEA which is only expressed in the female gametophyte and seeds [39]. CLF mutations result in defective floral morphology, early flowering and severe leaf developmental alterations [37]; these developmental defects are due to misexpression of H3K27me3-silenced target genes [38,40].
In the B. rapa genome there is one CLF, one SWINGER, and two MEDEA-like homologs [41]. To determine the effect of impaired H3K27 deposition on the development of B. rapa, we studied the function of BraA.CLF (BraA04g017190.3C).
Consistent with a possible role as a general H3K27 methyltransferase across the plant, we found that BraA.CLF (BraA04g017190.3C) gene is expressed in B. rapa leaves and flowers (Fig. S3). We obtained a tilling mutant line [42], braA.clf-1 (Gln615Stop), that produced a truncated protein without the catalytic domain. As shown in Fig. 5, braA.clf-1 has a smaller overall size, reduced expansion of flowers, and curled sepals and petals ( Fig. 5 and S4A). In addition, about 5% of flowers showed homeotic transformation of floral organs (Fig. S4B). However, the most striking phenotype of the mutant was the upward curled leaves that were most severe on younger leaves (Fig. 5 and S4A). All these developmental abnormalities are similar to the Arabidopsis clf mutant, suggesting a high degree of functional conservation between both species.
The upward curved leaves in Arabidopsis clf mutant are due to the upregulation of the floral identity gene AGAMOUS (AG), which is a direct CLF target in the leaf [37,43]. In our genomic analysis (Table S8), we found that the two B. rapa AG loci [44], BraA01g010430.3C (BraA.AG.a) and BraA03g048590.3C (BraA.AG.b), showed lower expression and higher H3K27 methylation in leaves than inflorescences. Thus, we wondered whether BraA.CLF may be repressing these two genes in B. rapa leaves.

We performed ChIP experiments and found that H3K27me3 levels were reduced at
BraA.AG.a and BraA.AG.b loci in mutant braA.clf-1 leaves (Fig. 6A and B). This reduction in H3K27me3 was associated with increased BraA.AG.a and BraA.AG.b mRNA levels in braA.clf-1 mutant leaves as determined by real-time quantitative reverse transcription polymerase chain reaction (RT-qPCR) (Fig. 6C). All these data indicate that BraA.CLF is a major histone methyltransferase regulating the deposition of H3K27me3 at key developmental genes in B. rapa; and, more importantly, that the correct H3K27 methylation is crucial for the optimal growth of Brassica crops.

Discussion
In this work, we determined the genome-wide and organ-specific distribution of H3K27me3 together with the transcriptome dynamics in leaves and inflorescences of B. rapa R-o-18, an oilseed cultivar. We found that H3K27me3 is present in a large number of genomic locations across the B. rapa genome, mainly associated with protein-coding genes. Interestingly, SEA analyses showed that H3K27me3 marked genes are enriched in GO terms related to gene regulation, such as "regulation of gene expression" (GO:0010468). In parallel with our ChIP-seq experiments, to correlate histone modification levels with transcript levels, we performed mRNA quantification by 3'RNA-seq. As in other organisms, we found that H3K27me3 is anticorrelated with mRNA levels indicating that this epigenetic mark is also a hallmark for gene inactivation in B. rapa. All these data suggest that H3K27me3 plays an important role in the regulation of gene expression in Brassica crops.
Our analysis also revealed that a number of genes related to floral development The Brassica genus contains a diverse collection of economically important crop species, including the third-highest produced oil crop in the world and constituting an increasingly popular source of nutrients due to their anticancer, antioxidant, antiinflammatory properties and high nutritional value [15]. The B. rapa genome was the first sequenced Brassica crop [14], but in recent years the genome sequences of cabbage (B. olearacea), black mustard (B. juncea) and rapeseed (B. napus) have been released [45]. The comprehensive study of H3K27 methylation in B. rapa presented here opens the door to explore other Brassica crops with more complex genomes. We hope that the epigenomic datasets and bioinformatic analytical pipelines generated in this work will aid future studies to shed more light on the epigenetic regulation in plants.

Potential implications
Here we presented novel H3K27me3 epigenomic and transcriptomic B. rapa datasets. To understand the molecular mechanisms of epigenetic regulation in plants is important as fundamental knowledge but has also enormous implications for crop improvement and agriculture [46]. DNA sequence variability alone cannot explain all the diversity of plant phenotypes [47]. The isolation of stable epigenetic variants of several crops has led to the idea that epigenetics could contribute to heritable natural variation that can be selected in plant breeding and crop improvement programs [46,47].
On the other hand, to analyze our data we designed a bioinformatic pipeline for ChIP-seq and RNA-seq, the Reproducible Epigenomic pipeline, and made this publicly available adhering to the FAIR data principles. The pipeline was constructed using a set of well-established genomic tools and approaches, using a combination of a Galaxy environment and Jupyter notebooks [21]. Both Galaxy and Jupyter provide analytics environments that can be reproduced by anyone through a user-friendly Web interface, and ensures transparency and reusability/reproducibility of our analyses and outcomes, as well as providing a platform for others to execute similar analyses following our approach.

Chromatin Immunoprecipitation and DNA sequencing
ChIP experiments were performed using B. rapa R-o-18 wild-type primary leaves collected 14 days after germination (Fig. S5A), and inflorescences collected when the first flowers started to open ( Fig. S5B and C). Sampling was performed at the end of the light period (zeitgeber time ZT16). Each biological replicate comprises samples from 6 independent plants. ChIP experiments were performed using 2 g of tissue as described at our Plant Chromatin Immunoprecipitation protocol V.2 [48]. We used a specific antibody against H3K27me3 (Millipore 07-449, Lot No. 2736613.). The oligonucleotide sequences used for the ChIP-qPCR amplification reactions can be found in Table S9.
ChIP-seq sequencing libraries were prepared using NEBNext Ultra DNA Library Prep kit (New England BioLabs 7370) starting from 4 ng of immunoprecipitated DNA.
One inflorescence and two leaves ChIP-seq biological replicates were sequenced at 2x125 bp paired-end reads in an Illumina HiSeq 2500 platform using the SBS v4 kit (Illumina) at the Genomics Units of CNAG-CRG (Barcelona, Spain).

Gene expression profiling and 3'RNA sequencing
Total RNA was extracted from the same plant tissues than ChIP using EZNA Plant RNA Kit (Omega Bio-tek) following the manufacturer's recommendations. Each biological replicate contained samples from 6 independent plants. Complementary DNA (cDNA) was prepared by reverse transcription of 1 µg of total RNA using "Maxima cDNA Kit with dsDNase" (Thermo Fisher Scientific) according to the manufacturer instructions, and quantification was performed by RT-qPCR using the LightCycler 480 and SYBR Green I Master mix (Roche). Gene expression was determined by 2 -ΔΔCT method using BraA.TUBULIN (BraA10g026070.3C) [49] as housekeeping gene. The oligonucleotide sequences used for qPCR primers can be found in Table S9.
Libraries for 3'RNA-seq were prepared using a MARS-seq protocol adapted for bulk RNA-seq [50,51] with minor modifications. Briefly, 100 ng of total RNA were reverse-transcribed using poly-dT oligos carrying a 7 bp-index. cDNAs were then pooled and subjected to linear amplification via in vitro transcription. The resulting amplified RNA was fragmented and dephosphorylated. Ligation of partial Illumina adaptor sequences [50] was followed by a second reverse-transcription and full Versions and settings of tools used in Galaxy are noted in the workflows published in our DockerHub images, as well as described below.
To facilitate reproducibility of our data analyses, we built and published two Docker images, which contained all dependencies and software requirements to create a ready-to-run analytical workflow (see Availability of Supporting Data). The first image contains a Galaxy instance with required tools installed and accessory files to download/index the B. rapa reference genome and prepare/run workflows with the analytical steps described below. These workflows are designed to download raw sequencing reads from SRA and export the results locally for further processing. The second image, which runs Jupyter Lab, includes software installations and notebooks with instructions to finalize data analysis and explore results; they include, as a default view, the results published here. These images are designed to operate on a shared local directory for Jupyter to access Galaxy results. Detailed instructions on how to deploy and run each container can be found in our GitHub repository README file.
To enhance our compliance with FAIR publishing requirements, metadata files

ChIP-seq analysis
Sequencing data was uploaded to a local Galaxy instance in fastqsanger format. Each sample was independently processed and replicates pooled for peak Results from data analysis on Galaxy were downloaded locally for further processing and visualization. The B. rapa genome v3.0 contains 10 chromosomes but also thousands of smaller scaffolds, with sequences from either nuclear chromosomes or chloroplast/mitochondria organelles. Thus, prior to differential analysis of H3K27 methylation between different leaves and inflorescences, raw peak calling results from epic2 were curated via selection of nuclear regions and visual inspection on IGV viewer. Then, H3K27me3 mark intensities, from pooled quality-filtered aligned reads (keeping duplicates), over peaks were compared on leaves vs inflorescences with MAnorm v1.2.0. Finally, the annotation of peaks overlapping B. rapa gene models was done with ChIPpeakAnno [57], and the distribution of ChIP signal over genes was visualized with ngs.plot [58].

RNA-seq analysis
The bulk transcriptomic analysis was performed using a specific RNA-seq analysis workflow on a local Galaxy instance. Single-end reads were trimmed with Trimmomatic and mapped with Bowtie2 against B. rapa v3.0 genome. Before counting, available B. rapa v3.0 gene models, which only comprise the coding sequences, were extended by 300 bp at each gene's 3' UTR regions and used to quantify aligned reads using HTSeq-count script with stranded and intersection of non-empty sets options.
The obtained counts were used for mRNA differential expression analysis with DESeq2 1.18.1 between leaves and inflorescences. Expression results were downloaded locally for further analysis outside Galaxy in combination with annotated peaks. Categorization of genes by expression level was achieved after transformation of count data into zscores as follows: for each expressed gene and tissue, normalized counts were averaged across replicates, subtracted the dataset average and divided by dataset standard deviation. These z-scores were used to define genes with low (z < -0.5), medium (-0.5 < z < 0.5), and high expression (z > 0.5).

Other Bioinformatic analyses
Gene Ontology analysis was performed using agriGO v2.0 (Fisher statistical test method; Yekutieli Multi_test adjustment method; p < 0.05; and Plant GO slim ontology type); data was visualized reduced in complexity and redundant GO terms using REVIGO with default parameters (allowed similarity = 0.7; semantic similarity measure = SimRel).
Functional annotation of B. rapa v3.0 gene models was obtained from [17].      Additional file 1 Figure S3: Transcript levels of B. rapa H3K27 methyltransferase encoding-genes. Graphical representation of the data (normalized counts) from our end of gene; inside, peak located within gene; includeFeature, gene is located within peak.
Additional file 3 Table S7. List of H3K27me3 differentially marked regions.
Differentially marked peaks were identified with MAnorm. M_value, log2FC of normalized read densities; A_value, average signal strength; Peak_Group, origin of peaks (one tissue or common).