Variability in the Control of Cell Division Underlies Sepal Epidermal Patterning in Arabidopsis thaliana

Live cell imaging and computational modeling explains how variability in the timing of cell division generates a characteristic pattern of cell sizes during development.

S4 entry clone. H2B-mGFP was excised from pAR179 with Bam HI and Sac I and cloned into pAR95 to created pAR181 pATML1::H2B-mGFP with Kanamycin resistance in plants. A gateway epidermal specific expression vector pAR176 was generated by replacing the 35S promoter from pB7WG2.0 [5] with the Hind III / Spe I linker from pCR-Blunt II-TOPO (Invitrogen) to make pAR175. The ATML1 promoter was excised from pAS99 and cloned into the Hind III site of pAR175 to make pAR176. The H2B-mGFP fusion protein in pAR179 was recombined into pAR176 using a gateway LR reaction to generate pAR180 pATML1::H2B-mGFP with Basta resistance in plants. pAR180 and pAR181 were electroporated into Agrobacterium and transformed into Ler, lgo-1, and pATML1::KRP1 plants by floral dipping.
LGO from pAR173 was recombined into the 35S vector pB7WG2 [5] to create pAR174 (35S::LGO). Agrobacterium transformed with pAR174 were used to create transgenic Ler plants by floral dipping. Transgenic plants were selected for Basta resistance. The phenotype was strong in T1 plants, but was silenced in subsequent S5 generations.
LGO was also ectopically expressed throughout the epidermis under control of the ATML1 promoter by recombining LGO from pAR173 into the pATML1 expression vector pAR176 (see above) to created pAR178 (pATML1::LGO). Transgenic Ler plants were created through floral dipping in Agrobacterium transformed with pAR178. Transgenic plants were selected for Basta resistance.

Scanning electron microscopy
Stage 14 flowers were prepared as described [7] and viewed on a LEO 1550 VP FESEM. Giant cells identified by their morphology and slight protrusion from the plane of the sepal were false colored in red with Photoshop CS.

Details of live imaging procedure
The conception of the live imaging experiment was based on [8,9], but the experimental details of plant growth and manipulation were altered to observe the lateral side of sepals instead of apex of the meristem. Plants were grown in soil pots covered with window screen to keep the dirt in place. Plants were imaged after bolting at least 5 cm. On the first day, at least 4 hours before imaging, the overlying flowers were removed from the inflorescence to expose the flower of interest (generally in stage 2 or 3). The older flowers on the opposite side of the meristem were removed so that the stem would lie flat against a slide. Flowers up to stage 12 were retained on the lateral sides of the inflorescence, which was important for the viability of the plant. The stem of the plant was taped to a slide such that the flower was exposed for the microscope. While the plant was in the growth room, the slide was taped to a small stake to keep the plant growing upright.

S6
The plants were imaged on a Zeiss 510 Meta confocal laser-scanning microscope every 6 (or 12) hours. About half an hour before imaging, the plant was tipped on its side, the inflorescence was immersed in 0.02% silwet 50 μg/ml Propidium Iodide (PI, Sigma P4170-10MG), and mounted with a cover slip. The plant was placed sideways on the microscope such that the slide attached to the inflorescence was in the stage slide holder. The plants were imaged using either a 40X W-Achroplan water-dipping 0.8 NA objective, a 20X Plan Neofluar 0.5 NA objective, or a 10X Plan Neofluar 0.3 NA objective as dictated by the size of the sepal. The 40X objective was mounted in a drop of water above the cover slip. The laser power was set to 50% and the transmission to less than 10%. The transmission was set higher for non-time lapse images. YFP and mCitrine were excited at 514 nm wavelength of an argon ion laser using a HFT 458 / 514 primary dichroic and detected using a NFT 545 secondary dichroic to split the beam and 530-550 band pass filter. PI was simultaneously excited at the 514 nm wavelength and detected using the beam passing through the NFT 545 secondary dichroic and a 585-615 band pass filter. GFP was excited using a 488 nm wavelength from the argon laser and a 488 primary dichroic and detected with a 545 secondary dichroic and a 505-550 band pass filter. A 512 X 512 pixel frame was scanned at 1 μm intervals in the Z dimension. Sufficient Z sections were taken to visualize the whole sepal, which changed as the sepal grew. Additionally the zoom and objectives were adjusted during the experiment as the sepal grew such that the whole sepal could be observed. Any sepals that stopped growing during the live imaging sequence were excluded from analysis. After imaging was completed the inflorescences were dried with a kimwipe and returned to a vertical position in the growth room. All plants were grown in continuous light.
Projections of the images were made using the Zeiss LSM software ( Figure S4A-B). The confocal image stacks were processed with the Amira 4.1 software (Visage Imaging, Mercury S7 Computer Systems). The confocal stacks were visualized using the 3D volume rendering function Voltex. Additional flowers were cropped from the image using the volume edit function, which was necessary for registration. The sepals were registered using a combination of hand alignment and the rigid Affine Registration function in Amira ( Figure S4C-D). A specific angle and magnification was chosen such that the whole time-lapse sequence could be best observed. Then a single snapshot image was taken of the volume rendered sepal at each  nuclei was determined by examining consecutive frames of a movie. To take into account the growth of the sample, the dots for the cells that had not divided were adjusted such that they overlie the new nucleus. After a division, two nuclei were present in the place of the original cells. The colored lineage dot was duplicated and each daughter received the same colored dot. In addition the daughter nuclei immediately following a division were outlined in white to highlight the division.

Segmentation of Sepals
To measure the areas of mature sepals, images were taken of mature (stage 14) sepals stained overnight in toluidine blue and photographed individually with a Zeiss Stemi SV 11 Apo S8 dissecting microscope and a Zeiss AxioCam HRC digital camera.
We have segmented whole sepals using ACTIWE, our C++ implementation of the Active Contours Without Edges level set based model [11]. Sepals were strongly stained producing almost completely homogeneous colors and a high contrast between background and foreground colors. This helped ACTIWE escape local minima and converge to a robust solution in a few iterations. Careful parameter tuning was necessary in a few cases of non-stained sepals to prevent over segmenting. ACTIWE automatically segments by solving the Euler-Lagrange partial differential equation corresponding to the energy model presented in [11]. We solve this geometric evolution PDE numerically using a stabilized method, allowing for easy control of the contour evolution.
The resulting average sepal areas were compared with a standard z-statistic for difference between means. For lgo-1 versus wild type, z=2.56 and p<0.05. For wild type versus pATML1::KRP1, z=7.35 and p<0.001.

Segmentation of Sepal Cells
To measure cell areas, we used the epidermal specific plasma membrane marker pAR169 pATML1::mCitrine-RCI2A. Whole mature stage 12 sepals expressing the marker were imaged with the 20X objective and settings described above in the live imaging section. Multiple 20X 3D image stacks were taken to cover the whole sepal and projections of the stacks (Zeiss LSM software) were merged in Adobe Photoshop to produce a complete mosaic image of the sepal.
Cells in sepals were segmented using a semi-automatic image processing pipeline. We started by S9 denoising the gray level confocal images of the fluorescently labeled plasma membranes using a fast nonlocal means based filter [12]. Noise was significantly reduced throughout the entire images after a few passes (typically five) of the filter. The good quality of cell walls after denoising suggested the adoption of an edge detection scheme to automatically trace cell boundaries. We thus computed weighted first derivatives of the denoised images and, after thresholding them, we applied a set of mathematical morphology operators (hole filling, thinning, cleaning, and spurs removal) [13] to create one pixel wide contours delineating all cells in the sepals.
Limitations in image quality and in our algorithms required supervision to help solve erroneous segmentations. Manual editing was consistently done in binary images representing thick walls produced after morphological operations and not on the original confocal images. With minimum editing we were able to generate high quality segmentations not obtained with other tools and in a fraction of the time required by standard approaches. We validated the segmentation by visually inspecting the superposition of the segmented boundaries on the walls in the original images. Once segmented, cell sizes (e.g. area, perimeter, diameter, etc.) were computed. Except for the filtering, all processing steps were done using Matlab.

Counting Nuclei
To count cells in the sepal epidermis we imaged the equivalent epidermal specific nuclear markers pAR89 pATML1::H2B-mYFP for the abaxial side and pAR180 pATML1::H2B-mGFP for the adaxial side. Two independent programs were used to count the nuclei in sepals to provide independent validation. Method 2 by Tigran Bacarian is used in [14]. Method 1 is S10 described below. Both programs produced false negatives by missing nuclei and false positives by counting background as additional nuclei. Both programs produced similar results.
For method 1, we have used mathematical morphology-based approach for segmentation of cell nuclei. The algorithm work-flow consist the following procedures: a) image pre-processing: where the 2D input image I (see Figure I1a) is filtered by morphological closing and opening with disk structuring element of size 3 pixels [15], b) image segmentation: where the filtered image I f is thresholded at automatically selected grey level [16], c) image post-processing: where the segmented image I t is filtered by morphological closing by reconstruction with disk structuring element of size 3 pixels [15].
The resulting segmentation is presented in Figure I1.

Population model
Consider the conceptual model in Figure M1. A 2C cell makes the random decision with probability p 1 to endoreduplicate to 4C or with probability 1-p 1 to divide and remain 2C. Once a cell has decided to endoreduplicate, it continues to do so and cannot resume mitotic divisions [17]. In the next cell cycle, all of the 4C cells endoreduplicate to 8C. Simultaneously, the 2C cells commit to endoreduplicate to 4C with probability p 2 and the remaining cells divide. In the final patterning cell cycle, those cells that decided to endoreduplicate earliest become the 16C giant cells, while those cells that continue to divide become small cells. The specialized division patterns of stomatal development follow at the end of this process and 2C cells continue to divide with probability p s . S13 Figure where p 1 ,p 2 ,p 3 are the probabilities to endoreduplicate for the first three cell cycles, and p s is the probability for a stomatal division, which occurs at the last stage. The total number of cells is given by, where We determined that by counting the cells in the abaxial sepal epidermis using nuclear segmentation ( Figure S2J). We also have found that 29% of the total cells are stomatal  Figure S2J). The measured mean ploidy distribution is 16C=1%, 8C=6%, 4C=32%, 2C=61% ( Figures 1D and 4I). We explored the parameter space because the ratio of 2C to 4C cells on the back of the sepal is unknown. The first S15 question that arises is whether the cells from the back of the sepal can all be only 2C cells.

S16
In Figure M2A, we plot p 2 ,p 3 as a function of p 1 when all cells on the back of the sepal are 2C (N 4cb = 0), which clearly shows that p 1 must be restricted to less than 0. , by fitting the ploidy distribution. We first use a standard implementation of a genetic algorithm (GA) [18] to provide a best beginning guess for the parameters, and then use a local optimization algorithm to obtain the best fit. We start the GA with a random population of individuals (100), and perform the following steps: Tournament Selection: 10 members are randomly selected from the population of 100, and ranked according to their fitness. The best member always gets selected, and the remaining members are selected based upon roulette selection. At the end of the selection step, 2 parents are retained. Crossover: The selected parents are crossed over [19], using an arithmetic mean λ=0.7, to give two children. Then a selection of two members is made from the parents and children, by first choosing the best amongst all of them and then using roulette selection among the remaining 3. Mutation of one parameter is randomly selected and mutated. For the local optimization method, we used the built in MATLAB implementation of non-linear least squares optimization routine (lsqnonlin).

S17
We find by running the search for 1000 times a parameter set with means, .
The above equation implies that . Taking the limiting case, we get, , for p 3 =0. Now the total number of 2C cells is 61% which corresponds to 2684, hence from Equation 1, using the expression for the number of 2C cells in the front of the sepal, and accounting for the stomatal cells, from , we obtain,

Intercalary growth model (IGM)
The IGM model follows from the assumptions given below.
(1) First we assumed that all cells grow such that distortions in the sheet of cells [20] are prevented. Cell growth is modeled such that cells do not slip with respect to each other. This can be simply implemented if it is assured that the sum of the growth rates of two recently divided daughter cells, which previously shared a common cell wall with its neighbor, is equal to the S20 growth rate of the neighbor. Cells grow anisotropically such that the vertical growth rate is greater than the horizontal growth rate.
(2) Second, we include variability in the duration of the cell cycle as observed in the live imaging data (see next section).
(3) Third, we assumed that the decision to endoreduplicate was random with a probability that depended on the cell cycle. The probabilities used were determined from the ploidy data using the population model (see previous section).
(4) Fourth, we assumed that once a cell enters endoreduplication it can no longer divide and continues to endocycle [17].
(5) Fifth, we assumed that the total number of endocycles undergone by the most highly endoreduplicated giant cell represents the maximum number of patterning cycles that the tissue undergoes regardless of whether those cycles are mitotic cycles or endocycles. Thus, we limited the number of cell cycles to 3 corresponding with the three endocycles required to reach 16C. At the end of these divisions, remaining 2C cells enter the stomatal pathway with probability p s ( Figure M1). These cells would continue to undergo a minimum of one more division, which is not modeled.
(6) Sixth, without noise in the cell cycle time, we assume that a cell would double its area in each cell cycle whether mitotic or endocycle. Upon the addition of noise as sampled from distributions derived from live imaging, cells no longer grow to exactly twice their area.
(7) Seventh, we assumed that cell division is not precisely symmetric, with daughter cells randomly partitioned with some noise as determined from imaging data (see next section). First, in live imaging, we observe that cells do not divide into two equal halves ( Figure M4). We find that the standard deviation in daughter cell areas of non-stomatal division is about 10%.
Therefore, we divide cells into two halves ± 10%. The upper daughter is 58% of the total cell area and the bottom cell is 42%.

S22
(ii) Second, from the live imaging we observe that cell cycles are asynchronous (Supplemental Movies S1-S3). We quantified the cell cycle times of individual cells in the wild type live images (as the intervals between one division and the next, or the start of imaging and the first division) ( Figure 3I). We have fitted this cell cycle time distribution with a Weibull distribution, described by The fitted distribution is shown in Figure 3I for this case, we used the two tailed t-test, which rejected the null hypothesis of equal mean values with a p-value=0.0067.
We now use the flowchart ( Figure M5) to describe the model. Figure M5: Flow chart for the Intercalary Growth model.

Starting Template
To determine the starting template for the model, we imaged the emergence of the sepal from the flank of the floral meristem. We found that the sepal emerges from a set of cells about 8 cells wide ( Figure 3A-B). This is consistent with sectoring data that showed the sepal emerges from a line of 8 cells in the floral meristem [21]. Therefore, the starting template is assumed to be a file of 8 cells, which we call the generative layer ( Figure 3G).

S24
We assume that this layer of cells proliferates and gives rise to cells, each of which will then go into the endoreduplication program: • If the division of a generative cell is vertical, the upper daughter is competent to enter the endoreduplication patterning program, while the lower daughter maintains generative identity.
• If the division of a generative cell is horizontal, both cells remain in the generative layer and continue to proliferate.
• The cell, which is ready to enter the endoreduplication program, now undergoes repeated divisions or endocycles as discussed earlier.

Single Cell Growth
Each cell's length and width are made to grow exponentially in time (with the exponential growth rate parameters, vertical λ =1 ; horizontal λ = 0.35], at rates that are parameterized to give around ∼125 generative layer cells at the end of the simulation time, which is chosen to give about ∼1400 cells (reduced from ~1600 to account for the additional stomatal cells which will finally populate the sepal, not shown in the model). The neighboring cells grow at the same rate, to prevent cells from slipping past each other. Cells with longer cell cycles reach larger sizes before dividing than cells with shorter cell cycles ( Figure 2C-D and S1). Therefore, cell cycle times are directly related to the area the cell reaches by the end of the cell cycle. In this model we assume that on average the cell divides when it reaches roughly twice its starting area, which is also when the next cell cycle begins. Since cell divisions are observed to occur asynchronously ( Figure 2C and Movies S1-3), this naturally translates into cells choosing different areas to which to grow, which are sampled from the observed distribution in cell cycle time. We . We now make a simple assumption, namely that endocycles behave in a similar way to cell cycles, i.e., endocycle times are sampled from the continuation of the cell cycle. Hence the 8C cell cycle time would be sample from the distribution, . A similar approach is taken for the lgo-1 and pATML::KRP1 cell cycle distributions. In Figure M6 we plot the distributions from which we sample for the cell cycle and endocycle times.

Validation of Model to in vivo data
In this section we analyze the distribution of cell areas obtained from confocal imaging data, ( Figure 3J and 6A) and compare it to the area distributions obtained through simulation ( Figure   3K and 6B). The data from the simulation was obtained by Monte Carlo simulations (we run the model 10 times), and recording the areas. The data in both cases has first been normalized to unit mean, and binned in a logarithmic scale (base 2).

Figure M7: Comparison of wild type in vivo data with in silico simulations
Histogram of cell areas for the experimental data, as well as from the simulation.
When we include extra stomatal divisions., the in silico cell size distribution with stomata is not significantly different from the wild type in vivo distribution. Figure M7 (blue line) outlines the histogram of cell areas (with unit mean) obtained from the imaging data. We note that the distribution is fairly broad, as compared to the distribution obtained from the simulation (red line), which seems to be underestimated for the 2C cell areas.
In Figure M8 we display the empirical cumulative distribution function, of both the experimental S29 as well as the simulation, which looks reasonably similar. To statistically compare whether the distributions are similar we ran the Kolmogorov-Smirnov (K-S) test, which gives a p-value

S30
This begs the question as to why is the experimental data is distributed more broadly than data obtained from the simulation? To understand the discrepancy of why the experimental distribution is higher at lower values of area, we ran the model and assumed stomatal divisions occur, as the last stage of sepal development. After every simulation of the sepal, we select 2C cells with probability p s =0.4416, and divide them into two stomatal lineage cells to generate approximately 29% stomatal cells as observed. Thus the area distribution of 2C cells gets spread out towards lower values.
In Figure M7, the distribution of the simulation, which takes into account the stomatal divisions is shown (black dashed line). The distribution is broader at low values of area, as compared to a simulation where stomatal divisions are not included. We re-ran the K-S test between the data and the simulation taking into account the stomatal divisions, with the p-value 0.0774 and KSstat=0.0242, which does not reject the hypothesis that the two distributions are similar. Hence the chief determinant in the variation in the histogram of areas, is due to the stomatal divisions which increases the left tail. This can also be seen in the comparison of the empirical cumulative distribution functions between the data and the simulation including the stomatal divisions, which is more similar ( Figure M8).
We also compared the cell size distribution of the model for pATML1::KRP1 with the in vivo data. One difference between the model and the data is that the data has an extended tail of large cells ( Figure 6A to the right on the graph). We tested whether altering the parameters of the model could produce this tail. We compared the distributions of cell areas for pATML::KRP1, obtained though simulation with the data for two cases: S31 (i) the pATML::KRP1 parameters are as obtained from the measured cell cycle data.
(ii) the cell cycle length for 2C cells is noisier as well as slightly increased from the measured value and the length of the endocycles for16C cells is noisier as well as slightly increased.
In Figure M9, the cell area distributions for the data (blue curve) and simulation including stomatal divisions (red curve) differ significantly at both lower as well as higher areas. We now repeat the simulation taking into account stomatal divisions but with two assumptions. (1) 2C cell cycles are noisier and also slightly larger, (2) 16C endocycles are also noisier and considerably larger. The black curve is the result of including these assumptions into the model.
Increasing the length of the 16C endocycle (black curve, right hand side green arrow) has the effect of increasing the tail of the distribution towards higher areas, and making it noisier, which has the obvious effect of spreading out the tail. This behavior better matches the in vivo data suggesting that overexpression of KRP1 also increases endocycle length. On the other hand making 2C cell cycles noisier, spreads out the left side of the distribution to include even smaller areas. Hence the final distribution shifts towards the left and also has a small tail; these features are shared with the observed distribution.

Difference between the IGM and the population model
Three important differences distinguish the IG model from the population model. First, the population model only addresses endoreduplication and does not explicitly describe cell size at all. As we show in Figure 3H-K, endoreduplication alone is insufficient to produce the cell size distribution. The cell cycle time is critical for generating the cell size pattern and this is only taken into account in the IG model. Second, the IG model addresses the spatial constraints on S33 growth posed by cell walls that prevent one cell from slipping relative to another. Third, the IG model describes the development of the whole sepal. If the population model is applied to an expanding template, it can only generate a region of the sepal and is insufficient to generate the whole sepal given the experimental data showing the starting number of cells in the sepal primordium.
All simulations were done in MATLAB (The MathWorks, Inc.)