Mouse embryonic stem cells can differentiate via multiple paths to the same state

In embryonic development, cells differentiate through stereotypical sequences of intermediate states to generate particular mature fates. By contrast, driving differentiation by ectopically expressing terminal transcription factors (direct programming) can generate similar fates by alternative routes. How differentiation in direct programming relates to embryonic differentiation is unclear. We applied single-cell RNA sequencing to compare two motor neuron differentiation protocols: a standard protocol approximating the embryonic lineage, and a direct programming method. Both initially undergo similar early neural commitment. Later, the direct programming path diverges into a novel transitional state rather than following the expected embryonic spinal intermediates. The novel state in direct programming has specific and uncharacteristic gene expression. It forms a loop in gene expression space that converges separately onto the same final motor neuron state as the standard path. Despite their different developmental histories, motor neurons from both protocols structurally, functionally, and transcriptionally resemble motor neurons isolated from embryos.


Sample-size estimation
• You should state whether an appropriate sample size was computed when the study was being designed
• You should state the statistical method of sample size computation and any required assumptions
• If no explicit power analysis was used, you should describe how you decided what sample (replicate) size (number) to use
Please outline where this information can be found within the submission (e.g., sections or figure legends), or explain why this information doesn't apply to your submission:

Replicates
• You should report how often each experiment was performed
• You should include a definition of biological versus technical replication
• The data obtained should be provided and sufficient information should be provided to indicate the number of independent biological and/or technical replicates
• If you encountered any outliers, you should describe how these were handled
• Criteria for exclusion/inclusion of data should be clearly stated
• High-throughput sequence data should be uploaded before submission, with a private link for reviewers provided (these are available from both GEO and ArrayExpress)
Please outline where this information can be found within the submission (e.g., sections or figure legends), or explain why this information doesn't apply to your submission:

Statistical reporting
• Statistical analysis methods should be described and justified
• Raw data should be presented in figures whenever informative to do so (typically when N per group is less than 10)
• For each experiment, you should identify the statistical tests used, exact values of N, definitions of center, methods of multiple test correction, and dispersion and precision measures (e.g., mean, median, SD, SEM, confidence intervals), and, for the major substantive results, a measure of effect size (e.g., Pearson's r, Cohen's d)
• Report exact p-values wherever possible alongside the summary statistics and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.
Please outline where this information can be found within the submission (e.g., sections or figure legends), or explain why this information doesn't apply to your submission: (For large datasets, or papers with a very large number of statistical tests, you may upload a single table file with tests, Ns, etc., with reference to sections in the manuscript.)

Group allocation
• Indicate how samples were allocated into experimental groups (in the case of clinical studies, please specify allocation to treatment method); if randomization was used, please also state if restricted randomization was applied
• Indicate if masking was used during group allocation, data collection and/or data analysis
Please outline where this information can be found within the submission (e.g., sections or figure legends), or explain why this information doesn't apply to your submission:

Additional data files ("source data")
• We encourage you to upload relevant additional data files, such as numerical data that are represented as a graph in a figure, or as a summary table

TRex was run 5 times for Videos 7-16 with visual identification, and once for each speed measurement using all videos (with and without posture for each of the different tracking modes 'tree-based', 'approximate' and 'hungarian'). Conversion was performed 3 times for all videos to measure conversion speed when tracking at the same time, and also 3 times with live-tracking disabled. idtracker.ai was run 3 times for Videos 7-16, except Videos 9 and 10, which we were unable to track using idtracker.ai despite considerable effort (see main text for details). We did not exclude any results. All trials were run consecutively on the same computer, using the same script (with the exception of Fig. 5, which is unrelated and was produced separately).
Necessary information can be found in Figure

RAW videos were selected to provide a wide variety of group sizes, organisms, and camera models. Videos used for similarity measurements (between idtracker.ai and TRex) were selected based on whether we could configure idtracker.ai to track them successfully. TRex' visual identification was used on all videos with <= 100 individuals.