Human pluripotent stem cell fate trajectories toward lung and hepatocyte progenitors

Summary In this study, we interrogate molecular mechanisms underlying the specification of lung progenitors from human pluripotent stem cells (hPSCs). We employ single-cell RNA-sequencing with high temporal precision, alongside an optimized differentiation protocol, to elucidate the transcriptional hierarchy of lung specification to chart the associated single-cell trajectories. Our findings indicate that Sonic hedgehog, TGF-β, and Notch activation are essential within an ISL1/NKX2-1 trajectory, leading to the emergence of lung progenitors during the foregut endoderm phase. Additionally, the induction of HHEX delineates an alternate trajectory at the early definitive endoderm stage, preceding the lung pathway and giving rise to a significant hepatoblast population. Intriguingly, neither KDR+ nor mesendoderm progenitors manifest as intermediate stages in the lung and hepatic lineage development. Our multistep model offers insights into lung organogenesis and provides a foundation for in-depth study of early human lung development and modeling using hPSCs.


INTRODUCTION
A primary goal of stem cell biology is to understand how the intricate regulation of cellular differentiation by cascades of gene regulatory networks and signaling pathways gives rise to the structure and function of organs.This knowledge is fundamental to human pluripotent stem cell (hPSC)-based approaches that model human development and congenital pathologies in vitro, and to devise regenerative therapies.The elucidation of molecular mechanisms governing foregut endoderm (FE) formation is of particular interest for biomedical research, as the FE gives rise to several organs, including the lung and the liver, that are relevant for disease modeling, drug screening assays, and regenerative applications.Various definitive endoderm (DE) progenitors, and the genes and pathways that regulate their specification, need to be better characterized in order to effectively control the differentiation of cells that represent endodermal tissues in vitro, such as epithelial cells of the airways and hepatocytes.
2][3][4] Single cell RNA sequencing (scRNA-seq) when applied to differentiation protocols of hPSCs can thus used to understand aspects of human development from in vitro studies.For instance, the molecular ''roadmap'' governing the development of functional human islet beta cells has been characterized by applying scRNA-seq time-series experiments. 5While embryological studies in mice have been instrumental in revealing the major stages and genes involved in FE development, insight into the early steps of human lung specification is largely pending.Recent studies have taken important steps in this direction by characterizing signaling pathways that regulate the differentiation of hPSC-derived NKX2-1+ progenitors into alveolar epithelial type 2 (AT2) cells. 6][10] Nevertheless, the intermediate states and mechanisms prompting the commitment of NKX2-1+ lung progenitors during endoderm differentiation have not been characterized in detail.In this regard, it is not well understood which mechanisms (B) Quantification of the number of eGFP+ cells by flow cytometry on day 15 of differentiation; conditions with and w/o combined FGF10+SHH treatment in defined basal media DMEM/F12 and IMDM (FE-BM1 and BM2, respectively).SD stands for SB431542 and Dorsomorphin, while SSD stands for SD plus SHH lie upstream of NKX2-1, and why mixtures of cells that express NKX2-1 and hepatic markers such as FGB are being produced by the current state of the art differentiation protocols. 10urrent protocols for lung progenitor differentiation from hPSCs utilize a stepwise application of signaling cues that are known to be important in lung development.Treatment of hPSCs with Activin/Nodal in conjunction with activation of Wnt/b-catenin signaling promotes the differentiation of DE, followed by transient inhibition of endogenous Activin/Nodal and BMP signaling by ''dual SMAD inhibition'', which promotes further commitment to the FE. 11In the last step Wnt/b-catenin is reactivated in conjunction with RA, BMP4, and FGF10 treatment, resulting in the formation of NKX2-1+ progenitors. 11,12][15][16] Differentiation protocols of hepatic lineages and lung progenitors are similar in the initial activation of the Activin/Nodal pathway to specify the DE.However, contrary to lung differentiation protocols, hepatic induction involves treatment with FGF2 or FGF2+BMP4 in the DE phase. 17In the second stage, some protocols utilize SMAD inhibition similarly to lung differentiation protocols, 11 while others utilize HGF. 18The disparities in the early induction of DE raise questions about the origin and identity of the hepatic cells that are neighboring the NKX2-1+ progenitors in lung differentiation protocols.Moreover, in this regard it is not resolved whether hepatic KDR+ progenitors 19 or mesendoderm precursors 20 are transient states in lung progenitor differentiation protocols.
The stalk and epithelial tips of human fetal lungs have recently been characterized by RNA sequencing, 21 and a cell atlas for the adult human lung has been created by scRNA-seq. 22,23How NKX2-1+ progenitors are being formed during development remains an open question.Using time-series scRNA-seq, we model the hierarchy of gene expression changes along a trajectory from hPSCs to NKX2-1+ lung progenitors.Our model illustrates how timed inputs of Activin A/Nodal, Wnt/b-catenin, Hedgehog, and TGF-b pathways promote lung progenitor differentiation.We have discovered a novel early role for Notch signaling in this process.Surprisingly, we also found that the emergence of hepatoblasts can take place directly from early DE, prior to the formation of FE, when Activin/Nodal and Wnt/b-catenin signaling are at their peak. 24We have characterized molecular details of this branching event that specifies lung progenitors versus hepatoblast identity.The transcriptional roadmap established here thereby provides an important resource for understanding the origin and abnormal development of the human respiratory system and hepatic lineages.

Parallel differentiaiton of human PSCs to lung progenitors and early liver cells
In the mouse embryo, the formation of the primitive streak (PS) marks the stage where pluripotent cells commit to becoming precursors of the germ layers including the DE.6][27][28] The TFs that subsequently promote the anteriorization of the primitive gut tube leading to the induction of NKX2-1, are less characterized both in the mouse model and the in vitro differentiation of hPSCs.][31] To monitor the appearance of lung progenitors we used a human induced PSC (hiPSC) line integrated with the eGFP gene downstream of the endogenous promoter of NKX2-1 (NKX2-1 eGFP/+ ). 32Our protocol assessment showed that SOX17, FOXA2, and cell surface markers that are associated with DE, namely, CXCR4, CKIT, EPCAM were markedly up-regulated at day 6 of the differentiation protocol (Figures S1A and  S1B).The formation of eGFP+ progenitors became apparent between days 13-15, and their number significantly increased by supplementation of SHH and FGF10, which also improved the overall cell survival (Figure S1C).We compared two types of basal media, namely, DMEM/ F12 and IMDM-based (BM1 and BM2 as described in the experimental procedures), showing 100% success rate with IMDM and an average efficiency of approximately 35% NKX2-1+ cells located in regions of eGFP expression (Figures 1B, 1C, S1C, and S1D).Since NKX2-1 is expressed in the fetal lung, as well as in the thyroid and forebrain progenitors, 7 we analyzed the expression of PAX8 and PAX6 which are (F) A heatmap displaying the expression of developmental markers and genes important for the function of the lung based on the bulk RNA-Seq analysis of the indicated conditions (p value <0.05, scale displays normalized log2 expression).(G) Gene Ontology (GO) term enrichment analysis of differentially expressed genes in eGFP+ and DCICS spheroid-organoids relative to undifferentiated NKX2-1 eGFP/+ cells (p value <0.05, GO term FDR <0.05).(H and I) A volcano plot showing differentially expressed genes comparing the eGFP+ and eGFP-sorted populations on day 15 (H), and the corresponding GO terms of the eGFP-population (I).(J) Immunostaining for AFP (liver marker), imaging of eGFP+ cells (lung progenitors), and DAPI staining on day 15 of differentiation (scale bars: 100 mm).DEdefinitive endoderm; FE -foregut endoderm; LP -lung progenitors; LOs -lung organoids.
representative markers of the latter tissues.The low expression levels of these markers on day 15 indicated that the NKX2-1 signal represented the formation of lung progenitors but not the other tissues (Figure S1E).
Next, we analyzed the developmental potential of putative eGFP+ lung progenitors using spheroid 3D culture assays.We embedded clusters of eGFP+ cells from day 15 of differentiation in Matrigel and supplemented the media with CHIR99021 (CHIR), FGF7, and FGF10 (Figure 1A), to promote the proliferation of lung progenitors in suspension culture. 10This approach led to the outgrowth of spherical structures, which tripled in size within 7 days and maintained the expression of eGFP (Figures 1D, S1F, and S1G).Subsequent treatment with dexamethasone, cAMP and IBMX (DCI), known to promote the maturation of the fetal lung, 33,34 induced the expression of genes and proteins associated with club cells and goblet cells in the proximal region of the lung, namely, SCGB1A1 and MUC5AC (Figures 1E and S1F-S1H).When Wnt/b-catenin pathway was activated by CHIR, the spheroids developed branches (DCIC), and further, when combined with inhibition of TGF-b by SB431542 (DCICS), the spheroids grew substantially larger to an average size of 1.6 mm by day 35.They also exhibited branches and markers of alveolar epithelial type II cells, namely, SFTPC and SFTPB, and diminished expression of SCGB1A1 and MUC5AC compared to DCI (Figures 1E and S1F-1H).
To characterize the developmental stage, the differentiation potential of NKX2-1+ progenitors, and the basis of the heterogeneity in eGFP expression, we sequenced the mRNAs of undifferentiated cells, sorted eGFP+ and eGFP-populations and DCICS organoids (Figures S1I and  S1J).Genes that have been implicated in the formation of respiratory epithelial cells in the lung, namely, FOXA2, FOXA1, FOXP1, and NKX2-1, 35,36 were highly expressed in the eGFP+ cells and DCICS organoids (Figure 1F).GO term analysis furthermore revealed enrichment of genes involved in embryonic respiratory lung morphogenesis and alveolar development in eGFP+ progenitors and DCICS organoids (Figure 1G).The organoids additionally exhibited markers of early and late branching morphogenesis and differentiation of the distal lung, including RUNX1, MUC1, SFTPC, SFTPB, CLDN18, and NAPSA (Figure 1F), and their expression pattern closely resembled that of published transcriptomes of the alveolar tips of human fetal lungs, but was less similar to the stalk of fetal lungs (Figure S1K).
Analysis of eGFP-cells revealed that the eGFP-NKX2-1-population expressed fetal liver genes, including apolipoproteins (e.g., APOA1 and APOB) fibrinogen (FGB), the plasma protein alpha fetoprotein (AFP), 17,18 and GO term analysis revealed the enrichment of fetal liver development, regeneration and metabolism categories (Figures 1H and 1I).To confirm the coexistence of lung progenitors and hepatocytes on day 15 of the differentiation protocol, cultures were immunostained by AFP and eGFP antibodies.This showed mutually exclusive expression of eGFP and AFP, and the presence of lung progenitors and hepatoblasts in separate clusters (Figure 1J).Taken together, we concluded that co-induction of lung and liver fates took place during the differentiation of lung progenitors from hPSCs, raising the question of the mechanisms that drive the mutually exclusive specification of these lineages from FE.

Time-resolved single cell RNA-seq of hPSC differentiation
To chart a comprehensive map of the transcriptional states underlying the differentiation of lung progenitors and hepatoblasts (HB) in parallel, we performed a 16-day time-series scRNA-seq using the Drop-Seq workflow. 37Single cell suspensions were processed daily, resulting in the analysis of a total of 10,667 cells that passed quality standards and were used for downstream analysis (Figures 2A and S2A; Tables S2 and  S3).First, we used the Uniform Manifold Approximation and Projection (UMAP) and Partition-based Graph Abstraction (PAGA) analysis 38 to project the gene expression data onto two dimensions and to perform an assessment of the connectivity of the obtained cell clusters (Figures 2B, 2C, and S2A-2C).This revealed three major domains in the high dimensional gene expression manifold corresponding to the three stages of the differentiation protocol (Figures 2B, 2C, and S2A).
To model the dynamics of gene expression along the entire trajectory we used pseudotime inference (Figure 2D), which was in a good agreement with the time points of sampling (Figure S2D).Unsupervised hierarchical clustering of the genes along the pseudotime ordering revealed stage specific markers, including TFs expressed in undifferentiated PSCs e.g., POU5F1 (OCT4) and NANOG; DE factors including e.g., SOX17, MIXL1, EOMES, CER1, CXCR4, and LEFTY1; TFs characteristic for the FE e.g., FOXA2, and FOXP1; 36 and genes important for the formation of lung progenitors e.g., NKX2-1 and IRX3 emerging in the latest stage (Figures 2E, S2B, S2E, and S2F; Table S5). 7,39The pseudotime was consistent with the consecutive expression of DE, FE, and lung progenitor markers, e.g., NODAL, CER1, FOXA2, IRX3, and NKX2-1 during the time course (Figure 2F).Importantly, genes that are characteristic of hepatoblasts and hepatocytes, including GATA6, AFP, FGB, and APOB were markedly up-regulated, some already during the second stage of the protocol (days 7-10) as demonstrated by APOB (Figures 2E and 2F).These were apparent in Louvain clusters without expression of lung markers, namely clusters 0 and 12; while LPs seemed to be represented by clusters 1 and 10 (Figures S2E and S2F; Table S4).Finally, we performed gene category enrichment analysis to detect candidate pathways underlying the differentiation of lung progenitors.It showed an association between TGF-b/Smad and Wnt/b-catenin with the formation of DE, the Hippo pathway with FE differentiation, and Notch signaling in conjunction with insulin-like growth factor signaling with the NKX2-1+ lung progenitor cells (Figure 2G).Taken together, these results indicated that the specification of hepatoblasts originates at an earlier stage than FE formation and lung progenitors without requiring stimulation by exogenous FGF2, BMP4, or HGF. 17,18rly transcriptional regulators of lung and hepatoblast specification We next sought to identify genes and mechanisms that promote the emergence of lung progenitors, and the formation of liver cells at the DE stage, by focusing on the timing of up-regulation of known TFs.In this regard, Isl1 has recently been characterized as a key regulator of Nkx2-1 in the early FE, 40 and Hhex has been associated with the activation of Hnf4 TFs, which are essential for liver development. 41Moreover, the TFs Gata6, Foxa1, and Foxa2 are well known for their crucial roles in the formation of the liver. 42,43Analysis of scRNA-seq data of the first 6 days of differentiation revealed the up-regulation of GATA6, FOXA2, and HHEX, but not of ISL1 or NKX2-1 (Figures 3A and 3B).Interestingly, the single cell data showed that only very few cells expressed T (Brachyury) in the first 6 days of differentiation (Figure 3B).Because the transient expression of Brachyury, followed by Foxa2, is known to occur in a subpopulation of the DE called mesendoderm, 20 the absence of T indicated that the origin of the liver and lung cells is not associated with a mesendoderm origin.Collectively, this suggests that the regulation promoting the differentiation of liver cells is being established during the early specification of DE rather than upon FE formation.
To investigate further the circuitry of TFs that promotes DE specification, we used hierarchical clustering of all the genes that exhibited transcriptional changes during the first 6 days of differentiation.We generated six major clusters numbered A1-A6, representing different gene expression kinetics, and analyzed the enrichment of respective GO pathways in these clusters (Figures 3C and 3D; Table S6).Expression of major DE genes, such as EOMES, LHX1, OTX2, CXCR4, LEFTY1, and SOX17 followed a very similar pattern with gradual upregulation peaking at day 6 (cluster A6).This cluster was enriched in pathways involved in endoderm differentiation and the regulation of SMADs and TGF-b signaling (Figure 3C; Table S6), including the loop of EOMES-Activin/Nodal signaling (Figure S3A).Several studies that have previously analyzed bulk DE cells have suggested that a subset of KDR+ (VEGF receptor 2) progenitors generates hepatic cells from differentiated human ESCs and that KDR signaling is important for their specification. 44Contrarily, we have noted that the up-regulation of FOXA2, GATA4, and GATA6 coincided with a downward trend of KDR (Figure 3B and cluster A3 in d), suggesting that KDR is either not involved in instructing the early liver phenotype alongside lung cells or is only required in early differentiation phases.
To next investigate whether lung progenitors emerged during the second stage of the protocol, and whether this process coincided with maturation of hepatoblasts, we used hierarchical clustering to generate six clusters of gene expression, numbered B1-B6, for days 6-10 of differentiation (Figures 3E-3H; Table S6).We noted that the genes linked to the induction of DE, namely EOMES, LHX1, GSC, OTX2, SOX17, and CXCR4, 45 and GO pathways associated with meso-endoderm development were sharply down-regulated upon dual-Smad inhibition in this stage (cluster B1, Figures 3G and 3H).An important exception was FOXA2, whose expression is essential for development of the foregut.FOXA2 belonged to cluster B5 together with FOXP1 and PITX2 (Figure 3H), which are crucial for lung morphogenesis and asymmetry in the mouse, but not for the initial specification of the lung in the mouse. 36,46In accordance, GO pathways showed enrichment of lung morphogenesis, embryonic organ development and epithelial cell differentiation in cluster B5 (Figure 3G), even though at this stage NKX2-1 had not been up-regulated yet (Figure 3F).Nevertheless, approximately halfway through this stage, ISL1 was up-regulated in cluster B3 alongside IRX3.Isl1 has recently been shown to regulate the development of lung lobes and trachea-esophagus tube separation by the activation of Nkx2-1, 40 and Irx3 is necessary for lung formation by promoting the proliferation of branched epithelium. 39In accordance, the GO patterns of cluster B3 included epithelial cell proliferation and tube closure, which collectively indicated that genes/hallmarks involved potentially in the initiation of the lung progenitor program had started to be regulated already within the FE stage (Figure 3G).ISL1 is also known to regulate SHH in the FE, 47 which suggested that basal differentiation of NKX2-1-eGFP+ cells was supported by endogenous production of SHH (Figure 1B).The connection to the activation of the SHH pathway at days 6-10 was apparent in clusters with an upward trend, including GLI4 in B5 and GLI3 in B6 (Figure 3H).Furthermore, comparison of our data with scRNA-seq of the mouse embryo at the time of gastrulation, 2 showed that Isl1 and Irx3 were specifically enriched in the FE and the pharyngeal endoderm around E8.0-E8.5 (Figures S3B-S3E).In agreement with the FE association, pancreatic genes such as PDX1, which is associated with the midgut and suppressed by SHH, 48 were not apparent in our data (Figure S3B).
Lastly, we analyzed hepatoblast and hepatocyte genes in days 6-10 following the activation of HHEX in days 0-6.At this stage we noted the upregulation of pioneering liver TFs including, HNF1A, HNF1B, and TBX3 which governs the expansion of the liver bud, 49 as well as first indications of functional liver genes, namely, APOB, APOA2, and Transthyretin (TTR) which are secreted by the liver (Figures 3F and 3H).Importantly, we noted that the expression of these liver genes was associated with Louvain clusters 0 and 12, which were to a large extent mutually exclusive from the aforementioned lung factors and related Louvain cluster 10 and 1 (Figures 2C and S2F; Table S4).Moreover, we analyzed the correspondence with scRNAseq data of human fetal liver tissues. 50We observed a significant association with gene signatures from human primary fetal hepatocytes, including AFP, APOA1, F2, VNT, AGT, and APOB in this time frame (Figure S5).
Taken together, these scRNA-seq data indicate that the lung and liver lineages are already set in their respective separate trajectories during the second stage of the protocol.

Trajectory analysis reconstructs lung and liver branching
We next investigated the mechanisms that drive the separation of lung progenitors from the liver fate.Trajectory analysis of differentiation days 11-15, combined with days 7-10 revealed a branching event in the high dimensional single cell data manifold (Figures 4A, 4B, and S4).The lung progenitors appeared to become separated at the intersection of the FE stage with the last stage, and the maturity of lung and liver fates increased progressively over time when compared to NKX2-1eGFP+ and NKX2-1eGFP-bulk populations (Figures 4C and 4D).Importantly, the liver branch, but not the lung, exhibited an association with human fetal hepatocytes (Figure S5).As noted previously, the liver branch was established at days 7-10, preceding the appearance of a clear lung identity (Figures 4A and 4B).
The pseudotime trajectories in the last stage of the differentiation protocol revealed key differences in the lung and liver differentiation branches (Figures 4E and 4F).As expected, the expression of lineage specific markers such as NKX2-1, IRX3, and HNF1B, and FGB was exclusively increased along the two trajectories (Table S7).We noted that GLI3, a key transcriptional repressor that is regulated by SHH signaling, was upregulated only in the lung branch.This corresponded to the exogenous and endogenous activation of the pathway by SHH and ISL1 (Figures 3G and 3H).Moreover, key components of the Wnt/b-catenin pathway, including DKK1, WNT5A, SP5, and notably the pathway's canonical marker AXIN2, exhibited considerably higher expression in the lung trajectory (Figure 4F).The specificity of these pathways to the lung trajectory is particularly interesting given that exogenous treatment by SHH and CHIR (Figure 1A) did not activate the pathways in the neighboring hepatoblasts (Figure 1J).Considered together with earlier studies that revealed central roles for SHH and Wnt/b-catenin in foregut-early lung development, 30,51,52 these results indicate that early hepatocytes might be refractory to these pathways.In accordance, the exclusive expression of SOX2 in the lung trajectory is another indication of the lineage-specific activity of Wnt/b-catenin because SOX2, canonical Wnt signaling, and FGFs often intersect in the regulation of self-renewal in development (Figure 4F). 53ext, we found evidence for the activation of the Notch pathway specifically in the lung branch.This was deducted through the lung branch-specific expression of HES1, a canonical TF of the Notch pathway, as well as the Notch pathway modulator DLK1 that is known to be involved in lung branching and morphogenesis. 54To experimentally determine if Notch is needed for lung specification in this system, we inhibited the Notch pathway using the g-secretase inhibitor DAPT from day 11 onwards (Figures 4G-4I).In accordance with our hypothesis, this led to a significant decrease in the number of NKX2-1eGFP+ progenitors, a reduction in NKX2-1 expression, and a reciprocal increase of AFP expression.Interestingly, a key ligand of the Notch pathway, namely DLL1, was specifically expressed in the liver branch (Figure 4F), suggesting that hepatoblasts might promote Notch signaling in lung progenitors by paracrine signaling.Moreover, we inhibited the TGF-b pathway using SB431542 after noting that the expression of TGFB2 and THBS1 was restricted to the lung branch (Figure 4F).We detected a similar effect to Notch pathway inhibition, namely, a decline in the number of NKX2-1-eGFP+ progenitors and expression of NKX2-1 in treated cells (Figures 4G-4I).In summary, this suggests an important role for the interplay between Notch and TGF-b in the specification of the foregut toward lung progenitors.
Finally, the pseudotime trajectory ordering revealed the surprising involvement of key developmental genes that to our knowledge, have not been implicated in the specification of hepatoblasts and lung progenitors.6][57] Moreover, the expression of PROX1 which is considered an early and specific marker for the developing liver and pancreas in the mouse FE, 58 was in fact specifically expressed in the lung trajectory (Figure 4F).In addition, the expression of FOXP1 and PITX2 spanned both the lung and liver trajectories, and FOXP2 was not detected (Figure S4E), contrary to the specific expression in lung progenitors described in the mouse embryo. 36,46This highlights differences in the regulation of key developmental genes between mouse and human, which we incorporated in a comprehensive roadmap model for human lung and hepatoblast differentiation from the pluripotency state (Figure 4J).

DISCUSSION
Our data indicate a direct conversion of early DE to hepatoblasts.Mechanistic mouse studies show that the activation of the Activin/ Nodal-EOMES loop initiates on the high levels of canonical TFs that control hepatogenesis in the mouse embryo.This includes Gata6 and Gata4, 43 Foxa2, and Foxa1, 61 which have been implicated in pioneering chromatin remodeling in FE precursors, 62 and were utilized in the induction of hepatocyte transdifferentiation. 63,64 Moreover, Gata4 and Gata6 have been shown to be functionally redundant in promoting the development of the liver, and together with Foxa2 they promote liver specification by inducing Hhex. 65Remarkably, in our data a number of hepatoblast TFs, such as HNF4A, TBX3, and liver factors that are secreted into the plasma, such as APOA2, FGB, TTR, APOB were induced shortly after HHEX.Also, the gene signature that was activated in this stage in the liver branch showed the highest correspondence with fetal hepatocytes when compared to human liver development (Figure S4).Collectively, this supports our ''fast track'' liver differentiation model from early DE (Figure 4J).Interestingly, the expression of HHEX itself ceased almost entirely in the third phase of the protocol, despite the continued expression of FOXA2 and GATA6.It is possible that a majority of cells had undergone specification already at the beginning of the third phase of the protocol, or that the disappearance of HHEX is connected to the downregulation of OTX2, which directly regulates the HHEX promoter FE cells, 66 or that a different mechanism is in play.
Of special interest for tissue engineering and disease modeling is the association of the FOXA2 -ISL1 -NKX2-1 trajectory with developmental pathways that regulate patterning and self-renewal in the lung (Figure 4J).Our work highlights (i) the up-regulation of SOX2 in the lung branch that coincided with the activation of Notch and Wnt/b-catenin signaling, (ii) the potential involvement of the Hippo pathway during the foregut stage of the differentiation protocol, (iii) the reduction of NKX2-1 progenitors upon TGF-b and Notch inhibition, together with (iv) Yapdependent TGF-b-mediated induction of proliferation by Sox2 in mouse airway progenitors, 67 creating a putative picture of renewal mechanisms that might inform the future derivation of human lung stem cells.Accordingly, Ikonomou et al. 68 demonstrated the upregulation of Hippo and TGF-b activity prior to and upon early lung specification, respectively.Moreover, the specific expression of SOX17 and CDX2 in the liver trajectory is novel and important.In the mouse CDX2 expression is restricted to the hindgut where it serves to regulate intestinal stem cell development. 55Therefore, the up-regulation of CDX2 in conjunction with SOX17, which serves as a patterning factor in the liver bud, 56 might indicate that factors play central roles in the regulation of human liver development.Comparing the trajectories of additional foregutlung TFs with their associated functions in mouse development additional differences.we noted the broadening of the profiles of FOXP2 and PITX2, which might imply that evolutionary changes have increased their involvement in the differentiation of the human foregut.Another example is the association of PROX1 with lung progenitors as opposed to hepatocytes in the mouse. 44This reveals putative mechanisms that are specific to the formation of human precursors and their into lung and liver progenitors.
In summary, the high temporal resolution of our single cell trajectory analysis enabled us to construct a hierarchical model of gene expression changes along two trajectories from hPSC lung and liver fate, respectively.In our analysis the estimation aligned with the highly resolved time series of sampling.Interpreting our results in light of in vivo studies provided a detailed understanding of mechanisms, their timing, and evolutionary changes that regulate the specification of lung progenitors in the foregut.These findings should inform applications of iPSCs tissue engineering efforts for regenerative medicine of the lung, possibly including the derivation of lung stem cells.Moreover, our work offers in parallel a high resolution ''roadmap'' for a potential direct conversion of hepatoblasts, contributing to the improvement of the required knowledge for manufacturing of hepatocyte grafts in liver disease.

Limitations of the study
In this study, we used a high temporal resolution of sampling to delineate gene expression trajectories in the differentiation of hPSC toward lung progenitors.The resulting pseudo-temporal model of a lung and liver trajectory requires further dissection using functional perturbations to distinguish between correlation and causation.We have addressed the functional role of Notch and TGF-beta signaling as well as SHH in the formation of lung progenitors.Based on our data, we cannot conclude the mechanism by which SHH improves the differentiation efficiency of lung progenitor cells.A promising direction for future research would be to explore the autocrine function of SHH signaling in early lung development. 60Furthermore, a previous study has also demostrated the beneficial role of SHH during FE generation from PSCs. 694][15][16] In our system, FGF10 appears to facilitate the differentiation outcome in the absence of SHH despite its dispensable role.Hence, it would be relevant to further investigate this complex web of cytokine interactions in order to delineate mechanisms stemming from the (epi)genetic between PSC lines, which might influence their differentiation efficiencies.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include following:

EXPERIMENTAL MODEL AND STUDY PARTICIPANT DETAILS Stem cell maintenance and passaging
Human iPSC line NKX2-1 eGFP/+ was kindly provided from Hannover Medical School, 32 maintained in feeder-free conditions, in StemMACS iPS-Brew XF (Miltenyi Biotech) and passaged with Accutase (Sigma-Aldrich) on tissue cell culture plates pre-coated with 1:100 dilution of Geltrex Basement Membrane Matrix in DMEM/F-12 (both from ThermoFisher Scientific).
To induce the lung progenitor (LP) stage, on the day 10, the medium was switched to FE-BM1 or BM2 containing 20 ng/ml recombinant human BMP4 (rhBMP4, R&D Systems), 50 nM retinoic acid (RA; Sigma-Aldrich), 3 mM CHIR99021 and 20 ng/ml rhFGF10 (R&D Systems).For the inhibition of Notch and TGF-b pathways, 10 mM SB431542 and 100 mM DAPT (TOCRIS Bioscience) were additionally used at the LP induction stage of differentiation respectively.Growth media and supplements are listed on Table S1.

Flow cytometry analysis and sorting of iPSC-derived lung progenitors
The cells were rinsed with DPBS and incubated in Gentle Cell Dissociation medium (STEMCELL Technologies)/0,05%Trypsin (ThermoFisher Scientific) (2vol:1vol) for 10 min, at 37 C.The detached cells were diluted in a FACS buffer containing 2% FBS/DPBS and centrifuged at 200 RCF for 3 minutes, at room temperature (RT).Cell pellets were washed with FACS buffer and centrifuged again.For analysis, the samples were incubated (if necessary) with the conjugated antibodies diluted in FACS buffer, for 25 min on ice, washed twice with FACS buffer and analyzed.Live cells were distinguished by staining with propidium iodide (PI) or Sytox blue (SB).For sorting, the dissociated cells were resuspended in FACS buffer, stained with PI and sorted into 2% FBS/DMEM.The measurements were obtained using a BD FACSAria III flow cytometer (BD Biosciences).Isotype controls were used for gating stained cells, whereas undifferentiated cells that do not express eGFP were used as negative control for gating NKX2-1 eGFP+ cells.

Generation of 3D cultured lung organoids
On day 15 of the 2D differentiation, the cells were rinsed with DPBS and the eGFP+ colonies were dissected with a needle, picked and transferred to 80% Matrigel (diluted in FE-BM2).Drops of 40mL were pipetted into the center of each well of a 24 well tissue culture plate and incubated for 30 min at 37 C.Then, FE-BM2 media supplemented with 10 ng/ml FGF10, 10 ng/ml FGF7 and 3 mM CHIR99021 was added to each well for 7 days. 10On day 22 the medium was switched to FE-BM2 media supplemented with 50 nM dexamethasone (Sigma-Aldrich), 0.1 mM 3-Isobutyl-1-methylxanthine (IBMX; Sigma-Aldrich), 0.1 mM 8-Bromoadenosine 30,50-cyclic monophosphate sodium salt (cAMP; Sigma-Aldrich) 8 and G 10 mM SB431542, 3 mM CHIR99021.For all the 3D Matrigel culture conditions the medium was changed every second day.Growth media and supplements are listed on Table S1.

Immunostaining
For imaging 2D cultures cells were cultivated on removable 8-well chamber glass slides (Ibidi), rinsed twice with DPBS and fixed with 4% PFA (Sigma-Aldrich)/DPBS for 20 min at RT. Cells were permeabilized with 0.2% Triton X-100 (ThermoFisher Scientific) for 5 min and blocked with 3% BSA for 30 min at RT.Then, cells were incubated with primary antibodies diluted in solution containing 0.1% Triton X-100/3% BSA overnight at 4 C, followed by three washes with DPBS, after which they were incubated with secondary antibody in 0.1% Triton X-100/3% BSA solution for 1 hour at RT and washed three times with DPBS.The removable wells were discarded and ProLong TM Gold Antifade Mountant with DAPI (Thermo Fisher Scientific) was used to mount the coverslip.
For imaging lung organoids, the samples were extracted from the Matrigel drops, upon incubation with 2mg/ml Dispase IV (Sigma-Aldrich), at 37 C for 1h.Then, the organoids were fixed with 4% paraformaldehyde/DPBS for 2 hours at 4 C and incubated in 30% from downstream analysis.Guided by histograms of quality metrics, further filtering was carried out.Cells with a high number of UMI counts may represent doublets, thus only cells with less than 5000 UMIs were used in downstream analysis.Quality metrics for each sample can be inspected in Table S2.
After the initial filtering of cells, we followed the common preprocessing procedure.The expression matrices were normalized and scaled using Seurat's NormalizeData() and ScaleData() functions.In order to mitigate the effects of unwanted sources of cell-to-cell variation, cellcycle effects, percentage of mitochondrial reads and number of counts were regressed out CellCycleScoring() and ScaleData().The 600 top variable genes across the data set were selected via FindVariableGenes() using the default parameters.From this list the known cell cycle genes were excluded.These variable genes were then the basis of the principal component analysis.The first 50 Components were the input for Seurat's function FindClusters() at a resolution of 2, resulting in 13 clusters.To visualize the clustering result of the high dimensional singlecell data, the UMAP was generated using again 50 components as input for the Seurat function RunUMAP() with a number of neighbors set to 20.To clean up the data set further, a small number of cells disagreeing in the louvain cluster and UMAP embedding were removed.The final meta data and marker gene expression per cluster is reported in Tables S3 and S4, respectively.

Selection of genes with significant association over time
To account for the harsh perturbation in gene expression induced by the medium change, the consequent analyses were split into three stages corresponding to the definite endoderm DE (day 0 to day 6), foregut endoderm FE (day 7 to day 10) and lung progenitors LP (day 11 to day 15).The stage wise temporal analyses were executed using the python package Scanpy. 76he principal components and the neighborhood graph based on the first 10 components was recalculated for each stage separately using Scanpy's pp.pca() and pp.neighbors().After generating the diffusion maps via the tl.diffmap() function setting the root cells after manual inspection, the diffusion pseudotime was calculated with tl.dpt().The diffusion pseudotime for the whole data set was then set by adding the stage-wise pseudotimes and scaled to a value between 0 and 1.To assess global connectivity, the louvain clusters were used as input for Partition-based graph abstraction (PAGA) tl.paga().The weighted edges reflect a statistical measure of connectivity between the groups.Edges with weight < 0.05 are omitted.
For the identification of genes showing significantly changing expression patterns over time, the following approach was applied by the means of the R packages limma 74 and splines.As we were particularly interested in the expression pattern towards early lung progenitors, we selected only those cells that were positive for either eGFP or NKX2-1 from days 11 to 15 for the calculation and display.Furthermore, as we did not want to incorporate differences in ribosomal derived genes, such genes were excluded from this analysis.We employed a regression model in which splines are used to model non-linear effects of continuous variables to fit the time-course data.For each gene we fit a natural cubic spline with 4 knots while using the time point of extraction as an explanatory variable.UMI counts of each cell were included as a covariate in the model to account for differences in library size.
We rank the genes based on their adjusted p-values.As it is non-trivial to interpret p-values across time, we use the adjusted p-value as a ranking for the genes and visualize the top 200 genes in Figure 2E.The results are reported in Table S5.

Stage-wise hierarchical clustering of genes
The clustering analysis was done for the endoderm stage (day 0 to day 6) and the combined endoderm and FE stage (day 0 to day 10).For this, genes for both versions with interesting expression patterns were again selected as described previously using the regression model based on spline fits.The scaled expression levels of genes with an adjusted p-value of less than 0.005 were the input for hierarchical clustering using the hdist() from the R package stats.The dendrogram tree was cut into 10 clusters, which were manually re-annotated into 6 final groups for each stage based on their average expression per day.The cluster sizes can be different, thus we chose to display the average expression of the 100 genes with the lowest adjusted p-value per cluster in Figures 3D and 3H.For the second clustering analyses, we display the average expression and kinetic profiles for day 6 to day 10.The resulting clusters are reported in Table S6.

Potential branching into lung and non-lung progenitor cells
To identify genes associated with the potential differentiation branches, we used the signatures from the differential gene expression analysis between eGFP labeled cells versus eGFP negative cells in the initial bulk experiment.As variable genes those genes were selected, which showed a log fold change of above 1 and below -1.To reduce the effect of the media change after day 10, we removed genes that were differentially expressed between the FE and LP stage.In order to guide the dimensionality reduction these 1294 genes were used as input for the PCA.Following Scanpy's workflow, the knn graph was constructed via pp.neighbors(n_neighbors = 10, n_pcs = 50), and the UMAPs and diffusion maps were generated via tl.umap() and tl.diffmap(), respectively.In the next step we filtered for the two branches of interest, which started in the FE stage and had their endpoint in the LP stage.The cells were scored to their similarity to the bulk signature via tl.score(), to evaluate if the two potential cell fate trajectories correspond to the branches that we found.
To identify genes showing temporally altered expression patterns along these two potential branches, which are also significantly different across the branches, we employed the python package diffxpy (https://github.com/theislab/diffxpy).The starting population for the two subbranches (lung and hepatocyte progenitors) was chosen as all cells from FE stage for both trajectories.The cells from the DE stage were divided based on their louvain clusters after visual inspection of the diffusion map and known lineage marker gene expression.The diffusion pseudotime was calculated on these two trajectories separately.Diffxpy's test.continuous_1d()function runs the test with a spline basis to allow smooth trends.This function was employed on the combination of these two trajectories, using pseudotime as a continuous covariate

Figure 1 .
Figure 1.Differentiation of lung progenitors from hPSCs (A) A schematic illustration of the differentiation protocol used in this study to produce eGFP+ lung progenitor cells with spheroids and organoids the on right.(B) Quantification of the number of eGFP+ cells by flow cytometry on day 15 of differentiation; conditions with and w/o combined FGF10+SHH treatment in defined basal media DMEM/F12 and IMDM (FE-BM1 and BM2, respectively).SD stands for SB431542 and Dorsomorphin, while SSD stands for SD plus SHH

Figure 1 .
Figure 1.Continued for FE.Accordingly for LP stage induction, BRC stands for BMP4, RA and CHIR99021 while BRCF for BRC plus FGF10.Bars represent the mean +SD, n = 3 biological replicates, ***p < 0.001 conducted using an unpaired one-tailed t-test.(C and D) Fluorescence microscopy images showing day 15 NKX2-1eGFP+ cell sectors (C), and a representative day 22 spheroid produced by embedding a colony in Matrigel, along with the further growth of spheroids (Scale bar: 200 mm) observed on day 35 with treatments as indicated (D).(E) Immunofluorescence staining of day 35 DCI and DCICS spheroid-organoids for SFTPC, SFTPB, SCGB1A1, and MUC5AC (scale bar: 20mm).(F)A heatmap displaying the expression of developmental markers and genes important for the function of the lung based on the bulk RNA-Seq analysis of the indicated conditions (p value <0.05, scale displays normalized log2 expression).(G) Gene Ontology (GO) term enrichment analysis of differentially expressed genes in eGFP+ and DCICS spheroid-organoids relative to undifferentiated NKX2-1 eGFP/+ cells (p value <0.05, GO term FDR <0.05).(H and I) A volcano plot showing differentially expressed genes comparing the eGFP+ and eGFP-sorted populations on day 15 (H), and the corresponding GO terms of the eGFP-population (I).(J) Immunostaining for AFP (liver marker), imaging of eGFP+ cells (lung progenitors), and DAPI staining on day 15 of differentiation (scale bars: 100 mm).DEdefinitive endoderm; FE -foregut endoderm; LP -lung progenitors; LOs -lung organoids.

Figure 2 .Figure 3 .Figure 4 .
Figure 2. Time-resolved analysis of hPSC differentiation using scRNA-seq shows distinct hierarchy of gene expression changes (A) A schematic illustration of daily sampling for scRNA-seq during 15 days of the lung progenitor differentiation.(B) The time of sampling is color coded on the UMAP projection of scRNA-seq transcriptomes (HB: hepatoblasts).(C) The connectivity of distinct Louvain clusters as determined by graph abstraction (PAGA) is shown on the UMAP projection.(D) The UMAP is colored by the stage-wise inferred pseudotime; darker regions (toward blue) indicate starting points and brighter colored regions (toward yellow) indicate ending points.(E) A heatmap illustrates peaks of gene expression ordered by the pseudotime of differentiation.Selected developmental markers of the lung and the liver are displayed.(F) Violin plots of stage specific marker genes are shown for the real time points of sampling in days.(G) Circular bar chart showing enriched gene categories taken from UniPort Keywords, GO terms and KEGG pathways corresponding to the pluripotent cells, DE, FE and NKX2-1+ progenitors (FDR <5%).

Figure 4 .
Figure 4. Continued (E) A heatmap showing unsupervised clustering of top differentially expressed genes along the lung and liver pseudotime trajectory, respectively.(F) Line plots showing the expression dynamics of the indicated genes in the respective branches in accordance with the binned pseudotime ordering (vertical lines represent confidence intervals of 95%).(G and H) Representative flow cytometry and the quantification of NKX2-1-eGFP+ lung progenitors on day 15 of differentiation, with or without treatment by DAPT or SB431542 as indicated in days 11-15 (bars represent mean +SD, n = 4 biological replicates, **p < 0.01 by unpaired, one-tailed t-test).(I)The respective fold change of NKX2-1, AFP, and FOXA2 on day 15 relative to the parental undifferentiated cells (quantified by RT-qPCR 2(-DDCT)  , bars represent mean +SD, n = 3 biological replicates, unpaired, one-tailed t-test).(J) A graphical summary of genes, their phases of expression, and mechanisms underlying the differentiation of lung progenitors and hepatoblasts from pluripotency state.

TABLE
d RESOURCE AVAILABILITY B Lead contact B Materials availability B Data and code availability d EXPERIMENTAL MODEL AND STUDY PARTICIPANT DETAILS B Stem cell maintenance and passaging