Highly cooperative chimeric super-SOX induces naive pluripotency across species

Our understanding of pluripotency remains limited: iPSC generation has only been established for a few model species, pluripotent stem cell lines exhibit inconsistent developmental potential, and germline transmission has only been demonstrated for mice and rats. By swapping structural elements between Sox2 and Sox17, we built a chimeric super-SOX factor, Sox2-17, that enhanced iPSC generation in five tested species: mouse, human, cynomolgus monkey, cow, and pig. A swap of alanine to valine at the interface between Sox2 and Oct4 delivered a gain of function by stabilizing Sox2/Oct4 dimerization on DNA, enabling generation of high-quality OSKM iPSCs capable of supporting the development of healthy all-iPSC mice. Sox2/Oct4 dimerization emerged as the core driver of naive pluripotency, with its levels diminished upon priming. Transient overexpression of the SK cocktail (Sox+Klf4) restored the dimerization and boosted the developmental potential of pluripotent stem cells across species, providing a universal method for naive reset in mammals.


In brief
Certain structural elements of Sox17 can enhance Sox2's ability to generate iPSCs by stabilizing Sox2/Oct4 dimerization on regulatory DNA elements that control pluripotency. This study highlights an engineered super-reprogramming factor, Sox2-17, and reveals the key mechanism driving complete developmental reset.

INTRODUCTION
The discovery of induced pluripotent stem cells (iPSCs) by Takahashi and Yamanaka 1 has made enormous contributions to basic research, allowed new strategies for drug discovery, and provided a source for cell replacement therapy. 2 Pluripotent stem cells (PSCs) are unique in their ability to give rise to all tissues of the animal body; as such, they are the most developmentally potent cells we have in culture. The induction of pluripotency in somatic cells requires a complete epigenetic reset, which was once thought to be impossible. 3 Oct4, Sox2, Klf4, and cMyc (OSKM), all components of the Yamanaka cocktail, evolved to not only induce pluripotency in the inner cell mass (ICM) of the embryo 4,5 but also allow or even drive subsequent differentiation. Oct4, Sox2, and Klf4 (OSK) are pioneer transcription factors (TFs) capable of engaging silent chromatin; iPSC technology harnesses their pioneering ability to rejuvenate somatic cells in vitro. 6,7 Oct4 stands out as the master regulator of the pluripotency network. Oct4 knockout in embryonic stem cells (ESCs) leads to a collapse of pluripotency. 8,9 Interestingly, downregulation, but not complete elimination, of Oct4 expression in ESCs leads to the opposite outcome, stabilization of the pluripotency network, suggesting an additional role of Oct4 in differentiation. 10 Oct4 has been considered the only factor that cannot be replaced by other members of its family in iPSC generation. 11 However, it is endogenous Sox2 activation that signifies the completion of pluripotency induction. 12 Moreover, exogenous Oct4 causes a loss of developmental potential for OSKM versus SKM iPSCs, 13 suggesting that fine-tuning Oct4's functions might help to advance iPSC technology.
During mouse development, the future cell fate is biased already at the 4-cell stage, where high Sox2 expression and long-lived Sox2/Oct4 co-binding drive the emergence of the ICM. 14,15 Mice and humans have different degrees of Oct4 dependence when establishing pluripotency during early development: Oct4 knockout mouse blastocysts still develop a Nanog+ ICM, whereas human OCT4-null blastocysts fail to do so. 16,17 Correspondingly, SKM induction is sufficient to induce pluripotency in mouse somatic cells, 13,18 but not in humans, 19 emphasizing the need to develop alternative strategies to improve the fidelity of non-murine reprogramming.
Oct4 cooperates with Sox2 to co-regulate most of its targets in pluripotent cells. 20 Sox2/Oct4 cooperativity is mediated by protein-protein interaction between their DNA-binding domains and by DNA allostery. 21 At the onset of reprogramming, when native sites are inaccessible, Oct4 and Sox2 often bind independently 7; however, the sites engaged by both will most likely be opened. 22,23 Sox2/Oct4 heterodimerization, particularly on the canonical HoxB1-like SoxOct motifs, was shown to be essential for the induction and maintenance of pluripotency. 24][27][28][29][30] Jauch et al. discovered that a single residue swap between Sox17 and Sox2, glutamate to lysine at HMG box position 57 (Sox17 E57K), shifts its binding preference to the canonical SoxOct motif, converting Sox17 into a pluripotency inducer. 30 Furthermore, the larger and more potent Sox17 C-terminus transactivator can enhance Sox2 function. 25,31,32 Here, we found that replacing Sox2 with Sox17 E57K in the reprogramming cocktail can rescue disabling Oct4 mutants and allows iPSC generation with somatic POU factors. We generated a library of chimeric Sox2-Sox17 TFs to find the structural elements of Sox17 responsible for this striking phenotype. The library screen allowed us to build an enhanced chimeric reprogramming factor that does not occur in nature. Our insights into the structure/function paradigm of Sox2 and Oct4 have major implications for understanding early development.

RESULTS
Defining the structural elements of Sox17 that enable iPSC generation
Oct4 (Pou5f1) is the only TF of the POU family that can induce pluripotency in mice and humans, 1,33,34 unlike other family members such as Oct1, Oct2, Oct6, and Brn4. 11,35,36 POU TFs exhibit different preferences for hetero- versus homo-dimerization. 22,35,37 In our search to find what makes Oct4 unique among POU factors, we studied its reprogramming ability in comparison with Brn4. 36 We discovered that Sox17 E57K, 30 but not Sox2, can efficiently generate iPSCs in combination with Brn4 (Figure 1A). POU factors possess a DNA-binding POU domain, flanked by unstructured N- and C-terminal transactivator domains (NTD and CTD). The POU domain is bipartite, consisting of a POU-specific (POU S) and POU-homeodomain (POU HD) joined by a flexible non-conserved linker. The Oct4 linker, but not the Oct1 linker, contains an alpha-helix at its N terminus. 38,39 Replacement of the Oct4-linker with those from other POU factors or the L80A mutation in the linker helix is detrimental for induction and maintenance of pluripotency. 38,40,41 Surprisingly, Sox17 E57K could also rescue the reprogramming ability of Oct4 L80A (Figures S1A and S1B).
Sox2 AV enhances Sox2/POU dimerization on canonical SoxOct motifs
Residue A61 of Sox2-HMG faces the Oct4-POU S when co-bound to a consensus SoxOct motif (Figure 2A). The extra methyl groups on valine make it more hydrophobic than alanine. Molecular dynamics simulations (MDSs) of the Sox2/Oct4 dimer on the Pou5f1 distal enhancer DNA element (Oct4DE) showed that A61V increases the average number of interactions between residue 61 and the POU S (Figure 2B). MDS of Oct4/Sox2 and Oct6/Sox2 versus the respective Sox2 AV dimers on HoxB1 enhancer SoxOct DNA showed a similar increase in interactions (Figure S2A). In both sets of MDS, POU residue I21, conserved in Oct4 and Oct6, engaged V61 the most (Figures 2C and S2B).
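The "average number of interactions" between residue 61 and the POU S can be illustrated with a minimal sketch. It assumes a simple distance-cutoff definition of a contact; the 4.5 angstrom cutoff, the function names, and the toy coordinates are illustrative assumptions and are not taken from the simulations reported here.

```python
import math

def count_contacts(res_atoms, partner_atoms, cutoff=4.5):
    """Count atom pairs closer than `cutoff` (angstroms) in a single frame.

    The cutoff value is an assumption made for this sketch.
    """
    return sum(
        1
        for a in res_atoms
        for b in partner_atoms
        if math.dist(a, b) < cutoff
    )

def mean_contacts(frames, cutoff=4.5):
    """Average the per-frame contact count over a trajectory.

    `frames` is a list of (res_atoms, partner_atoms) tuples, each holding
    (x, y, z) coordinates for one simulation frame.
    """
    counts = [count_contacts(res, partner, cutoff) for res, partner in frames]
    return sum(counts) / len(counts)

# Toy coordinates invented for illustration only (not simulation output).
# The "valine" residue carries one extra side-chain atom reaching toward the partner.
ala_frames = [
    ([(0.0, 0.0, 0.0)], [(4.0, 0.0, 0.0), (7.0, 0.0, 0.0)]),
    ([(0.0, 0.0, 0.0)], [(5.0, 0.0, 0.0), (7.0, 0.0, 0.0)]),
]
val_frames = [
    ([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)], [(4.0, 0.0, 0.0), (7.0, 0.0, 0.0)]),
    ([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)], [(5.0, 0.0, 0.0), (7.0, 0.0, 0.0)]),
]

ala_mean = mean_contacts(ala_frames)
val_mean = mean_contacts(val_frames)
print(ala_mean, val_mean)
```

In this toy case the bulkier side chain yields more contacts per frame, which is the kind of comparison summarized in Figure 2B; a real analysis would read coordinates from the trajectory files instead.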
Our MDS revealed a Sox2/Oct4 dimer configuration where HMG residues R50 and K57 form salt bridges with E82 and Q81 of the Oct4 linker (Figure 2A), similar to our previous report. 27 This arrangement involves both POU S and linker, hence the SL configuration, as opposed to the S configuration that involves only the POU S (Figure 2D). The SL configuration is Oct4 specific, as it was never observed for Oct6, which lacks negative charges in its linker (Figure 2D). The SL configuration dominated our Sox2/Oct4 simulations on Oct4DE, which were run with an AlphaFold-predicted Oct4 structure. 42 However, salt bridges between Sox2 and the Oct4-linker (residues E82 and E78) also occurred in simulations on HoxB1 and Nanog regulatory DNAs, which were run with an Oct4 structure derived by crystallography. 38 [...] base-pair gap, 43 revealed a Distant S (DS) configuration that involves Sox2's R75, 24 T78, and T80, but not A61 (Figure 2E). We overexpressed FLAG-tagged Oct1, Oct2, Oct4, Oct6, Brn2, and Brn4 in HEK293 cells, adjusted for comparable expression (Figure S3A), and used the lysates for electromobility shift assays (EMSAs) on the Nanog promoter locus containing a canonical SoxOct motif. 44 Monomer binding was comparable among POU factors with the exception of Oct1, whereas Oct4 showed the strongest heterodimerization with Sox2. Both Sox2 AV and Sox2-17 displayed stronger heterodimer bands with all tested POU TFs (Figure S3B).
We replaced all 17 residues of the Oct4-linker domain with poly-glycine linkers of different lengths (GL3-30) (Figure 3A). Such flexible linkers were detrimental for reprogramming with Sox2, but GL15-30 was rescued by Sox2 AV (Figure 3A). We truncated Oct4 transactivators, both of which are crucial for reprogramming. 46 Neither Oct4ΔNTD nor Oct4ΔCTD could generate iPSCs when combined with Sox2 (Figures 3B, S3E, and S3F); Sox2 AV and Sox2c17 could rescue Oct4ΔCTD, whereas Sox2 AV c17, Sox17 EK, and Sox2-17 rescued both Oct4ΔCTD and Oct4ΔNTD (Figures 3B, S3E, and S3F). However, none of the chimeric Soxes rescued the deletion of the Oct4-POU S, known to directly contact the Sox-HMG (Figure 3C). Sox2 AV gave rise to a few iPSC colonies with Oct4ΔPOU HD, verified by PCR genotyping and contribution to chimeric mice, including the germ line (Figures 3C and S3H-S3J). We conclude that Sox2 AV could rescue the deletions of any Oct4 domain except for the POU S, underlining the key role of Sox2/Oct4 dimerization in the induction of pluripotency.
We overexpressed Oct4 and Sox2 mutants in HEK293 cells (Figure 3D) and performed whole-cell lysate EMSAs. Monomer binding was similar between Sox2 and Sox2 AV; however, A61V increased the dimerization with Oct4 on Nanog 44 and HoxB1 49 DNAs and partially rescued POU HD deletion (Figure 3E), in concordance with our reprogramming results (Figures 3C and S3G-S3J). We performed off-rate EMSAs by adding unlabeled DNA to the pre-formed Sox/Oct/DNA complex and loading samples over a time course. Sox monomer half-lives were comparable (Figure S3K), whereas both Sox2 AV and Sox2-17 enhanced the heterodimer stability on Oct4DE 50 and Nanog elements, yet showed similar stability on the Fgf4 motif 51 (Figure 3F), in line with our structural data (Figure 2E). A portion of Sox2/Oct4/Oct4DE dissociated immediately, although the remaining complex was long lived (Figure 3F), suggesting the presence of at least two Sox/Oct/DNA populations as in our MDS (Figures 2 and S2). We verified the whole-cell lysate results using purified proteins on Nanog and Utf1 52 DNAs (Figures 3G, S3L, and S3M). Sox2 AV also increased the stability of heterodimers with Oct4-linker mutants and Brn4 (Figure S3N). We conclude that the unique helical linker structure of Oct4 is functionally dispensable in the context of highly cooperative Soxes. This highlights the function of the Oct4-linker in dimerization with Sox2, likely through the SL configuration (Figures 2A, S2C, and S2D), explaining the Oct4-linker's key role in pluripotency. 38,40,41
Oct4's pioneering function in development requires the ability to bind nucleosomal DNA, 7,53 posing a question about the role of POU subdomains in the process.6][57] Thus, the reduced dependence on the POU HD in the presence of A61V could theoretically enhance heterodimer engagement of closed chromatin. We assembled reconstituted nucleosomes using the Widom 601 sequence with a SoxOct motif at the superhelical location (SHL) +6, previously used to resolve the Sox2/Oct4/nucleosome. 56 Our EMSAs showed that A61V dramatically enhanced the stability of the Sox2/Oct4/nucleosome complex (Figure 4A).
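The half-lives from the off-rate EMSAs described above can be estimated with a log-linear fit. The sketch below assumes first-order dissociation of the long-lived complex, so that the remaining bound fraction follows A·exp(−k·t); an amplitude A below 1 then corresponds to the portion of complex that dissociated immediately. The model choice, function names, and synthetic data points are assumptions for illustration, not the quantification procedure used in this study.

```python
import math

def fit_decay(times, fractions):
    """Least-squares fit of ln(fraction) = ln(A) - k * t.

    Returns (A, k): the amplitude of the long-lived population and its
    first-order off-rate constant.
    """
    n = len(times)
    logs = [math.log(f) for f in fractions]
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope

def half_life(k):
    """Half-life of a first-order decay with rate constant k."""
    return math.log(2) / k

# Synthetic time course (minutes, bound fraction), invented for illustration:
# 40% of the complex is lost immediately, the rest decays with k = 0.02 per minute.
times = [5.0, 10.0, 20.0, 40.0]
fractions = [0.6 * math.exp(-0.02 * t) for t in times]

amplitude, rate = fit_decay(times, fractions)
print(amplitude, rate, half_life(rate))
```

Time points taken after the fast phase has finished are fitted here; resolving both populations explicitly would need a two-exponential model and denser early sampling.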
We performed chromatin immunoprecipitation with sequencing (ChIP-seq) for MEF samples 48 h after doxycycline (Dox) induction of KS (tetO-Klf4-IRES-Sox2/Sox2 AV) or OKS (tetO-Oct4/Oct6+tetO-Klf4-IRES-Sox2/Sox2 AV). HOMER motif analysis 59 showed that all OKS samples were significantly enriched in SoxOct motifs (Figures S4A and S4B). Sox2 ChIP showed no significant difference for Sox2 and Sox2 AV in KS samples and a relatively small difference in OKS samples (Figures 4B and 4C), suggesting that A61V does not change the Sox2 binding profile. However, Oct4 binding was significantly enhanced in the presence of Sox2 AV (Figures 4B, 4C, and S4C). ChIP for both Oct4 and Oct6 showed an increased proportion of SoxOct-containing peaks in Sox2 AV compared with Sox2 samples (Figure 4D), suggesting a genome-wide redistribution of POU binding. The enhanced Sox2 AV/Oct4 dimer binding is demonstrated by the increased occupancy of Oct4 and Sox2 AV at key naive pluripotency loci (Klf2 and Oct4DE; Figure 4E). In line with our modeling and EMSA results (Figures 2E and 3F), the binding at the Fgf4 locus remained unaffected (Figure 4E). Gene ontology analysis using GREAT 60 showed that differentially bound peaks in OKS AV samples were enriched in terms associated with early embryo development, the Wnt pathway, and negative regulation of cell proliferation, whereas OKS samples were enriched in terms related to activation of cell division through Hippo and MAPK pathways (Figure S4D).
In ESCs, Oct4 and Sox2 regulate most of their target genes cooperatively by binding SoxOct motifs. 61 At the beginning of the reprogramming process, the pluripotency genes are inaccessible, and the forcefully expressed Oct4 and Sox2 bind more independently, engaging thousands of non-native genomic loci. 7,22,23,48,62 Enhancing Sox2/Oct4 dimerization could potentially improve the reprogramming process, as cooperativity between TFs increases their specificity. 63 Indeed, already on day 2 of OKS induction, Sox2 AV engaged 511 ESC-specific super-enhancers, 64 compared with 378 for Sox2 (Figure S4E). We performed TOBIAS footprinting analysis 65 using publicly available assay for transposase-accessible chromatin using sequencing (ATAC-seq) datasets for ESCs versus MEFs. 58 Sox2/Oct4 footprints detected in ESC versus MEF represent the genomic loci to be opened during the reprogramming process. The occupancy of Sox2 AV in those key loci was slightly higher compared with Sox2, but Oct4 occupancy increased in OKS AV samples (Figure 4F), suggesting a more robust ability of the Sox2 AV/Oct4 dimer to engage closed chromatin in early reprogramming (Figures 4A and S4E).
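Counting how many super-enhancers are engaged by a factor amounts to an interval-overlap problem. The following sketch assumes that a region counts as engaged when at least one ChIP-seq peak overlaps it; that criterion, the function names, and the toy coordinates are assumptions for illustration, since the exact overlap rule is not stated in this section.

```python
def overlaps(region, peak):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return region[0] == peak[0] and region[1] < peak[2] and peak[1] < region[2]

def count_engaged(regions, peaks):
    """Number of regions overlapped by at least one peak."""
    return sum(1 for r in regions if any(overlaps(r, p) for p in peaks))

# Toy coordinates invented for illustration only.
super_enhancers = [
    ("chr1", 1000, 5000),
    ("chr1", 9000, 12000),
    ("chr2", 300, 800),
]
peaks = [
    ("chr1", 4900, 5100),   # overlaps the first region
    ("chr1", 5000, 5200),   # touches the first region's end, no overlap (half-open)
    ("chr2", 100, 299),     # upstream of the chr2 region
    ("chr3", 1000, 5000),   # different chromosome
]

engaged = count_engaged(super_enhancers, peaks)
print(engaged)

# Ratio of the counts reported in the text (511 versus 378 super-enhancers).
print(round(511 / 378, 2))
```

For genome-scale peak sets, a sorted sweep or an interval tree would replace the quadratic scan used here.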

Stabilizing Sox2/Oct4 dimerization enhances the developmental potential of iPSCs
7][68] Mouse iPSC (miPSC) lines were generated using either lentiviral tet-inducible (pHAGE2-tetO) or episomal (pCXLE) vectors, both carrying polycistronic cassettes containing either Sox2 (OSKM) or Sox2 AV (OS AV KM). Remarkably, all 10 tested OS AV KM iPSC lines supported full-term development of the aggregated embryos, whereas 3 out of 8 tested OSKM lines were incapable of supporting full-term development, echoing previous studies (Figure 4G; Table S1). 13,69,70 On average, OS AV KM iPSCs gave rise to more than twice as many all-iPSC full-term pups as OSKM (Figure 4G; Table S1). OSKM all-iPSC mice rarely survive to adulthood 13,70,71: of 25 pups born from 9 tetO-OSKM iPSC lines, none gave rise to adult all-iPSC mice (Table S1; Velychko et al. 13). On the other hand, 4 out of 6 tetO-OS AV KM lines gave rise to adult all-iPSC mice: of 68 tetO-OS AV KM pups, 16 became healthy adults with 50% survival for the best-performing iPSC line (Figures 4G and 4H; Table S1). The tetO-OS AV KM all-iPSC mice were fertile; the transgene inheritance was confirmed by PCR genotyping (Figure 4I). Episomal vectors deliver milder overexpression and give rise to overall better quality iPSCs, even in the presence of exogenous Oct4. 13 However, only 4.2% of transferred episomal OSKM all-iPSC embryos gave rise to adult mice compared with 22.2% for OS AV KM iPSCs. The highest-quality episomal OS AV KM iPSC line outperformed the highest-quality OSKM line: 43.3% versus 15.2% of transferred embryos gave rise to adult all-iPSC mice (Table S1). Therefore, substituting a single residue of Sox2 enhances both Sox2/Oct4 dimerization capacity and the developmental potential of OSKM miPSCs.
Chimeric super-SOX enhances iPSC generation in five species
Sox2-17 (S* or super-Sox), which features A61V among other Sox17 elements, emerged as our most efficient chimeric reprogramming factor (Figures 1B and 1C), drawing interest for its practical applications. We cloned Sox2-17 into tet-inducible OSKM or SKM reprogramming cassettes and confirmed comparable levels of expression using RT-qPCR (Figure S5A). Time-course experiments with restricted Dox-induction (Figure 5A) showed that Sox2-17 enhanced the kinetics and efficiency of miPSC generation, shortening the minimal induction time from 3 days to just 24 h (Figure 5B). Clonally expanded 24 h-iPSCs lost methylation of Nanog and Pou5f1 promoters and acquired methylation of the fibroblast-specific Col1a1 promoter (Figure S5B), differentiated into all three germ layers in teratoma assays, contributed to chimeric mice, including the germ line, and successfully generated live-born all-iPSC pups in 4N complementation assays (Figures S5C-S5E; Table S1). When induced for just 3-4 days, OS*KM gave rise to 10-200 times more colonies than OSKM, depending on the quality of starting fibroblasts (Figures 5B and 5C). Sox2-17 could even generate two-factor miPSCs with Klf4, albeit with low efficiency (Figure S5F). S*K miPSC lines displayed mouse ESC-like (mESC) morphology, were verified by PCR genotyping, stained positive for Nanog and SSEA-1, and gave rise to three germ layers in a teratoma assay (Figures S5G-S5J). These data suggested that Sox2-17 requires shorter time, lower levels of expression, and a reduced number of additional factors to successfully induce pluripotency, which could be beneficial for the less efficient integration-free reprogramming methods. We generated episomal polycistronic OKS and OKS* vectors, carrying either Sox2 or Sox2-17, respectively, and confirmed the expression by western blot (Figure S5K). Sox2-17 enhanced episomal OKS MEF reprogramming by a striking 150 times, giving rise to high-quality miPSCs that could generate all-iPSC mice with up to 77% efficiency (Figures 5D and 5E; Table S1). Remarkably, all 10 tested OKS and OKS* iPSC lines gave rise to healthy adult mice with a survival rate similarly high for both Sox2 and Sox2-17 (Figure 5F; Table S1). This highlights that omitting Myc benefits the developmental potential of miPSCs 70; super-Sox offers a practical advantage by enhancing the OKS cocktail's efficiency.
Although the generation of integration-free bona fide iPSCs is well established in mice and humans, the same cannot be said for many other species, including non-human primates (NHPs) and livestock.6][77] OSKML failed to yield iPSCs despite multiple attempts, whereas OS*KML gave rise to alkaline phosphatase-positive (AP+) iPSC-like colonies that could be clonally expanded (Figure 5O). Although most hiPSC lines lose the episomes before passage 3, only 3 of 11 tested cynomolgus iPSC (ciPSC) lines lost the episomes; 2 of 3 integration-free lines had the correct chromosomal number, both displayed hiPSC-like morphology, expressed NANOG and OCT4, and differentiated into three germ layers in teratoma assays (Figures 5N, 5O, S5P, and S5Q).
We attempted to reprogram porcine and bovine fibroblasts using bFGF-based (StemFlex) media supplemented with XAV939, a Wnt inhibitor shown to support livestock ESC culture. 78 Episomal reprogramming using WT-SOX2 failed, whereas SOX2-17 efficiently generated AP+ colonies for both the pig and the cow that could give rise to clonal iPSC lines, which could be expanded beyond 12 passages (Figures 5P and 5Q). We established 12 bovine iPSC (biPSC) lines generated by OS*KML without p53 inhibition, which all lost the episomes by passage [...] (Figures S5R and S5S). biPSCs maintained ESC-like morphology, the correct number of chromosomes, and stained positive for SOX2 and OCT4 (Figures 5Q, S5T, and S5U). Thus, SOX2-17 allowed the generation of integration-free virus-free biPSCs with potential applications for cultivated beef and livestock gene editing.
We generated and characterized 30 clonal hiPSC lines derived from newborn foreskin (young, Y-iPSCs) and 56-year-old dermal (old, O-iPSCs) fibroblasts using episomal OSKML carrying WT-SOX2, SOX2 AV, or SOX2-17. All hiPSCs were integration-free with normal karyotypes (Figures S5V-S5X). Hierarchical clustering of both RNA-seq and reduced representation bisulfite sequencing (RRBS) 79 data showed that all hiPSC lines clustered far from fibroblasts and close to hESC lines (Figures 5R and 5S). The gene expression differences correlated more with the cell source than with the SOX factors used. 80,85 We analyzed 23 differentially methylated regions (DMRs) represented in all samples and found that all lines, including the original fibroblasts, had different levels of loss of imprinting (LOI) (Figure 5T). OS AV KML-hiPSCs derived from young fibroblasts showed significantly lower levels of LOI compared with respective OSKML-iPSCs, whereas the differences between other hiPSCs were not significant. We conclude that highly cooperative Sox factors facilitate or enable iPSC generation in mammalian species (Figure 5U).

Sox2/Oct4 dimerization is at the core of naive pluripotency
ESCs derived from the mouse pre-implantation ICM (mESCs) and miPSCs represent the "naive" state; their proliferation in culture is dependent on LIF. 86 Naive mPSCs readily contribute to chimeric animals, and some lines are even capable of generating all-PSC mice. Conversely, PSCs from most other species, including humans, do not readily maintain the naive state and are typically stabilized in the "primed" state, which depends on FGF for proliferation. Mouse epiblast stem cells (mEpiSCs) derived from the post-implantation blastocyst are also primed; they share many characteristics with hPSCs, most importantly the low developmental potential. 87,88 Oct4DE is active in naive but not primed PSCs in different species, [89][90][91][92] and both Sox2 AV and Sox2-17 increase the stability of the Sox2/Oct4 dimer on Oct4DE DNA (Figures 3F and 4E). We hypothesized that Sox2/Oct4 dimerization could be at the core of naive pluripotency.
We analyzed a published ATAC-seq dataset of time-course naive-to-primed transition samples generated by exposing mESCs to FGF. 93 The most significant changes occur between day 1 and day 2 of priming (Figure 6A). TOBIAS 65 footprinting analysis showed that the most depleted footprints between day 0 and day 1 were of Esrr and Klf factors (Figure 6B), consistent with previous studies showing that Klf4 or Esrrb can reset mEpiSCs to the naive state. 94,95 More importantly, the changes in chromatin landscape from day 1 to day 2 were dominated by the reduction of Sox/Oct and Sox footprints (Figure 6C).
We performed whole-cell lysate EMSAs using the Nanog element to measure the dimerization levels between Sox2 and Oct4 proteins endogenously expressed in different PSC lines: naive mESCs grown in KSR-LIF media, primed mEpiSCs carrying Oct4DE-GFP reporter (Gof18) [96][97][98][99] grown in FGF-based hESC media (StemFlex), mEpiSCs after naive reset grown in LIF or 2iLIF media, 100 and hiPSCs grown in hESC media (Figure 6D). The primed mEpiSCs and hiPSCs had significantly lower levels of Sox2/Oct4 dimer compared with mESCs, but the heterodimerization was restored in mEpiSCs after the naive reset by a brief exposure to the MEK inhibitor PD0325901 and sorting for Gof18+. The heterodimerization was enhanced even further if the same cells were cultured in the presence of 2i (Figure 6D), which potentially points to the mechanism of the mouse naive media. 100 In primed cells of both species, the ratio of Sox2/Oct4 dimer to Oct4 monomer binding was less than half of that in naive samples (Figure 6D). Antibody supershift confirmed the composition of EMSA bands (Figure 6E). The limited Sox2/Oct4 dimerization was due to lower Sox2 protein levels in primed cells, whereas there was no significant difference in Oct4 levels (Figures 6F and 6G). These data corroborate previous reports showing that mouse primed cells have lower Sox2 expression compared with naive cells 99,101; primed but not naive cells could even tolerate Sox2 knockout. 102
mEpiSCs could be converted to the naive state by overexpression of Klf4 95; however, lentiviral Klf4 alone could not reset human primed iPSCs in KSR-LIF media (Figure 6H). Screening of different subsets of OSKM showed that SK (Sox2+Klf4) is the minimal cocktail that enables the generation of KLF17+ 103-106 hiPSCs. SK reset worked even in the absence of small molecule inhibitors (Figure 6H), but supplementing media with PD0325901 enhanced the efficiency of the reset (Figure S6A). Analogous to SKM miPSC generation, 13,18 combining Sox2 and Klf4 in a bicistronic vector proved crucial for the efficient naive reset of hiPSCs (Figure S6A).
We generated human episomal reprogramming plasmids mCherry-SK (pCXLE-mCherry-T2A-SOX2-P2A-KLF4) and mCherry-S*K (pCXLE-mCherry-T2A-SOX2-17-P2A-KLF4) to achieve a traceable integration-free naive reset. The episomal vectors were lipofected into mEpiSCs, and the mCherry+/Gof18− cells were sorted on day 2 and plated on feeders in KSR-LIF media (Figures 6I and S6B). The majority of surviving cells formed dome-shaped colonies that were Gof18+/mCherry− as early as day 4 after plating. We picked and clonally expanded 6 colonies for each cocktail. Both SK- and S*K-converted lines exhibited significantly higher Sox2/Oct4 dimerization than untransfected mEpiSCs, which correlated with increased Sox2 protein levels (Figures 6J, S6C, and S6D). SK-reset mEpiSC lines exhibited on average a 4-fold increase in heterodimer band intensity, compared with a 6-fold increase in S*K-reset mEpiSCs. The Sox2-17/Oct4 dimer band was not present in any of the S*K naive lines, confirming that the episomal vectors were no longer expressed (Figure S6C). Compared with S*K-reset lines, the SK-reset naive lines had a significantly higher propensity to spontaneously lose Gof18+ status after passaging (Figures 6K and S6E), suggesting that S*K delivered a more stable naive reset.
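The EMSA comparisons in this section reduce to two simple quantities: the ratio of heterodimer to monomer band intensity within a lane, and the average fold change of the heterodimer band across clonal lines relative to a reference lane. A minimal sketch follows; the function names and all band intensities are hypothetical values chosen for illustration, not densitometry from Figure 6.

```python
from statistics import mean

def dimer_to_monomer(dimer_intensity, monomer_intensity):
    """Ratio of heterodimer band to monomer band within one EMSA lane."""
    return dimer_intensity / monomer_intensity

def mean_fold_change(clone_intensities, reference_intensity):
    """Average fold change of a band across clones, relative to a reference lane."""
    return mean(i / reference_intensity for i in clone_intensities)

# Hypothetical densitometry values (arbitrary units), invented for illustration.
untransfected_dimer = 100.0
sk_clone_dimers = [350.0, 400.0, 450.0, 380.0, 420.0, 400.0]  # six clonal lines

sk_fold = mean_fold_change(sk_clone_dimers, untransfected_dimer)
print(sk_fold)

# Example of a within-lane ratio for one hypothetical primed and one naive sample.
primed_ratio = dimer_to_monomer(40.0, 100.0)
naive_ratio = dimer_to_monomer(120.0, 100.0)
print(primed_ratio, naive_ratio)
```

Normalizing each clone to the same untransfected lane before averaging keeps the fold change independent of exposure differences between gels, provided the reference is loaded on every gel.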
Our data suggest that a decrease in Sox2/Oct4 dimerization is likely responsible for the downstream epigenetic changes that lead to diminished developmental potential upon priming of pluripotent cells in development and culture. Forced expression of Sox2 and Klf4 can efficiently reverse priming and convert mouse and human PSCs into the naive state (Figure 6L). Super-SOX, which exhibits enhanced cooperativity with POU factors, promotes both iPSC generation and naive reset, underscoring the key role of Sox2/Oct4 dimerization in naive pluripotency (Figures 6L and 6M). It would be interesting to investigate whether the WT-Sox17+Klf4 cocktail can redistribute the Oct4 binding sites to compressed SoxOct motifs, inducing primitive endoderm, 27,45 similarly to how Sox2+Klf4 induces the pre-implantation epiblast fate.

Episomal SK reset enhances the developmental potential of PSCs in three species
We co-nucleofected hiPSCs grown in primed media (StemFlex) with episomal mCherry-S*K and pCXWB-EBNA1 vectors and plated on feeders. After 48 h, the media was changed to human naive media (RSeT). By day 7, S*K-treated hiPSCs, but not control-nucleofected cells, generated dome-shaped colonies positive for human naive pluripotency markers SUSD2 [107][108][109] and KLF17 [103][104][105][106] (Figures 7A-7C). Day 7 SUSD2+ hiPSCs were mCherry−, confirming the transgene-independent status of the generated naive cells. [112][113][114] We performed RT-qPCR to assess the expression of key naive pluripotency genes (Figure 7E). S*K reset led to a significant upregulation of DNMT3L, KLF17, and ARGFX in both primed and naive media. The naive media alone did not increase naive gene expression, except for a 6-fold upregulation of KLF4. Both fluorescence-activated cell sorting (FACS) and RT-qPCR data confirmed that the mCherry-S*K episome was eliminated from the cells by day 7, whereas the mCherry control plasmid persisted (Figures 7B and 7E), suggesting that S*K reset might trigger transgene silencing mechanisms as previously shown for mESCs 116 and the SKM cocktail. 13 To test the developmental potential of our putative naive hiPSCs, we aggregated S*K-reset cells sorted for SUSD2+ at day 7 with mouse embryos at morula stage E2.5. 117 Astonishingly, S*K-reset hiPSCs marked with constitutive RFP expression were detected in the ICMs of the majority of cross-species aggregated embryos (Figure 7F). The cross-species chimerism was confirmed with co-staining of the chimeric embryos at E4.5 with human-specific SUSD2 and mouse-specific Oct4 antibodies. Human SUSD2+ cells were integrated into ICMs of 6 out of 11 embryos. In one case, the immunostaining indicated that S*K-reset hiPSCs took over the whole epiblast region (Figure 7G), which suggests that high levels of Sox2/Oct4 dimer might grant pluripotent cells an advantage in embryonic cell competition (Figure 7H). 115
Initially, our OS*KML biPSCs failed to generate teratomas in severe combined immunodeficient (SCID) mice. Similarly, cultured bovine ESCs do not readily give rise to teratomas and lose to humans in cross-species cell competition (Figure 7H). 115 We injected control or S*K-reset biPSCs (Figure 7I) into opposite sides of the same mouse. A teratoma arose only from the S*K-reset sample, containing tissues representing all three embryonic germ layers (Figure 7J).
Finally, we nucleofected episomal mCherry or mCherry-S*K into a poor-quality naive female mESC line (C57BL/6J background) cultured in 2iLIF media.Emerging colonies were picked at day 5 and used for 4N complementation (Figures 7K and 7L).
Remarkably, the S*K-reset cells generated 8 times more full-term all-ESC pups compared with the control (Figures 7M-7O; Table S1). Three S*K-reset all-ESC pups survived foster nursing, whereas the only control all-ESC pup died shortly after birth. The SKM cassette, 13 particularly when containing Sox2 AV, might further improve the naive reset, given the outstanding developmental potential of OS AV KM iPSCs (Figure 4 and Table S1).
The in vivo evidence for enhanced developmental potential in three species presented in this section, most importantly the birth of S*K-reset all-ESC animals, argues in favor of our proposed "heterodimer model" of the pluripotency continuum (Figure 6L).
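Fold changes in RT-qPCR data, such as the 6-fold KLF4 upregulation mentioned in this section, are commonly derived with the comparative Ct (2^−ΔΔCt) calculation. The sketch below shows that calculation under the assumption of roughly 100% amplification efficiency; the quantification method is not specified in this section, so the approach, the function name, and the Ct values are assumptions for illustration.

```python
def ddct_fold_change(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct method.

    Each target Ct is first normalized to a reference (housekeeping) gene,
    then the treated sample is compared with the control sample.
    Assumes amplification efficiency close to 100% (doubling per cycle).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)

# Hypothetical Ct values, invented for illustration only.
# The target crosses threshold 3 cycles earlier in the treated sample,
# while the reference gene is unchanged.
fold = ddct_fold_change(
    ct_target_sample=24.0, ct_ref_sample=18.0,
    ct_target_control=27.0, ct_ref_control=18.0,
)
print(fold)
```

A 3-cycle shift corresponds to an 8-fold change; a 6-fold change corresponds to a ΔΔCt of about −2.58 cycles.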

DISCUSSION
iPSC technology struggles with inefficiency and widely variable quality of the produced cell lines. 69,80,81,118 Some alternative reprogramming cocktails could improve the developmental potential of miPSCs, but they also decreased the reprogramming efficiency 13,70,71 and, consequently, failed to reprogram human cells that possess stronger epigenetic barriers. 19 To date, all-iPSC animals have been reported only for mice, 66,67,119 and germline competence has only been demonstrated for mouse (both sexes) and male rat iPSCs, 120 highlighting the limitations of current technology. Here, we combined structural elements of Sox2 and Sox17 to build a chimeric super-Sox that enhanced reprogramming in five species: mouse, human, cynomolgus macaque, cow, and pig. The key point mutation, A61V, which stabilized the Sox/Oct dimer on DNA, increased the developmental potential of OSKM miPSCs, as evidenced by higher rates of full-term development and survival of all-iPSC mice.
Oct4 functions independently of Sox2 to drive proliferation. 121,122Notably, the cocktails 13,70 and culture interventions 123,124 that yield higher-quality iPSCs also reduce cell proliferation during reprogramming, suggesting that limiting proliferation is beneficial.This can be achieved by enhancing Sox/Oct dimerization, as in OS AV KM reprogramming; increasing Sox2:Oct4 ratio, as in SKM reprogramming 13 and Oct4 heterozygous knockout ESCs 10 ; and omitting or reducing Myc, as in OKS 70 or OSKM compared with OKSM cassette, 69 and Mycdepleted ESCs. 125ce are the only species, for which naive PSCs have been stabilized in culture without the use of small molecule inhibitors. 126Mice likely evolved (or preserved) the unusual stability of their naive pluripotency fate to enable a blastocyst-stage embryonic arrest, known as diapause. 127][133][134][135] Long-term culture in naive media leads to epigenetic abnormalities and loss of germline competency for both humans and mice, 81,136,137 whereas a short exposure during reprogramming could be beneficial for hiPSC quality. 124Contrary to mESCs, which exclusively contribute to the epiblast, chemically reset naive hESCs can also contribute to the trophectoderm, 138,139 which could be attributed to the low levels of Sox2/Oct4 dimerization. 106,140,141The OSKM cocktail can induce naive pluripotency from somatic cells in mice 66,67 and humans. 124,142In particular, the role of Klf4 in naive pluripotency has been described for both species. 111Here, we showed that a subset of Yamanaka's cocktail containing Sox2 and Klf4 could induce naive reset in both mouse and human PSCs even in the absence of small molecule inhibitors.SK-reset links iPSC quality to the naive-primed continuum and explains the enhanced developmental potential of SKM miPSCs. 
13 Episomal S*K reset improved the developmental potential in humans (evidenced by cross-species embryo aggregations), cows (generating teratoma-capable biPSCs), and mice (boosting all-ESC animal production). The in vivo evidence for naive reset in three species supports our proposed heterodimer model of a naive-to-primed pluripotency continuum, which elucidates the roles of Yamanaka factors: high levels of Sox2 and Klf4 expression and Sox2/Oct4 dimerization promote the naive state, whereas decreased Sox2 reduces the heterodimerization and, when coupled with excess Oct4 and Myc, promotes cell proliferation and priming. The evolutionary tree of animals suggests that of the Sox2/Oct4 couple, in the beginning, there was Sox2. Key Sox2 residues, such as R50 and K57, are already present in sponges, where SoxB TFs control early embryogenesis. 1435][146] POU5 factors emerged much later in the evolutionary tree; they are an innovation of vertebrates, 147,148 where POU5 TFs cooperate with Sox2 to control early development. 14,149 The Oct4-linker is not directly involved in DNA binding; however, it is important for reprogramming to pluripotency and for normal development. 38,40,41 Here, we found that Oct4-linker mutations reduce the stability of the Sox2/Oct4 dimer on DNA, but Sox2 AV could rescue the linker mutants' ability to heterodimerize and to induce pluripotency. Likewise, Sox2 AV enabled heterodimerization and reprogramming with tissue-specific POU factors. Our MDS revealed that the negatively charged Oct4-linker residues form salt bridges with positively charged R50 and K57 of Sox2. Although the linker is the least conserved POU subdomain, its negative charges are already present in the POU5 factor of jawless hagfish. 
147 It has been suggested that two distinct POU5 factors that still exist in many vertebrates could support either naive or primed pluripotency, 147 possibly through their differential ability to cooperate with Sox2. Our work demonstrates that the most significant feature that distinguishes Oct4 from other POU factors is its ability to form a stable heterodimer with Sox2, which had already been in control of early embryogenesis in lower animals.
1][152] In early animal development, unidirectionality is likely achieved by a negative feedback loop limiting the return to a high-Sox state. 153 Interestingly, female mPSCs have lower developmental potential compared with male mPSCs. 154,155 Our model suggests that the reason for the higher developmental potential of male lines could be the expression of sex-determining region of Y (Sry), 156 which has a Sox2-like DNA-binding motif. The "high-Sox" hypothesis could also explain the increased survival of male versus female embryos in humans and other mammals 157,158 and the higher occurrence of Sox-driven cancers in men versus women. 159 The abundance of Sox footprints in the open chromatin of naive versus primed cells suggests a more general developmental trend, where various ratios of Sox factors and their partners predispose stem cells toward specific lineages.
Our engineered super-SOX factor harnesses the reprogramming powers of naturally evolved structural elements of two major developmental regulators, Sox2 and Sox17. Even more efficient reprogramming factors could potentially be built by means of rational engineering and directed evolution. 160,161 Our data suggest that enhancing cooperativity between key co-factors should be one of the goals of future designers.

Limitations of the study
A small number of cells from the ICM can contribute disproportionately to animal development. 162 Thus, the developmental potential of a given PSC line may be determined by a Sox2/Oct4-high subpopulation rather than the average measured by ATAC-seq, EMSA, and western blot experiments. Further studies are needed to characterize the SK-reset naive PSCs and address the posttranslational modifications and other mechanisms regulating Sox2/Oct4 dimerization. Our current episomal reset protocol produces naive PSCs only transiently, between days 4 and 7, so they must be used in downstream applications before re-priming occurs. A culture medium supporting long-term maintenance of transgene-independent non-murine naive PSCs with high heterodimer levels remains to be formulated.
For this study, we generated OSKML hiPSC lines using a construct carrying shRNA against TP53, 74 which knocks down the main tumor suppressor and thereby boosts cell proliferation. TP53 knockdown is likely detrimental to iPSC quality and could have caused LOI in our hiPSCs.
We cannot exclude that a highly cooperative Sox or excess of Sox2 may participate in the developmental reset in ways beyond enhancing Sox/Oct dimerization, e.g., by remodeling the epigenome through recruiting the aging antagonist Parp1 163 or silencing retroviral elements. 13

Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Sergiy Velychko (Sergiy_Velychko@hms.harvard.edu).

Materials availability
Plasmids generated in this study have been deposited to Addgene (#193290-210020).

Data and code availability
• ChIP-seq, RNA-seq, and RRBS data have been deposited at GEO (GSE247051) and are publicly available as of the date of publication. Accession numbers are listed in the key resources table. The DOI is listed in the key resources table.
• This paper analyzes existing, publicly available data. The accession numbers for these datasets are listed in the key resources table.
• This paper does not report original code.
• Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Mice
All mice used were bred and housed at the mouse facility of the Max Planck Institute in Münster. Animal handling was in accordance with MPI animal protection guidelines.
The surrogate mouse embryos for tetraploid complementations were obtained by breeding super-ovulated B6C3 F1 females with CD1 males, a pairing that results in yellow coat color, and the surrogate mothers were pseudopregnant CD1 females (white). Mice derived from Rosa26-rtTA/Gof18 miPSCs have dark brown coats. While we present tetraploid complementation data for both sexes, for direct comparison between OSKM versus OS AV KM cocktails we focused on male iPSCs, as male PSC lines have higher developmental potential. The experiment on the female mESC line (C57BL/6J background) illustrates that our findings apply to both sexes.
For naïve reset hiPSCs, the media was changed to RSeT™ (STEMCELL Technologies) at day 2, or StemFlex was kept (as indicated). For naïve reset biPSCs, the media was changed to mESC media supplemented with 1 µM PD0325901 (Cayman Chemical) and 2 µM XAV939 (Sigma).
Pluripotent stem cells of all five species were passaged using Accutase (Sigma). Rho-associated kinase inhibitor (ROCKi, Y-27632, Abcam) was added at 10 µM for the first 24h after passaging of primed PSCs of all five species (extended to 48h for mouse EpiSCs). The cells were routinely tested for Mycoplasma contamination and tested negative.
High Five™ and Sf9 insect cells were grown in serum-free EX-CELL® 420 medium containing L-glutamine (Sigma) and maintained in suspension culture at 0.5-1 × 10^6 cells/mL. Cultures were incubated at 26 °C, shaking at 100-120 rpm depending on flask size, in a refrigerated shaking incubator (AutoQ Biosciences, AQ-2402D).
Microbe strains
TOP10 chemically competent E. coli grown in Luria broth (LB) were used for plasmid amplification. For baculovirus plasmid DNA amplification, DH10EMBacY 171 (a gift from Dr. Imre Berger) were plated on agar plates containing LOC media supplemented with 50 µg/mL kanamycin, 10 µg/mL tetracycline, 7 µg/mL gentamicin, 100 µg/mL Bluo-Gal, and 1 mM IPTG. Selected colonies were grown in LOC media supplemented with 50 µg/mL kanamycin, 10 µg/mL tetracycline, and 7 µg/mL gentamicin (Sigma). Stbl2 (Invitrogen) or NEB Stable competent E. coli grown in LB supplemented with 100 µg/mL ampicillin or carbenicillin were used for preparing episomal plasmids.
The mouse and human protein sequences of Sox2
Mouse reprogramming experiments were done as described before. 13,36 Briefly, for retrovirus production, monocistronic pMX-Oct4, Sox (Addgene #193350-193354), and Klf4 vectors were co-transfected with pCL-Eco (Addgene #12371) 172 in HEK293 cells with FuGENE6 (Promega) using a low volume transfection protocol (Steffen et al., 2017). For lentivirus production, pHAGE2-tetO vectors were co-transfected with PAX2 and VSV. The viral supernatants were harvested after two and three days, filtered (Millex-HV 0.45 µm; Millipore), aliquoted, and stored at -80 °C. For reprogramming, Oct4-GFP MEFs (OG2 or Rosa26rtTA-Gof18) were plated on gelatin-coated 12-well plates at 3 × 10^4 cells per well in fibroblast media. A few hours later, the cells were infected with titer-adjusted volumes of each viral supernatant supplemented with 6 µg/ml (final concentration) of protamine sulfate (Sigma). After two days, the media was replaced with mESC media. For mouse tet-inducible reprogramming, the cells were treated with Dox for 10 days (as in ref. 13), unless otherwise stated. Because all the reprogramming experiments were treated equally, the enhanced kinetics of OS*KM reprogramming resulted in mature tetO-OS*KM iPSCs expressing the Myc-containing transgene for much longer than tetO-OSKM or tetO-OS AV KM iPSCs, which emerged later in the 10-day course. This likely explains the poor quality of tetO-OS*KM versus tetO-OS AV KM iPSCs. The 24h-iPSCs were derived from MEFs by infecting them with lentiviral tetO-OS*KM and exposing them to Dox for 24h. Clonal iPSC colonies were picked after day 10 and propagated in the same manner as other iPSC lines. We do not claim that reprogramming of MEFs to iPSCs was completed in just 24h. Rather, we posit that a 24-hour induction with OS*KM is sufficient to induce complete pluripotency.
For human retroviral reprogramming, 48h after infection, the transduced cells were split on a CF1 feeder layer at 10^4 cells per 6-well plate. After one week, fibroblast media was changed to hESC media.
Human self-replicating RNA-based reprogramming was performed as previously described. 72 Briefly, the T7-VEE constructs (Addgene #58974, 193356) were digested with MluI and then in vitro transcribed using the RiboMAX Large Scale RNA Production System Kit (Promega). The transcripts were 2'-O-methylated, capped, and poly(A)-tailed using the respective CELLSCRIPT kits following the manufacturer's protocol. For reprogramming, 1 µg of RNA replicons was transfected into 10^5 fibroblasts on 6-well plates using RiboJuice (Sigma) in the presence of 100 ng/ml B18R (Promega). The media was supplemented with 0.5 mM VPA and 5 mM EPZ to enhance the very inefficient RNA-based reprogramming. The reprogramming worked more efficiently when no puromycin selection was used. After two weeks, the cells were sorted for TRA-1-60 and plated on a CF1 feeder layer in hESC media without B18R (Figure 5I).
The virus supernatant volumes were adjusted according to RT-qPCR titration using common WPRE or 3'UTR primers normalized to Rpl37a. 13 All the tetO lines were screened for promoter leaking; only those with minimal leaking were selected for characterization. The newly generated iPSC lines (mouse, human, cynomolgus, and cow) were karyotyped using DAPI staining of metaphase spreads; only the lines with correct chromosomal numbers were selected for characterization. As we reported before, 13 no difference in aneuploidy occurrence was observed between different cocktails. Similar to other studies, we only tested the quality of male iPSCs for this work.
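The titer adjustment described above can be illustrated with a minimal sketch. It assumes a simple delta-Ct calculation with 100% PCR efficiency, in which the transgene signal (WPRE or 3'UTR primers) is normalized to Rpl37a and each supernatant volume is scaled inversely to its relative titer. The function names, the example Ct values, and the choice to scale against the weakest supernatant are hypothetical and are not taken from the study.

```python
# Hypothetical sketch of RT-qPCR-based titer adjustment (delta-Ct method).
# Assumption: 100% amplification efficiency, i.e. one Ct unit = 2-fold difference.


def relative_titer(ct_transgene: float, ct_reference: float) -> float:
    """Transgene signal relative to the reference gene (e.g. Rpl37a)."""
    return 2.0 ** -(ct_transgene - ct_reference)


def adjusted_volumes(ct_values: dict, base_volume_ul: float) -> dict:
    """Return per-virus volumes that deliver equal relative titers.

    ct_values maps a virus name to (Ct of WPRE/3'UTR, Ct of reference gene).
    The weakest supernatant receives base_volume_ul; stronger ones receive less.
    """
    titers = {
        name: relative_titer(ct_t, ct_r) for name, (ct_t, ct_r) in ct_values.items()
    }
    weakest = min(titers.values())
    return {name: base_volume_ul * weakest / t for name, t in titers.items()}


if __name__ == "__main__":
    example = {"Oct4": (22.0, 18.0), "Sox2": (21.0, 18.0)}  # made-up Ct values
    print(adjusted_volumes(example, base_volume_ul=100.0))
```

In this made-up example the Sox2 supernatant is one Ct unit stronger than the Oct4 supernatant, so half the volume would be used.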

Tetraploid (4N) complementation assay
Preparation of tetraploid embryos
Superovulated B6C3 F1 females were mated with CD1 males. E1.5 embryos at the two-cell stage are flushed from the oviducts and collected in M2 medium.
After equilibration in fusion solution (0.3 M D-mannitol, 50 mM CaCl2, 0.3% BSA (Sigma)), 50-75 embryos are placed between the electrodes of a 250 µm gap electrode chamber (BLS Ltd.) containing 0.3 M mannitol with 0.3% BSA and fused with a Cellfusion CF-150/B apparatus (BLS Ltd.) with 0.5 mm Microslide (BTX-450). An initial electrical field of 2 V is applied to the embryos, followed by one peak pulse of 60 V for 50 ms. Embryos are transferred back into KSOM-aa medium and immediately into a 37 °C incubator with 5% CO2. Embryos are observed for fusion after 15 to 60 minutes. The fused tetraploid embryos are cultured for 24h to the 4-cell stage under the same conditions.

Aggregation of iPSCs with zona-free embryos
• Preparation of aggregation plates for mouse embryo chimera production 1h before aggregation: Using a 100 µl pipette filled with KSOM medium, make 4 rows of microdrops (roughly 3 mm in diameter) in a 35 mm dish (Falcon, Cat. No. 35-3001), two drops in the first and fourth rows and five drops in the second and third rows.
Cover the whole plate with paraffin oil. Sterilize the aggregation needle (BLS Ltd.) with 70% ethanol. Press the aggregation needle into the plastic through the paraffin oil and culture medium while making a circular movement to create a tiny scoop of about 300 µm in diameter with a clear, smooth wall. Six to ten holes can be made within each droplet.
• iPSCs are aggregated and cultured with denuded 4-cell stage mouse tetraploid embryos as reported, with a slight modification: 174 Clumps of loosely connected iPSCs (15-20 cells in each) from briefly trypsin-treated day-two iPSC cultures were chosen and transferred into microdrops of KSOM medium under mineral oil; each clump is placed in a depression in the microdrop. Meanwhile, batches of 30-50 embryos were briefly incubated in acidified Tyrode's solution 175 until dissolution of their zona pellucida. Two embryos were placed on each iPSC clump. All aggregates are assembled in this manner and cultured overnight at 37 °C, 5% CO2.
After 24h of culture, the majority of aggregates have formed blastocysts. Ten to fourteen embryos were transferred into one uterine horn of each 2.5 days post coitum pseudopregnant CD1 female that had been mated with vasectomized males. For cesarean section, recipient mothers were sacrificed at E19.5 and pups were quickly removed. Newborns that were alive and breathing were cross-fostered to lactating females.

Lentiviral naïve reset of human iPSCs
For primed-to-naïve reset (pluripotency upgrade), human iPSCs were transduced with monocistronic or polycistronic pHAGE2-EF1a lentiviral vectors carrying different subsets of Yamanaka factors reported previously. 13 After two days, the cells were passaged at low density (10^3 cells per 24-well plate) on an inactivated C3H feeder layer in mESC media supplemented with ROCKi with or without small molecules. 24h later, the media was changed to mESC media (KSR-LIF) with ROCKi with or without 2i. Six days after passaging, the cells were fixed and stained for KLF17 (HPA024629, ATLAS, 1:500). SK was the minimal subset that gave rise to KLF17+ colonies, whereas neither Sox2 nor Klf4 alone did.
The nucleofected cells were plated on a dense feeder layer in StemFlex media supplemented with ROCKi. On the second day, the media was changed to StemFlex, and on day 3 to human naïve-like media (RSeT™, STEMCELL Technologies); the cells were fed daily.
Alternatively, the episomal S*K reset could be performed using feeder-free primed human iPSC culture conditions in StemFlex or E8 media.
The episomal vectors could also be delivered using Lipofectamine Stem Transfection Reagent (Invitrogen, STEM00001): hiPSCs were plated in feeder-free conditions on a 6-well plate (10^6 cells per well), and the next day the media was changed to pure Opti-MEM media supplemented with ROCKi, and the cells were transfected with 3 µg of pCXLE-mCherry-T2A-SOX2-17-P2A-KLF4 and 1 µg of pCXWB-EBNA1 mixed with 8 µl of Lipofectamine according to the manufacturer's protocol for 4 hours. Following the 4-hour transfection, the media was changed to StemFlex supplemented with ROCKi overnight for recovery. On day 2, the cells were dissociated using Accutase and split on feeders in StemFlex media supplemented with ROCKi. On day 3, the media was changed to RSeT.
Increasing the ratio of pCXLE to pCXWB-EBNA1 up to 1:1 (3 + 3 µg for a 100 µl nucleofection reaction) could increase the longevity of the episome, improving the efficiency of naïve reset, but could also be toxic for sensitive PSC lines. While SK is the minimal cocktail capable of inducing human naïve pluripotency, polycistronic SKM episomes 13 could further improve the reset efficiency (Addgene #210016-210018).
Gene expression analysis was performed using SYBR Green qPCR as previously described. 13 The primers for human naive reset can be found in Table S2.
Mammalian cell overexpression and whole-cell lysate (WCL) generation
HEK293 cells cultured on 10 cm dishes were transfected with 10 µg of pLVTHM or pHAGE2 vectors under the control of an EF1a promoter and containing the WT or mutant versions of Oct4 or Sox2 with FuGENE6 (Promega) using a low volume protocol (Steffen et al., 2017). Three days after transfection, the cells were dissociated from the plate using Accutase (Sigma), collected, counted, and washed with PBS. WCLs were generated by five cycles of freeze-thawing pellets resuspended at 12.5 µL per million cells in lysis buffer (20 mM HEPES-KOH pH 7.8, 150 mM NaCl, 0.2 mM EDTA pH 8, 25% glycerol, 1 mM DTT, and cOmplete™ protease inhibitor cocktail (Merck)). After disruption, lysates were spun at 14k RCF at 4 °C for 10 min. After centrifugation, pellets were discarded and the supernatants transferred to a new tube for further analysis. Protein concentrations were estimated by diluting samples in 0.1% SDS solution, measuring A230 and A260, and applying the equation:

\[ \text{Conc. (mg/mL)} = (0.183 \times A_{230} - 0.075 \times A_{260}) \times \text{dilution factor} \]

All samples were diluted to 1 mg/mL, aliquoted, snap frozen, and stored at -80 °C. Western blots were run to compare expression levels between mutants. Expression was evaluated by Quantity One® (v4.6.7, Bio-Rad) densitometry to adjust for equal amounts of expression, using WCL of untransfected cells to maintain total protein content when necessary.
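As a worked illustration of the concentration estimate above, the following minimal sketch applies the stated equation and computes how much buffer would bring a lysate to the 1 mg/mL working concentration. The function names and the example absorbance readings are hypothetical; only the coefficients (0.183 and 0.075) and the 1 mg/mL target come from the text.

```python
# Hypothetical helper functions; coefficients are those given in the methods text.


def protein_conc_mg_per_ml(a230: float, a260: float, dilution_factor: float) -> float:
    """Conc. (mg/mL) = (0.183 * A230 - 0.075 * A260) * dilution factor."""
    return (0.183 * a230 - 0.075 * a260) * dilution_factor


def buffer_to_add_ul(conc_mg_per_ml: float, sample_volume_ul: float,
                     target_mg_per_ml: float = 1.0) -> float:
    """Volume of buffer needed to dilute a sample down to the target concentration."""
    if conc_mg_per_ml < target_mg_per_ml:
        raise ValueError("sample is already below the target concentration")
    return sample_volume_ul * (conc_mg_per_ml / target_mg_per_ml - 1.0)


if __name__ == "__main__":
    # Made-up readings for a lysate measured at a 1:20 dilution in 0.1% SDS.
    conc = protein_conc_mg_per_ml(a230=1.0, a260=0.5, dilution_factor=20)
    print(f"{conc:.2f} mg/mL; add {buffer_to_add_ul(conc, 100.0):.0f} uL per 100 uL")
```

With these made-up readings, the lysate would be estimated at about 2.9 mg/mL.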
Western blot analysis
5-10 µg of total protein was combined with Laemmli sample buffer, heated, and loaded onto a 12% mini SDS-polyacrylamide gel (SDS-PAG) using the Towbin buffer system. 176 Gels were run initially at 15 V for 15 minutes to load samples into the stacking gel and then at 50 V for 30-60 minutes to resolve the proteins of interest. Samples were transferred to Immobilon®-FL PVDF membranes (Merck Millipore Ltd.) at 4 °C under 300 V for 2h. Membranes were blocked for one hour at room temperature in 5% skim milk (Sigma) dissolved in PBS with 0.1% Tween-20 (PBS-T) and incubated overnight at 4 °C with rotation in the primary antibody diluted in blocking solution. The following day, the membrane was washed three times in PBS-T and then incubated in secondary antibody diluted in blocking solution for one hour at 25 °C. The following antibodies were used: polyclonal goat anti-Oct4 N-19 (sc-8628, Santa Cruz Biotechnology) or monoclonal mouse anti-Oct4 (611203, BD Biosciences), polyclonal goat anti-Sox2 (sc-17320 from Santa Cruz Biotechnology or GT51098 from Neuromics), monoclonal mouse anti-alpha tubulin (T6199, Sigma), 647-conjugated anti-goat (Alexa Fluor), and 647-conjugated anti-mouse (Alexa Fluor). Western blot signal was detected using a Fujifilm FLA-9000 fluorescence scanner (Fujifilm).

Insect cell expression and protein purification
The coding sequence of full-length Mus musculus Sox2 or Sox2 AV was cloned into the pCoofy27 plasmid with an N-terminal 6xHis-tag using SLIC as previously described: forward primer 3C, reverse primer ccdB. 177 Plasmids were then transformed into DH10EMBacY (a gift from Dr. Imre Berger) for baculovirus plasmid DNA amplification. 171 Bacmids were purified using Macherey-Nagel Xtra BAC100 (Düren) and then transfected into a suspension of Sf9 cells at 0.8 × 10^6 cells/mL grown in serum-free EX-CELL® 420 medium containing L-glutamine (Sigma) and incubated at 26 °C with shaking for virus production. Cells were monitored daily for increased cell size and GFP fluorescence. Once ≥90% of cells were GFP+, viral suspensions were spun down and then filtered through a 0.22 µm filter. Viral supernatants were expanded once before being used for infection; filtered aliquots were stored at -80 °C.
Optimal protein expression conditions were determined empirically. Mid-log phase High Five™ insect cells were split to 10^6 cells/mL in 2 L and then infected with 10-12 mL of P1 baculovirus from previous steps per liter of cells. Following incubation at 28 °C for 96h with shaking, cell pellets were collected by centrifugation. Pellets were resuspended in lysis buffer (20 mM HEPES pH 7.5, 300 mM NaCl, 30 mM imidazole, 5% glycerol, 0.1% Triton X-100, cOmplete™ protease inhibitor cocktail (Merck), and 1 mM DTT), frozen and thawed once, then sonicated at 4 °C using a probe sonicator (Bandelin Sonopuls, Bandelin Electronics). Pellets were resuspended in inclusion body wash buffer (20 mM HEPES pH 7.5, 200 mM NaCl, 1 mM EDTA, 1% Triton X-100, cOmplete™ protease inhibitor cocktail (Merck), and 1 mM DTT) and subjected to four cycles of Dounce homogenization followed by centrifugation for 20 min at 18k RCF and 4 °C, twice with inclusion body wash buffer and twice in buffer without Triton X-100. The final pellet was cut twice in DMSO and then incubated for 30 min at 25 °C. Unfolding buffer (7 M guanidine hydrochloride, 20 mM Tris-HCl pH 7.5, 5 mM DTT) was added to the pellet and incubated while rotating for 1h at 25 °C. 
Nickel Sepharose slurry (GE Healthcare) was washed and equilibrated in binding buffer, then supernatant was added and incubated at 4 °C overnight with rotation. Proteins were eluted using the unfolding buffer with additional 500 mM imidazole. Eluate fractions were checked with SDS-PAGE and relevant fractions were pooled. Using 7 kDa molecular weight cut off (MWCO) dialysis tubing, pooled fractions were dialyzed for three buffer changes of at least 6 h for each volume of refolding buffer at 4 °C (7 M urea, 20 mM Na acetate pH 5.2, 200 mM NaCl, 1 mM EDTA, and 5 mM DTT). Following centrifugation to remove any insoluble material, the supernatant was dialyzed (7 kDa MWCO) in refolding buffer with decreasing amounts of urea: 1 h 6 M urea, 2 h 4 M, 2 h 2 M, and 1 h in size exclusion chromatography (SEC) buffer (50 mM Tris-HCl pH 7.4, 150 mM NaCl, 1 mM EDTA, 5% glycerol). Eluate was centrifuged to remove any precipitate before loading onto a HiLoad 16/60 Superdex 200 SEC column (GE Healthcare).
The coding sequence for full-length Oct4 from M. musculus was cloned into the pOPIN expression vector using the SLIC method and Phusion Flash High-Fidelity PCR Master Mix (Finnzymes/New England Biolabs). SLIC reactions were then transformed into One Shot™ OmniMAX™ 2 T1® Chemically Competent E. coli (ThermoFisher Scientific). After sequencing, the pOPIN-cHis-Oct4 construct was co-transfected with flashBACULTRA™ bacmid DNA (Oxford Expression Technologies) into Sf9 cells (ThermoFisher Scientific) using Cellfectin II® (ThermoFisher Scientific) to generate recombinant baculovirus. Mid-log phase Sf9 cells were used to amplify the virus. Suspension High Five™ cells were infected with P3 virus for two days at 27 °C and 120 rpm shaking. After expression, crude lysates were purified on a HiTrap TALON column (GE Healthcare) and cleaved on the column with 3C protease, followed by size exclusion chromatography (HiLoad Superdex 200, GE Healthcare). The final product was collected in 25 mM HEPES pH 7.8, 150 mM NaCl, 1 mM TCEP, and 5% glycerol with ≥95% purity confirmed by SDS-PAGE. Fractions were checked with SDS-PAGE, pooled, and finally quantified using the NanoDrop spectrophotometer (ND-1000, ThermoFisher Scientific) and the Protein A280 program using specific molecular weight and extinction coefficients for either Sox2 or Oct4. Unless otherwise indicated, all chemicals were from Sigma-Aldrich.
Electrophoretic mobility shift assays (EMSAs)
DNA probes were generated by annealing complementary 5'-Cy5-labeled oligos (Metabion International AG) followed by purification from 10% polyacrylamide gels. EMSA DNA sequences can be found in Table S3. For binding reactions, WCL (2-4 µg of total protein) or purified proteins were incubated in binding buffer (25 mM HEPES-KOH pH 8, 50 mM NaCl, 0.5 mM EDTA, 0.07% Triton X-100, 4 mg/mL BSA, 7 mM DTT, and 10% glycerol) and 70 nM Cy5-dsDNA at 37 °C for 1h. Samples were then loaded onto 6% native polyacrylamide gels (37.5/1 acrylamide/bis-acrylamide) containing 0.3x Tris-borate EDTA and 5% glycerol and run at 10 mA/gel in running buffer of the same composition. 5% native gels were used for the compressed motif experiments to resolve the Sox17 monomer from the lower non-specific band of the HEK293T cells.
The WCL EMSAs throughout the manuscript were generated from lysates of HEK293T cells overexpressing our proteins of interest.The system was optimized by screening several different cell lines, promoters, and transfection condition combinations to find the optimal overexpressed protein to background band ratio.The intensity of the background bands varied between transfections and the protein being overexpressed.All EMSAs were adjusted for equal amounts of the overexpressed proteins being compared based on western blotting and monomer binding.WCL of untransfected 293T cells was added to reactions to equalize the total protein in each lane.
Gels were imaged using a Fujifilm FLA-9000 fluorescence scanner (Fujifilm). Fraction bound was determined by densitometry of raw data using Quantity One® (v4.6.7, Bio-Rad) and the following equation for specific bands and then normalized: $F_B = \mathrm{DNA_{bound}} / (\mathrm{DNA_{bound}} + \mathrm{DNA_{unbound}})$. Half-life was calculated using fraction bound as a function of protein concentration from at least two independent experiments; error bars represent SD.
For competition experiments, pre-formed protein/DNA or protein/nucleosome complexes (see binding conditions above) were loaded onto native gels (t=0) and then incubated with unlabeled double-stranded DNA containing the Nanog locus. Protein dissociation was monitored by removing aliquots of the reaction at the given time points and loading them onto a running gel. Protein complex stability was highly variable; thus, conditions for competition assays were determined empirically and can be found in Table S4.
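The quantification of such off-rate experiments can be sketched as follows. The fraction-bound formula is the one given above; the half-life estimate is an assumption made for illustration only: it treats dissociation in the competition time course as a single first-order decay, F(t) = F0·exp(-kt), and fits ln(F) against time by ordinary least squares, so that t1/2 = ln 2 / k. The fitting model, function names, and example values are hypothetical and are not stated in the methods.

```python
import math

# Hypothetical sketch: fraction bound from band intensities, and a half-life
# estimate ASSUMING single-exponential (first-order) dissociation.


def fraction_bound(dna_bound: float, dna_unbound: float) -> float:
    """F_B = DNA_bound / (DNA_bound + DNA_unbound), from densitometry values."""
    total = dna_bound + dna_unbound
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return dna_bound / total


def half_life(times: list, fractions: list) -> float:
    """Least-squares fit of ln(F_B) versus time; returns ln(2) / k."""
    if len(times) != len(fractions) or len(times) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(f) for f in fractions]  # requires all fractions > 0
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx  # equals -k for a decaying complex
    if slope >= 0:
        raise ValueError("no decay detected; half-life is undefined")
    return math.log(2) / -slope


if __name__ == "__main__":
    minutes = [0, 10, 20, 30]
    # Synthetic data generated with a 15-minute half-life.
    fb = [0.8 * 0.5 ** (t / 15) for t in minutes]
    print(round(half_life(minutes, fb), 2))
```

A log-linear fit is used here only because it needs no external libraries; a nonlinear fit of the untransformed data would weight the time points differently.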
Supershift assays were run under the same conditions as equilibrium or static EMSAs, see above. After incubation of the proteins with DNA for 1 hour at 37 °C, antibody was added to the reactions and incubated at room temperature for 30 min. Antibody/total

Figure 2. Molecular dynamic simulations reveal SL configuration of Sox2/Oct4 (A) Models of Sox2/Oct4 and Sox2 AV /Oct4 heterodimers on Oct4DE DNA in Oct4-specific SL configuration.The snapshots were captured from MDS in (B).(B and C) MDS of Sox/Oct heterodimers on Oct4DE DNA.Plots show the number of contacts (ligancy) between HMG 48 and POU (B), or with Oct4 I21 (C).Detailed in STAR Methods.(D) Models of Sox2/Oct6 and Sox2 AV /Oct6 heterodimers in S configuration on HoxB1 enhancer DNA.(E) Model of Sox2/Oct4 binding in DS configuration on Fgf4 motif.Only DNA-binding domains are shown in (A), (D), and (E).

Figure 3. Enhanced Sox/Oct cooperativity rescues non-functional POU factors in reprogramming (A-C) OSK reprogramming of OG2 MEFs with monocistronic retroviral vectors carrying Oct4 domain deletion of linker (A), NTD or CTD (B), and POU S or POU HD (C). (D) Western blot of whole-cell lysates from HEK293 used in (E). (E) EMSAs with HEK293 lysates on the Nanog promoter and HoxB1 enhancer DNA. (F) Representative kinetic off-rate EMSAs with HEK293 lysates on Oct4DE, Nanog promoter, or Fgf4 enhancer DNA, asterisk = Sox/Oct/DNA. (G) Kinetic off-rate EMSAs with purified proteins on Utf1 enhancer and Nanog promoter DNA. t 1/2 = ternary complex half-life. White arrowheads indicate nonspecific bands (ns) and black arrowheads indicate free DNA or DNA bound by Oct4, Sox2, or the heterodimer. (H) Scheme showing the role of Sox/Oct dimerization in reprogramming. Data in (A)-(C), (F), and (G) represent mean ± SD; n = 3 biological replicates (A-C) or experiments (F and G); Student's t test in (A)-(C).

Figure 4. Highly cooperative Sox2 AV improves the developmental potential of mouse OSKM iPSCs (A) Kinetic off-rate EMSAs with purified Sox2 or Sox2 AV co-bound with Oct4 on 601-SHL + 6 SoxOct nucleosome. (B) Heatmaps and read pileup plots of ChIP-seq for MEF reprogramming samples at day 2 of Dox-induction. (C) Boxplots of ChIP-seq peaks for OKS and KS reprogramming samples. The midline indicates the median, boxes indicate the upper and lower quartiles, and the whiskers indicate 1.5 times interquartile range. p values calculated using the unpaired Wilcoxon rank sum test. (D) Fraction of binding sites containing SoxOct, MORE, both or none of the motifs in OKS reprogramming samples. (E) Genome browser track of ChIP-seq peaks for selected loci. (F) Heatmaps for ChIP-seq signals at the loci containing Sox2/Oct4 footprints in opened chromatin of mESCs versus MEFs, as determined by TOBIAS analysis of ATAC-seq data. 58 (G) Percentage of 4N-aggregated all-iPSC embryos. Data points represent means for each clonal iPSC line. Scale bars represent the mean ± SEM between all lines generated with the same cocktail and delivery method. (H) Adult tetO-OS AV KM all-iPSC mice (9 months). (I) PCR genotyping of the progeny of all-iPSC mice derived from 3 tetO-OS AV KM iPSC lines. (J) Summary of (G)-(I).
and porcine fetal fibroblasts