HNF1A is a Novel Oncogene and Central Regulator of Pancreatic Cancer Stem Cells

The biological properties of pancreatic cancer stem cells (PCSCs) remain incompletely defined and the central regulators are unknown. By bioinformatic analysis of a PCSC-enriched gene signature, we identified the transcription factor HNF1A as a putative central regulator of PCSC function. Levels of HNF1A and its target genes were found to be elevated in PCSCs and tumorspheres, and depletion of HNF1A resulted in growth inhibition, apoptosis, impaired tumorsphere formation, PCSC depletion, and downregulation of OCT4 expression. Conversely, HNF1A overexpression increased PCSC numbers and tumorsphere formation in pancreatic cancer cells and drove PDA cell growth. Importantly, depletion of HNF1A in primary tumor xenografts impaired tumor growth and depleted PCSCs in vivo. Finally, we established an HNF1A-dependent gene signature in PDA cells that significantly correlated with reduced survivability in patients. These findings identify HNF1A as a central transcriptional regulator of the PCSC state and novel oncogene in pancreatic ductal adenocarcinoma.


INTRODUCTION
Pancreatic ductal adenocarcinoma (PDA) is projected to be the 2nd leading cause of cancer deaths in the U.S. by 2020 (Rahib et al., 2014). The exceeding lethality of PDA is attributed to a complex of qualities frequent to the disease, including early and aggressive metastasis and limited responsiveness to current standards of care. While both aspects are in and of themselves multifaceted and can be partially attributed to factors such as the tumor microenvironment (Olive et al., 2009, Provenzano et al., 2012, Waghray et al., 2016) and the mutational profile of the tumor cells (Yachida et al., 2012), cancer stem cells (CSCs) have also been found to contribute to early metastasis and resistance to therapeutics (Hermann et al., 2007, Li et al., 2011). CSCs, which were originally identified in leukemias (Bonnet and Dick, 1997, Graham et al., 2002), have been identified in a number of solid tumors including glioblastoma (Singh et al., 2003), pancreas (Li et al., 2007, Hermann et al., 2007), and colon (O'Brien et al., 2007). In these cases, CSCs have been characterized by the ability to establish disease in immunocompromised mice, resistance to chemotherapeutics, the capability of both self-renewal and differentiation into the full complement of heterogeneous neoplastic cells that comprise the tumor, and the propensity to metastasize. In each case, CSCs are distinguished from other tumor cell types by the expression of various, sometimes divergent, cell surface markers. Our lab was the first to identify pancreatic cancer stem cells (PCSCs), which were found to express the markers EPCAM (ESA), CD44, and CD24 (Li et al., 2007). In addition to these markers, CD133 (Hermann et al., 2007), CXCR4 (Hermann et al., 2007), c-MET (Li et al., 2011), aldehyde dehydrogenase 1 (ALDH1), and autofluorescence (Miranda-Lorenzo et al., 2014) have all been proposed as markers of PCSCs.
In all cases, the identified cells are characterized by the ability to form spheres of cells (tumorspheres) under non-adherent, serum-free conditions, as well as an increased ability to form tumors in mice compared to bulk tumor cells. While a number of markers have been identified for PCSCs, relatively little is known about the transcriptional platforms that govern their function and set them apart from the majority of bulk PDA cells. Transcriptional regulators such as NOTCH (Abel et al., 2014), BMI1 (Proctor et al., 2013), and SOX2 (Herreros-Villanueva et al., 2013) have been demonstrated to play roles in PCSCs, although these proteins are also critical for normal stem cell function in many tissues. In this study we sought to better understand the biological heterogeneity of PCSCs and their bulk cell counterparts in an effort to identify novel regulators of PCSCs in the context of low-passage, primary patient-derived PDA cells. Using microarray analysis and comparing primary PDA cell subpopulations with different levels of tumorigenic potential and stem cell-like function, we identified hepatocyte nuclear factor 1-alpha (HNF1A), an endoderm-restricted transcription factor, as a key regulator of the PCSC state. Supporting this hypothesis, depletion of HNF1A resulted in a loss of PCSC numbers and functionality both in vitro and in vivo.
Additionally, ectopic expression of HNF1A augmented PCSC properties in PDA cells and enhanced growth and anchorage-independence in normal pancreatic cell lines. Mechanistically, we found that HNF1A regulates the stem cell transcription factor OCT4 (POU5F1), which is necessary for stemness in PCSCs. Based on these data we postulate a novel pro-oncogenic function for HNF1A through its maintenance of the pancreatic cancer stem cell state.

An HNF1A gene signature dominates a PCSC gene signature
A transcriptional profile of PCSCs has yet to be established, and we hypothesized that such a profile would contain key regulators of the PCSC state. To pursue this hypothesis, we utilized a series of low-passage, patient-derived PDA cell lines to isolate PCSC-enriching and non-enriching subpopulations for comparative analysis. Using two of our previously described PCSC surface markers, CD44 and EPCAM (Li et al., 2007), we found that low-passage PDA cells generally formed three subpopulations (abbreviated P herein) based on surface staining: CD44 High /EPCAM Low (P1), CD44 High /EPCAM High (P2), or CD44 Low /EPCAM High (P3) (Supplementary Figure 1A). Similar expression patterns were observed in 10 primary tumor samples (data not shown). Using previously described measures of PCSC function (Li et al., 2007, Li et al., 2011), including co-expression of the PCSC marker CD24 (Supplementary Figure 1B), the ability of isolated subpopulations to reestablish cellular heterogeneity (Supplementary Figure 1C), the ability to form tumorspheres under non-adherent/serum-free culture conditions (Supplementary Figure 1D), and the ability to initiate tumors in immune-deficient mice (Supplementary Table 1), we found that P2 cells showed greater enrichment for cells with PCSC properties than their P1 and P3 counterparts.
Using 2 primary PDA lines (NY8 and NY15), P1, P2, and P3 PDA cells were sorted by flow cytometry, prepped immediately for mRNA, analyzed by Affymetrix GeneChip microarray, and validated by quantitative RT-PCR. We found that P2 cells from both lines exhibited a signature of 50 genes that was upregulated (>1.5 fold) relative to both P1 and P3 cell counterparts (Figure 1A). To further refine this gene cohort, we utilized oPOSSUM (Kwon et al., 2012), a web-based system to detect overrepresented transcription factor binding sites in gene sets. Interestingly, HNF1A, a P2 cohort gene itself (Figure 1B), had predicted binding sites in the ±5000 regions (from start of transcription) of 17/50 of the enriched genes, and due to its stringent consensus sequence (DGTTAATNATTAAC) was the most highly ranked common transcription factor by Z-score (17.895). Of these 50 genes, HNF1A is known to positively regulate cohort genes HNF4A (Boj et al., 2001), NR5A2 (Molero et al., 2012), CDH17 (Zhu et al., 2010), IGFBP1 (Babajko et al., 1993, Powell and Suwanichkul, 1993), and DPP4 (Gu et al., 2008).
Interestingly, genome-wide association (GWA) studies have recently identified certain single nucleotide polymorphisms (SNPs) in the HNF1A locus as risk factors for developing PDA (Pierce and Ahsan, 2011b, Li et al., 2012, Wei et al., 2012), although the mechanism by which these SNPs exert their influence is currently unknown. To further support the enrichment of HNF1A in PCSCs, sorted cells were western blotted for HNF1A and one of HNF1A's target proteins, CDH17. Both proteins were found to be elevated in P2 cell lysates compared to other subpopulations (Figure 1C), in agreement with their transcript levels. Additionally, surface expression of DPP4 (CD26) was also found to be highest on cells in the P2 subpopulation (Figure 1D). CSCs are enriched in cancer cell populations grown under low-attachment tumorsphere (S) conditions compared to cells grown in adherent (A) conditions. In keeping with this observation, we found protein levels of HNF1A and CDH17 elevated in multiple PDA lines [...] Figure 2C). This construct showed excellent dependence on HNF1A expression, as targeting HNF1A with an HNF1A-specific siRNA ablated expression of both the ectopic GFP and endogenous CDH17 (Supplementary Figure 2D). Lastly, we found the frequency of GFP-positive cells increased in cells grown in suspension, with GFP expression being highest in the P2 subpopulation of NY15 cells (Supplementary Figure 2E). Based on our gene expression and tumorsphere data, we hypothesized that HNF1A is a central regulator of CSC function.

HNF1A is a critical regulator of CSC function in PDA cells
Consistent with our hypothesis that HNF1A may be an integral component of PDA biology, we observed higher levels of HNF1A protein and transcripts in PDA cells compared to non- [...] in PDA cells, we depleted the protein with two distinct siRNAs (Figure 2B). Knockdown of HNF1A resulted in reduced cell numbers in several primary PDA lines (Figure 2C).
Interestingly, knockdown of HNF1A resulted in more profound growth inhibition in cells with moderate to high HNF1A expression (NY5, NY8, NY15) than in cells with low HNF1A expression (NY90) (Figure 2A, C; Supplementary Figure 3A, B). To determine whether the apparent loss in cell number was due to apoptotic cell death, we performed annexin V/DAPI staining on control and HNF1A-depleted NY5, NY8, and NY15 cells. In all cases, knockdown of HNF1A resulted in a significant (p<0.05) increase in early and late apoptotic cells, while not affecting necrotic cell numbers (Figure 2D). Furthermore, increased cleavage of caspases 3, 6, 7, and 9 was observed in cells depleted of HNF1A (Figure 2E), indicating apoptotic cell death. These data indicate that HNF1A is important for PDA cell growth/survival and that basal expression levels predict response to targeting.
Next, we examined whether depletion of HNF1A impacted PDA subpopulation distribution. We consistently found that knockdown of HNF1A in multiple PDA lines resulted in loss of P2 cells (Figure 3A), supporting a role for HNF1A in maintaining PDA cell heterogeneity. In addition to a decrease in CD44 surface expression, we also observed a marked decrease in CD24 surface expression (Figure 3B, C) and mRNA levels (data not shown), suggesting that loss of HNF1A depletes the CSC compartment. Knockdown of HNF1A did not change surface expression levels of EPCAM (Figure 3A, data not shown). To assess functional consequences of HNF1A depletion on the PCSC compartment, cells (NY8, NY15) expressing HNF1A shRNAs were grown under tumorsphere-promoting conditions. These shRNAs effectively depleted HNF1A as well as CDH17 (Figure 3D), indicating downstream signaling inhibition. Consistent with a role in PCSC function, HNF1A knockdown resulted in a marked reduction in tumorsphere formation (p<0.05) (Figure 3E, F).

HNF1A exhibits oncogenic properties in pancreatic cells
We next sought to determine whether CSC properties could be augmented by ectopic expression of HNF1A in PDA cells. Using doxycycline-inducible expression of HNF1A (Figure 4A, B), we noted increased expression of CD24, CD44, and EPCAM in multiple primary PDA lines (Figure 4B-D, data not shown). To determine whether HNF1A could drive tumorsphere formation, we used NY15 and NY53 (a model for low endogenous HNF1A expression) and found that HNF1A-expressing cells formed ~2.5 fold more tumorspheres than their counterparts (Figure 4E). Similar results were also seen in NY8 cells, which express moderate levels of HNF1A (data not shown).
We next examined the effects of ectopic HNF1A expression in the non-tumorigenic pancreatic ductal cell lines HPDE and HPNE, which were devoid of endogenous HNF1A expression (Figure 2A). Doxycycline-inducible ectopic expression of HNF1A alone or in concert with ectopic KRAS G12D was readily achieved in HPDE cells (Supplementary Figure 4A). Consistent with previous reports, KRAS G12D induced phosphorylation of both ERK1/2 and AKT in HPDE cells. Similar effects were seen in HPNE cells constitutively expressing HNF1A and KRAS G12D alone or in combination (Supplementary Figure 4A). We then tested the impact of HNF1A and KRAS G12D expression, either alone or in combination, on HPDE cell growth.
Under normal growth conditions with serum, control (LacZ) HPDE cells grew to confluency but did not form colonies, presumably due to contact inhibition (Supplementary Figure 4B). Expression of KRAS G12D, however, resulted in colony formation, indicating a bypass of contact inhibition. HNF1A alone resulted in significantly increased colony formation, which was further enhanced by the additional expression of KRAS G12D. Similar effects were seen in HPNE cells (data not shown). In clonogenicity assays, HNF1A-expressing HPNE cells formed similar numbers of colonies to control and KRAS G12D-expressing cells (Figure [...] was potently induced upon HNF1A expression, with nearly 84% of HPNE cells expressing CD24 compared to 0.5% of LacZ-expressing control cells. These data suggest that HNF1A possesses properties of an oncogene capable of cooperation with oncogenic KRAS.

HNF1A is required for tumor growth and cancer stem cells maintenance in vivo
To determine whether HNF1A was necessary for tumorigenesis, we implanted two primary lines (NY5 and NY15) expressing control or two HNF1A-targeting shRNAs orthotopically in the pancreas of NOD/SCID mice. HNF1A-depleted cells showed significantly reduced tumor growth compared to their control cohorts (p<0.05) (Figure 5A, B). Similar results were observed with HNF1A knockdown in subcutaneous xenografts of NY5 and NY15 cells (Figure [...] Figure 5C), the latter suggesting a shift to a more differentiated tumor histology.

HNF1A regulates stemness through OCT4 expression
As a direct relationship between HNF1A and stem cell function has not been reported, we [...] regulation was a key event in HNF1A-dependent stemness, we targeted OCT4 with multiple siRNAs, either in combination or as single sequences. Depletion of OCT4 resulted in a pronounced inhibition of tumorsphere formation, comparable to HNF1A knockdown (Figure 6C-G). Importantly, knockdown of either HNF1A or OCT4 had comparable effects on the protein levels of OCT4A (Figure 6C), the isoform responsible for imparting stemness (Lee et al., 2006).
To determine whether expression of OCT4A was sufficient to rescue stemness of PDA cells depleted of HNF1A, NY8 and NY15 cells were transduced with OCT4A-expressing lentiviruses or vector controls and transfected with HNF1A siRNA. Consistent with our previous results, loss of HNF1A impaired tumorsphere formation in both lines expressing the vector control; however, this effect was overcome by the expression of OCT4A (Figure 6H, Supplementary Figure 6A, B). These data indicate that HNF1A mediates stemness of PCSCs through regulation of OCT4.

An HNF1A gene signature is associated with poor survival in PDA patients
Lastly, we sought to gain insight into the transcriptional activity of HNF1A in PDA and determine whether its transcriptome held prognostic information similar to other signatures in PDA (Bailey et al., 2016, Collisson et al., 2011). In order to identify transcriptional targets of HNF1A, we performed Bru-seq, a variation of RNA-seq which measures changes in nascent RNA levels (bona fide transcription rate) as opposed to the steady-state RNA changes measured by conventional RNA-seq and microarray (Paulsen et al., 2013). Using concomitant ChIP-seq from control and HNF1A-depleted NY8 and NY15 cells with an HNF1A-specific antibody, we identified 243 HNF1A-activated and 46 HNF1A-repressed transcripts shared between NY8 and NY15 (Figure 7A). Of these, 139/243 (57.2%) HNF1A-activated and 11/46 (23.9%) HNF1A-repressed genes showed detectable HNF1A binding (Figure 7B), either distal or proximal to the transcriptional start site, supporting the role of HNF1A as a transcriptional activator. Importantly, a number of known HNF1A target genes exhibited HNF1A promoter-proximal binding and transcriptional responsiveness via Bru-seq/ChIP-seq, including CDH17 (Figure 7C). Additionally, the PCSC marker EPCAM also showed HNF1A distal binding and transcriptional responsiveness, implicating HNF1A as a direct regulator of this gene. CD24, which showed changes in transcription in response to HNF1A loss, did not show direct binding, indicating an indirect mechanism of regulation (data not shown). POU5F1 transcription was found to decrease in both NY8 (34.3%) and NY15 (41.5%) cells, with weak enrichment of a putative regulatory region previously shown to bind HNF1A (Consortium, 2012, Malakootian et al., 2017) (data not shown). These data suggest that while POU5F1 transcription is promoted by HNF1A, the mode of regulation may be a combination of direct and indirect mechanisms.
To assess if the HNF1A-transcriptomic signature might serve as a prognostic tool for poor outcomes in pancreatic cancer patients, as has been observed with CSC signatures in other cancer types (Bartholdy et al., 2014, Eppert et al., 2011, Glinsky et al., 2005 [...] Table 2). By contrast, only 2/237 (0.8%) of HNF1A-activated genes and 1/137 (0.7%) of HNF1A-bound and activated genes were significantly associated with better survival outcomes.
Importantly, expression of HNF1A-activated genes (both HNF1A-bound and unbound) was more likely to be associated with poor survival outcome in patients with PDA than genes selected at random (p<0.05), as determined by a permutation test (N=10,000) (see insets in Figure 6B, inset). Taken together, these data suggest that an HNF1A gene signature may predict poor outcomes in PDA.

DISCUSSION
In this study, we identified the transcription factor HNF1A as a putative regulator of a PCSC gene signature. Functional studies revealed that HNF1A was central not only to the regulation of this gene signature, but also to PCSC function. Depletion of HNF1A effectively inhibited PDA cell growth, tumorsphere formation, and tumor growth, with a loss of PCSC numbers observed both in vitro and in vivo. Mechanistically, HNF1A appears to promote stemness through positive regulation of the pluripotency factor POU5F1/OCT4. Finally, we found that expression of HNF1A-activated genes significantly predicted poor survival outcomes in patients with PDA. These data point to a novel oncogenic role for HNF1A in pancreatic cancer, particularly in PCSCs.
A clear role for HNF1A in PDA has not previously been established. An early study showed that the putative oncogene FGFR4, frequently expressed in PDA (Ohta et al., 1995), is directly regulated by HNF1A through intronic binding sites (Shah et al., 2002). More recently, 73% of PDA samples were found to stain positive for HNF1A (Kong et al., 2015). A more direct role for HNF1A in PDA has been suggested by multiple GWA studies implicating certain SNPs in HNF1A as risk factors for the development of PDA (Pierce and Ahsan, 2011a, Wei et al., 2012, Li et al., 2012). Nearly all of the identified HNF1A SNPs are non-coding and relatively common (minor allele frequencies between 30% and 40%), suggesting these SNPs may serve as potential contributing rather than driving factors in pancreatic tumorigenesis. Interestingly, PDA-associated HNF1A SNPs rs7310409, rs1169300, and rs2464196 are also associated with both an elevated risk (1.5-2 fold) of developing lung cancer and elevated circulating C-reactive protein (CRP). A well-established direct target of HNF1A (Toniatti et al., 1990), CRP is downregulated in patients with inactivating mutations in HNF1A (Thanabalasingham et al., 2011). As several PDA-associated SNPs are associated with elevated CRP, it is therefore possible that these SNPs augment the activity/expression of HNF1A rather than diminish it, as in the case of maturity-onset diabetes of the young 3 (MODY3) variants, which reduce or abolish HNF1A expression or function. Still, a tumor suppressive role for HNF1A in PDA has also been proposed (Hoskins et al., 2014, Luo et al., 2015). In these studies HNF1A was found to possess pro-apoptotic/anti-proliferative properties, contrary to the data in this study. Differences in these results may be technical in nature (control cells in Luo et al. exhibited unusually high baseline apoptosis approaching 50%); however, it is also possible that the role of HNF1A may differ between different molecular subtypes of PDA (Bailey et al., 2016) or in a dynamic manner like fellow transcription factor PDX1 (Roy et al., 2016).
Our data on HPDE and HPNE cells support a partially transforming capacity for HNF1A, wherein it overcomes contact-inhibition and anchorage-dependent growth. As cooperation with oncogenic KRAS was observed in these cells, it is feasible that HNF1A provides additional and acetylated lysine 27 histone H3 supports the involvement of this region in the transcription of OCT4. As this region, rich in repetitive elements/retrotransposons, is not conserved between humans and rodents, it is possible the interaction between HNF1A and OCT4 is an acquisition of human evolution and may explain why OCT4 has not previously been identified as an HNF1A target. Our ChIP-seq data show a weak enrichment of HNF1A within this region and direct HNF1A ChIP shows a 2.5-4 fold enrichment in NY8 and NY15 cells. This lack of a robust identifiable interaction does not preclude direct regulation, but it suggests other indirect mechanisms may be just as important in regulation of OCT4 expression by HNF1A.
Given that 26 HNF1A-activated genes (17 direct targets) were found to be significantly associated with poor survival in patients with PDA, it is likely that multiple target genes contribute to HNF1A's oncogenic influence, and future studies should assess the functions of these genes in PDA to ascertain their value as either potential biomarkers or therapeutic targets. Further studies are also needed with regard to HNF1A's role in the exocrine pancreas and whether its function is redirected during the development of PDA, particularly under the influence of oncogenic KRAS. Overall, this study further validates the importance of HNF1A to PDA while providing a novel and critical role for HNF1A in driving pancreatic cancer stem cells.

Tumor growth assays
8-10 week old, evenly sex-mixed NOD/SCID mice were used for all experiments. Orthotopic implantation of PDA cells to the pancreas has previously been described (Abel et al., 2014).

Cell culture
Low-passage xenograft tumors were cut into small pieces with scissors and then minced completely using sterile scalpel blades. Single cells were obtained as described previously (Li et al., 2007). The cells were cultured in RPMI-1640 with GlutaMAX™-I supplemented with 10% FBS (Gibco), 1% antibiotic-antimycotic (Gibco), and 100 µg/ml gentamicin (Gibco). The cells used in this article were passaged fewer than 10 times in vitro. HPDE cells were a generous gift from Dr. Craig Logsdon (MD Anderson) and were maintained in keratinocyte SFM (Invitrogen) supplemented with the included EGF and bovine pituitary extract as well as 1% antibiotic-antimycotic and 100 µg/ml gentamicin. All cells were routinely tested for mycoplasma contamination using the MycoScope PCR Detection kit (Genlantis, San Diego, CA) and only mycoplasma-free cells were used for experimentation.

Soft agar assays
Low-melting agarose (Invitrogen) was dissolved in serum-free RPMI-1640 with GlutaMAX™-I to a final concentration of 2% at 60ºC and cooled to 42ºC. 200 µL per well of 2% agarose was evenly spread at the bottom of a 24-well dish, followed by 250 µL of 0.6% agarose (diluted with complete keratinocyte SFM and supplemented with FBS to 2.5%), 250 µL of 0.4% agarose/cell suspension, and 250 µL of acellular 0.4% agarose. Each layer was allowed to solidify at 4ºC for 10 minutes and then heated to 37ºC prior to adding the next layer. 500 µL of complete keratinocyte SFM supplemented with 2.5% FBS was added atop each gel and replenished every 3 days.
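The layer concentrations above follow from diluting the 2% stock, which can be checked with the usual C1·V1 = C2·V2 relation. The sketch below is illustrative only: the function name is hypothetical, and it assumes each 250 µL layer is prepared individually from the 2% stock, which the protocol does not state.

```python
def stock_volume_ul(stock_pct: float, final_pct: float, final_volume_ul: float) -> float:
    """Volume of stock agarose (uL) needed so that C1 * V1 = C2 * V2."""
    if not 0 < final_pct <= stock_pct:
        raise ValueError("final concentration must be positive and not exceed the stock")
    return final_pct * final_volume_ul / stock_pct


if __name__ == "__main__":
    # Assumed per-layer preparation: 250 uL layers made from a 2% stock.
    for layer_pct in (0.6, 0.4):
        v_stock = stock_volume_ul(2.0, layer_pct, 250.0)
        v_diluent = 250.0 - v_stock
        print(f"{layer_pct}% layer: {v_stock:.0f} uL stock + {v_diluent:.0f} uL medium or cell suspension")
```

In practice a master mix for several wells would be scaled up with some excess to account for pipetting loss of the viscous agarose.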

Flow cytometry
Flow cytometry was performed as described previously (Li et al., 2007). Cells were dissociated with 2.5% trypsin/EDTA solution, counted and transferred to 5 mL tubes, washed twice with HBSS supplemented with FBS, and resuspended in HBSS/2% FBS at a concentration of 1 million cells/100 µL. Primary antibodies were diluted 1:40 in cell suspensions and incubated for 30 minutes on ice with occasional vortexing. Cells were washed twice with HBSS/2% FBS and incubated for 20 minutes on ice with APC-Cy7 Streptavidin diluted 1:200. Cells were washed twice with HBSS/2% FBS and resuspended in HBSS/2% FBS containing 3 µM 4',6-diamidino-2-phenylindole (DAPI) (Invitrogen, Carlsbad, CA). Flow cytometry and sorting were done using a FACSAria (BD Biosciences, Franklin Lakes, NJ). Side scatter and forward scatter profiles were used to eliminate cell doublets; APC-Cy7 was used to exclude mouse cells. For PatGFP-Luc2 labeling, GFP+/DAPI- cells were isolated by sorting and expanded for one passage prior to implantation. For analysis of apoptosis, APC-conjugated Annexin V and Annexin V binding buffer (BD Biosciences) were used following the manufacturer's recommendations, with 3 µM DAPI added immediately before analysis to stain permeable cells/necrotic debris.

Microarray analysis
Flow sorted NY8 and NY15 P1, P2, and P3 cells were immediately used for RNA isolation using the RNeasy Plus Mini Kit coupled with RNase-free DNase set (Qiagen). Microarrays and analyses were performed by the University of Michigan DNA Sequencing Core. RNA labeling and hybridization was conducted using the Human Genome U133 Plus 2.0 microarray (Affymetrix, Santa Clara, CA). Probe signals were normalized and corrected according to background signal. Adjusted signal strength was used to generate quantitative raw values, which were log-transformed for all subsequent analyses. Single Site Analysis (SSA) for human was used to detect over-represented conserved transcription factor binding sites in the 50 PCSC-enriched genes. The program was run using default settings, which included a conservation cutoff of 0.4, a matrix score threshold of 85%, and search region of 5,000 basepairs upstream and downstream of the start of transcription. The query was entered against a background of 24,752 genes in the oPOSSUM database.
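The selection of PCSC-enriched genes (upregulated more than 1.5-fold in P2 relative to both P1 and P3) can be sketched as a simple filter on log-transformed signals. This is an illustration, not the core facility's pipeline: the log base is not stated in the text, so log2 is assumed here, and the function name, gene labels, and values are all hypothetical.

```python
import math


def p2_enriched_genes(log2_signal, min_fold=1.5):
    """Return genes whose P2 signal exceeds both P1 and P3 by more than min_fold.

    log2_signal maps gene -> {"P1": x, "P2": y, "P3": z} in log2 units (assumed base).
    """
    cutoff = math.log2(min_fold)  # a fold change becomes a difference in log space
    enriched = []
    for gene, by_pop in log2_signal.items():
        # Comparing against the higher of P1 and P3 enforces "relative to both".
        if by_pop["P2"] - max(by_pop["P1"], by_pop["P3"]) > cutoff:
            enriched.append(gene)
    return enriched


# Hypothetical toy values, not data from the study.
toy = {
    "GENE_A": {"P1": 6.0, "P2": 8.0, "P3": 6.5},
    "GENE_B": {"P1": 7.0, "P2": 7.2, "P3": 7.1},
    "GENE_C": {"P1": 5.0, "P2": 7.0, "P3": 6.8},
}

if __name__ == "__main__":
    print(p2_enriched_genes(toy))
```

Requiring the filter to pass independently in each cell line, and keeping the intersection, would correspond to a signature shared by both NY8 and NY15.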

Quantitative reverse transcription-PCR (qRT-PCR)
Total RNA was extracted using RNeasy Plus Mini Kit coupled with RNase-free DNase set (Qiagen) and reverse transcribed with High Capacity RNA-to-cDNA Master Mix (Applied Biosystems). The resulting cDNAs were used for PCR using Power SYBR® Green PCR Master Mix (Applied Biosystems) in triplicate. qPCR and data collection were performed on a ViiA™7 Real-Time PCR system (Invitrogen). Conditions used for qPCR were a 95°C hold for 10 min, followed by 40 cycles of 95°C for 10 s, 60°C for 15 s, and 72°C for 20 s. All quantitations were normalized to an endogenous control, ACTB. The relative quantitation value for each target gene compared to the calibrator for that target is expressed as $2^{-(C_t - C_c)}$, where $C_t$ and $C_c$ are the mean threshold cycle differences after normalizing to ACTB for the sample and the calibrator, respectively.
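As a worked illustration of the relative quantitation described above, the sketch below normalizes mean threshold cycles to ACTB and then compares a sample with its calibrator as 2 raised to the negative difference. The function names and cycle values are hypothetical; a 100% amplification efficiency (a doubling per cycle) is assumed, as the formula implies.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean threshold cycle of the target minus that of the reference gene (ACTB)."""
    return mean(target_cts) - mean(reference_cts)


def relative_quantity(sample_dct, calibrator_dct):
    """Relative expression 2^-(Ct - Cc), where both inputs are ACTB-normalized."""
    return 2.0 ** -(sample_dct - calibrator_dct)


if __name__ == "__main__":
    # Hypothetical technical triplicates; not data from the study.
    sample = delta_ct([24.0, 24.1, 23.9], [17.0, 17.1, 16.9])      # about 7 cycles
    calibrator = delta_ct([26.0, 26.1, 25.9], [17.0, 17.1, 16.9])  # about 9 cycles
    print(f"relative quantity: {relative_quantity(sample, calibrator):.2f}")
```

A sample that crosses threshold two normalized cycles earlier than the calibrator therefore corresponds to roughly a 4-fold higher relative quantity.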

Chromatin immunoprecipitation sequencing (ChIP-seq)
A confluent 15 cm culture plate of cells was used per immunoprecipitation. Cells were fixed with 1% formaldehyde for 10 minutes. Nuclei were collected and chromatin sheared to 1-10 nucleosomes using the SimpleChIP Plus Enzymatic Chromatin IP kit and protocol (Cell Signaling). HNF1A was immunoprecipitated with goat polyclonal antibody C-19 (Santa Cruz). Libraries from HNF1A-immunoprecipitated chromatin and input chromatin were prepared by the University of Michigan Sequencing Core and sequenced on the Illumina HiSeq 4000.

Strand-specific libraries were made using the Illumina TruSeq kit and sequenced on the Illumina HiSeq 4000 platform at the University of Michigan Sequencing Core (Ann Arbor, MI). Genes were recognized as differentially expressed in both cell lines if the fold change after knockdown was greater than 1.5 (and FDR <0.1 in NY15) and the mean RPKM for a given comparison was greater than 0.25 in either HNF1A shRNA#1 or shRNA#2 per cell line.
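One possible reading of these criteria is sketched below. It is an interpretation rather than the authors' pipeline: the class and function names are hypothetical, "fold change greater than 1.5" is taken to mean a change in either direction, a gene is accepted in a cell line if either shRNA comparison passes, and agreement in direction between the two cell lines is not checked.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Comparison:
    """One knockdown-vs-control comparison for a single gene (hypothetical container)."""
    fold_change: float            # knockdown / control
    mean_rpkm: float              # mean RPKM across the two conditions
    fdr: Optional[float] = None   # only required where an FDR cutoff applies


def passes(c: Comparison, min_fold: float = 1.5, min_rpkm: float = 0.25,
           max_fdr: Optional[float] = None) -> bool:
    """Apply the fold-change, expression, and (optional) FDR thresholds."""
    changed = c.fold_change > min_fold or c.fold_change < 1.0 / min_fold
    expressed = c.mean_rpkm > min_rpkm
    fdr_ok = max_fdr is None or (c.fdr is not None and c.fdr < max_fdr)
    return changed and expressed and fdr_ok


def differentially_expressed(ny8: Sequence[Comparison], ny15: Sequence[Comparison]) -> bool:
    """True if at least one shRNA comparison passes in each cell line (FDR applied to NY15)."""
    return any(passes(c) for c in ny8) and any(passes(c, max_fdr=0.1) for c in ny15)
```

For example, a gene with a knockdown/control ratio of 0.5 in one NY8 shRNA comparison and 0.4 (FDR 0.01) in one NY15 comparison would be accepted under this reading, provided its mean RPKM exceeds 0.25 in those comparisons.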

Mapping and analysis of ChIP-seq and Bru-seq
For ChIP-seq, 52-base, single-end reads were aligned to the human reference genome (hg19) using Bowtie v1.1.1 (with options: -n 3 -k 1 -m 1). Peaks were called using MACS v1.4.2 using the default options. MACS peaks overlapping ENCODE blacklist regions were removed (https://www.encodeproject.org/annotations/ENCSR636HFF). Each peak was assigned to the closest gene's transcription start site (TSS). Then, for each TSS, the distance to the nearest peak was measured. If the nearest associated peak was within +/-5 kb of the TSS, it was considered proximal. In the absence of a proximal peak, the nearest associated peak within +/-100 kb of the TSS was considered distal. A gene was recognized as having a proximal or distal peak if at least one replicate in both cell lines identified a proximal or distal peak. If a gene was found to have both proximal and distal peaks (usually due to differences between replicates), the gene was identified as distal if it had distal peaks in both replicates of both cell lines; otherwise it was identified as neither. For Bru-seq, 52-base, single-end reads were aligned first to ribosomal DNA (U13369.1) using Bowtie v0.12.8 and the remaining reads aligned to the human reference genome (hg19) using TopHat v1.4.1. Differential gene expression analysis was performed using DESeq v1.24.0 (R v3.3.1). ChIP-seq and Bru-seq data from this study are available at the NCBI Gene Expression Omnibus (GEO; accession # GSE108151).
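The proximal/distal rule for a single TSS in a single replicate can be sketched as follows. The function name is hypothetical, peaks are reduced to one coordinate each (whether the summit or the midpoint was used is not stated in the text, so this is an assumption), and the peaks passed in are assumed to be those already assigned to the gene on the same chromosome. The cross-replicate and cross-cell-line reconciliation is not reproduced here.

```python
def classify_tss(tss, peak_positions, proximal_bp=5_000, distal_bp=100_000):
    """Label a TSS by its nearest assigned peak: 'proximal', 'distal', or 'none'.

    tss            -- genomic coordinate of the transcription start site
    peak_positions -- coordinates of peaks assigned to this gene (assumed summit or midpoint)
    """
    if not peak_positions:
        return "none"
    nearest = min(abs(p - tss) for p in peak_positions)
    if nearest <= proximal_bp:
        return "proximal"      # within +/-5 kb of the TSS
    if nearest <= distal_bp:
        return "distal"        # no proximal peak, but one within +/-100 kb
    return "none"


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    print(classify_tss(1_000_000, [1_003_200, 1_060_000]))  # nearest peak is 3.2 kb away
    print(classify_tss(1_000_000, [940_000]))               # nearest peak is 60 kb away
```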

TCGA survival analysis
Gene expression and patient survival data for PAAD were obtained through the Broad Institute RSEM genes (normalized) data set and log10-transformed prior to analysis. Samples identified as tumors and of non-neuroendocrine origin were used. Genes were selected based on Bru-seq and/or ChIP-seq as per above and linked (where possible) to TCGA expression data through Entrez ID or HUGO gene symbol. Survival analysis, including Kaplan-Meier estimate and log-rank test, was performed using the R package survival (v2.40-1). Patients were stratified as "high" and "low" based on the median gene expression. For genes within a given set, the fraction of genes associated with reduced (or increased) survival with p < 0.05 was calculated.
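The per-gene test described above (median split followed by a two-group log-rank test) can be illustrated with a small standard-library sketch. The analysis itself used the R package survival; the Python below is a stand-in written for illustration, with hypothetical function names, and it assumes patients at exactly the median are assigned to the "low" group, which the text does not specify.

```python
import math
from statistics import median


def split_by_median(expression):
    """Return a 'high'/'low' label per patient (values equal to the median go to 'low')."""
    cut = median(expression)
    return ["high" if x > cut else "low" for x in expression]


def logrank_p(times_a, events_a, times_b, events_b):
    """Two-group log-rank test p-value (chi-square, 1 degree of freedom).

    times_*  -- follow-up times; events_* -- 1 if death observed, 0 if censored.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e, _ in data if e}):
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, group B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2.0))  # survival function of chi-square with 1 df
```

The p-value alone does not give the direction of the association; deciding whether high expression tracks with reduced or increased survival additionally requires comparing the two groups, for example by the sign of the observed-minus-expected term.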
For each selected set of genes (e.g. HNF1A-activated genes), a permutation test (N=10,000) was performed using randomly-selected sets of genes expressed in NY8 and NY15 cells. The fraction of genes significantly associated with reduced (or increased) survival was calculated (as above) and the resulting null distribution used to test significance of the given selected set of genes. For N=10,000, the estimated error at p=0.05 is ±0.0034 (less than 10%).
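The permutation test above can be sketched as follows: draw random gene sets of the same size from the expressed-gene background, compute the fraction of each set that is significantly associated with survival, and report how often that fraction is at least as large as the one observed for the selected set. The function and argument names are hypothetical, and the exact form of the empirical p-value used in the study is not stated, so the simple proportion below is an assumption.

```python
import random


def permutation_p(selected_fraction, significant_genes, background_genes,
                  set_size, n_perm=10_000, seed=0):
    """Empirical p-value for the fraction of survival-associated genes in a gene set.

    selected_fraction -- fraction of the selected set with log-rank p < 0.05
    significant_genes -- set of background genes with log-rank p < 0.05 (same direction)
    background_genes  -- list of genes expressed in the cell lines
    set_size          -- number of genes in the selected set
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(background_genes, set_size)
        fraction = sum(gene in significant_genes for gene in sample) / set_size
        if fraction >= selected_fraction:
            hits += 1
    return hits / n_perm
```

A common variant adds one to both the numerator and the denominator so that the estimate is never exactly zero.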

Other statistical analysis
The following methods are specific to analysis of the data represented in Figures 1-6 and Supplementary Figures 1, 3, and 5. Data are expressed as the mean ± SEM. Statistically significant differences between two groups were determined by the two-sided Student t-test for continuous data, while ANOVA was used for comparisons between multiple groups. Significance was defined as P < 0.05. GraphPad Prism 6 was used for these analyses.

Study approval
All animal protocols were approved by University Committee for the Use and Care of Animals [...]
Heatmap representing relative fold differences in qRT-PCR expression of 50 cancer stem cell-enriched genes in NY8 and NY15 cells. Per-gene values are relative to P1 or P3, whichever is higher. Gene names in red text indicate predicted HNF1A targets and asterisks (*) indicate known HNF1A targets. P1: EPCAM Low /CD44 High , P2: EPCAM High /CD44 High , P3: EPCAM High /CD44 Low . For all genes, expression levels were normalized to an ACTB mRNA control, n=3 biological replicates. Only genes with a significant (p<0.05) increase in P2 over both P1 and P3 subpopulations are shown. (B) qRT-PCR analysis of HNF1A mRNA expression, normalized to an ACTB mRNA control, from different primary PDA subpopulations (n=3 biological replicates). Statistical difference was determined by one-way ANOVA with Tukey's multiple comparisons test; ***p<0.001, ****p<0.0001. (C) Western blot analysis of HNF1A and target gene CDH17, as well as CD44 and EPCAM as subpopulation controls in sorted PDA cells. (D) Co-localization of HNF1A target gene DPP4 surface expression with EPCAM and CD44 expression is highlighted in red (moderate expression) and yellow (high expression). Related data can be found in Supplementary Figures 1 and 2.
[...] 1.5x10^5 PDA cells were transfected with control (Ctl) or HNF1A-targeting siRNA. Cells were collected and manually counted 3 and 6 days after transfection (n=3 biological replicates). Statistical difference was determined by one-way ANOVA with Dunnett's multiple comparisons test. Red and green p values indicate Ctl vs. HNF1A#1 or #2, respectively. (D) Annexin V/DAPI staining was performed on NY5, NY8, and NY15 cells transfected with control (Ctl) or HNF1A-targeting siRNA (H1, H2) for 3 days. The numbers of viable (annexin V-/DAPI-), early apoptotic (annexin V+/DAPI-), late apoptotic (annexin V+/DAPI+), and necrotic (annexin V-/DAPI+) cells are quantitated (n=4 biological replicates). Statistical difference was determined by one-way ANOVA with Dunnett's multiple comparisons test, with p values indicated to the right of each subpopulation relative to the control subpopulations. (E) Western blot analysis of cleaved caspases in NY8 and NY15 cells following HNF1A knockdown (3 days). Actin serves as a loading control. Related data can be found in Supplementary Figure 3.
[...] Representative CD24 and EPCAM surface staining in cells following HNF1A knockdown (6 days). (C) Quantitation of CD24+ cells in multiple primary PDA cells following HNF1A knockdown, n=4 biological replicates. Statistical difference was determined by one-way ANOVA with Dunnett's multiple comparisons test. (D) NY8, NY15, and NY5 cells expressing LacZ2.1 (L) or two distinct HNF1A-targeting shRNAs (H1 and H2) were lysed and western blotted for HNF1A, CDH17, and Actin, showing effective knockdown of HNF1A and downstream signaling (CDH17). (E, F) NY8 and NY15 cells expressing LacZ2.1 or HNF1A-targeting shRNAs were grown in tumorsphere media on non-adherent plates (1500 cells/well). The number of tumorspheres formed after 6 days was counted (n=3 biological replicates). Representative images of spheres (100X magnification) are shown in (F) with quantitation in (E). Statistical difference was determined by one-way ANOVA with Dunnett's multiple comparisons test.