Nonmonotone invasion landscape by noise-aware control of metastasis activator levels

A major pharmacological assumption is that lowering disease-promoting protein levels is generally beneficial. For example, inhibiting metastasis activator BACH1 is proposed to decrease cancer metastases. Testing such assumptions requires approaches to measure disease phenotypes while precisely adjusting disease-promoting protein levels. Here we developed a two-step strategy to integrate protein-level tuning, noise-aware synthetic gene circuits into a well-defined human genomic safe harbor locus. Unexpectedly, engineered MDA-MB-231 metastatic human breast cancer cells become more, then less and then more invasive as we tune BACH1 levels up, irrespective of the native BACH1. BACH1 expression shifts in invading cells, and expression of BACH1ʼs transcriptional targets confirm BACH1ʼs nonmonotone phenotypic and regulatory effects. Thus, chemical inhibition of BACH1 could have unwanted effects on invasion. Additionally, BACH1ʼs expression variability aids invasion at high BACH1 expression. Overall, precisely engineered, noise-aware protein-level control is necessary and important to unravel disease effects of genes to improve clinical drug efficacy.

Article https://doi.org/10.1038/s41589-023-01344-z As a step toward quantitatively unraveling BACH1ʼs role as a metastasis regulator in TNBC, we developed a generally applicable, two-step genome engineering strategy (Fig. 1c) to integrate synthetic gene circuits into the AAVS1 (adeno-associated virus integration site 1) SHS of any human cell line. Using this method, we established clones with tunable mean and variance of BACH1 levels from the MDA-MB-231 TNBC cell line. Unexpectedly, we discovered a nonmonotone invasion landscape, with invasion increasing, decreasing and then increasing again as mean BACH1 levels increase in such cells in vitro. We confirmed this nonmonotone relationship by examining BACH1 expression distributions of invading versus seeded cells, which indicate directional or disruptive selection. Additionally, BACH1 expression noise aids invasion but only at high BACH1 expression. Moreover, the expression of multiple BACH1 transcriptional targets confirm nonmonotone BACH1 regulatory effects, with additional support from TNBC clinical samples and cell line data. Homozygous BACH1 deletion alters, but does not eliminate, the nonmonotone effects, excluding native BACH1 as the sole cause of nonmonotonicity. Taken together, we uncovered nonmonotone effects of BACH1 in TNBC cells, and here we demonstrate the need and synthetic biology-based possibility of phenotypic landscape mapping to quantitatively understand and control the complex effects of clinically relevant proteins.

Two-step strategy for robust SHS-specific gene circuit integration
To develop a strategy for reliable single-copy gene circuit integration into AAVS1 (ref. 49), which is the most prominent human SHS, we designed a two-step technical pipeline (Fig. 1c). First, we introduce an FRT landing pad (LP) into AAVS1 by generating a double-strand break with CRISPR-Cas9. Second, we flip various gene circuits into the LP by homologous recombination without additional doublestrand breaks 50 .
Second, we integrated a mammalian negative feedback (mNF) synthetic gene circuit into LP clones MB231-1 and 293-3 by Flp 50 recombinase-mediated cassette exchange (RMCE) 51 without double-strand breaks 52 . These mNF gene circuits 53 (Fig. 1d) rely on a self-controlling tetracycline repressor (TetR) for doxycycline (dox)-dependent, low-noise tuning of either the eGFP reporter (mNF-GFP) or a bifunctional 54 eGFP::BACH1 fusion (mNF-BACH1). Using the low uninduced expression of mNF-integrant cells (Extended Data Fig. 1e), we enriched for these cells by double sorting, first GFP low cells without dox and then GFP high cells with 100 ng ml −1 dox induction (Supplementary Fig. 1b), followed by single-cell bottlenecking. Finally, we confirmed single-copy AAVS1-and LP-specific integration of the mNF gene circuits without random insertions even in polyclonal samples (Extended Data Fig. 1f-h).
In conclusion, through SSR/SHS, we generated and validated multiple monoclonal cell lines derived from 293-3 and MB231-1 LP clones, with either the mNF-GFP or the mNF-BACH1 gene circuit integrated into the LP in the AAVS1 locus. Such precisely engineered cell lines are fundamentally necessary for quantitatively tuning target gene expression and studying the corresponding disease-related phenotypic landscapes. levels control single cells and cell populations, tools that can precisely tune not only protein levels but also protein variability in living cells are needed.
Breast cancer is still a leading cause of mortality in women. Laboratory studies have benefitted from cell lines 21 isolated from mammary tumors 22,23 that are either mostly amenable for targeted therapy or multidrug-resistant, metastatic triple-negative breast cancers (TNBCs) 24 . Unlike for most primary tumors, the mutational bases of metastases are complex and ill-defined 25 . Instead, metastatic cells have altered cell population-level features, such as heterogeneous morphologies and transcriptomes 6,26 , with gene expression mean or heterogeneity perturbed by specific transcription factors, such as BTB and CNC homology 1 (BACH1). BACH1 is a metastasis-activator transcription factor 27,28 that represses its own transcription as well as that of metastasis suppressor Raf kinase inhibitor protein (RKIP), also known as PE-binding protein-1 (PEBP1) (ref. 29) (Fig. 1a). Additionally, BACH1 activates multiple metastasis effector proteins such as CXCR4 and matrix metalloproteinase MMP1 (ref. 30). To design effective therapies targeting BACH1 in TNBC 31,32 , the phenotypic and regulatory consequences ('landscape'; Fig. 1b) of altering both the mean and heterogeneity of cellular BACH1 levels should be established. For example, would metastatic behaviors always intensify with increasing BACH1 levels, as typically assumed for metastasis activators? Would increasing BACH1 levels always lower RKIP levels? What would happen upon altering BACH1 variance besides its average? Although these crucial questions are still open, therapeutic BACH1 inhibition is being suggested to diminish metastasis 31,32 , by the unverified paradigm that inhibiting oncogenes or other disease-promoting genes is generally beneficial-that is, their effect landscapes are monotone (Fig. 1b). However, such assumptions are risky without quantitative cellular phenotyping 33 versus the fine-tuned mean and variance of drug-target protein levels 34,35 , initially in vitro, which necessitates genetic tools currently lacking in human cells.
Studying the phenotypic effects of proteins has benefitted from transient (plasmids) or stable (gene deletion or transgene integration) knockout/knockdown or overexpression approaches. However, all these perturbations suffer from two shortcomings. The first shortcoming is their qualitative character, which cannot define quantitatively either the ups and downs of regulatory and phenotypic effects at intermediate protein levels or the effects of protein-level variability. Synthetic biology is addressing this challenge using gene circuits to fine-tune protein-level mean and variability by chemicals or light [36][37][38] . Similar tools have generated cellular fitness landscapes in bacteria 39 and yeast 40 but not yet in mammalian studies. Alternatively, CRISPR methods based on DNA-binding mutated Cas 41 or RNA-cleaving Cas variants 42 have created either transient or randomly integrated transcript-controlling tools but without adjusting the variability. The second shortcoming is that genomic integration of expression control tools-for example, by lentiviral transduction 43 or CRISPR-Cas 44can trigger unwanted, potentially risky off-target genetic and epigenetic alterations. Conversely, native molecular mechanisms, such as epigenetic silencing, can compromise transgene expression in human cells 45 . To overcome these drawbacks, site-specific recombinases (SSRs) targeting human safe harbor sites (SHSs)-that is, SSR/ SHS-can minimize the mutual interference between transgenes and host cell 46 because (1) SSR only interacts with its own recognition sites 47 ; (2) altering SHS genetic sequences does not disrupt cell functions 46 ; and (3) genes inserted into SHSs are well expressed, with modest silencing 46 . Developed and standardized in bacteria and yeast 48 , SSR/SHS is mostly restricted to a few commercial mammalian cell lines carrying SSR recognition sites in unknown genomic SHS loci 37 . Overall, combining synthetic gene circuit engineering with SSR/SHS could address both shortcomings, enabling disease-related fitness landscape mapping while adjusting both the mean and variance of protein levels.

Fluorescence dose responses of LP-integrated mNF gene circuits
Phenotypic landscape mapping requires dose-response measurements to characterize gene expression tunability. To test mNF's expression control capability, we first screened multiple mNF-BACH1 and mNF-GFP clones by measuring the fold change and variance of eGFP expression. Despite their identical descent and genetic background before gene circuit insertion, clones had substantial differences both in the expression fold change and cell-cell variability or noise, measured by the coefficient of variation (CV). We chose from each cell type one low-noise and one high-noise mNF clone with maximum noise difference but relatively similar fold change of mean expression (Extended Data Fig. 2a) for detailed dose-response characterization and subsequent phenotypic landscape mapping. To determine the full dose responses of all eight mNF clones (MB231-based and 293-based, high-noise and low-noise, mNF-BACH1 and mNF-GFP), we next examined the means and CVs of their gene expression after 48 h in constant dox concentrations ranging between 0 and 100 ng ml −1 . Flow cytometry ( Fig. 2a and Extended Data Fig. 2b) and fluorescence microscopy (Supplementary Figs. 2 and 3) consistently indicated a monotone dox-dependent, up to 30-fold increase of mean eGFP fluorescence intensity for every low-noise and high-noise clone (Fig. 2b,c and Extended Data Fig. 2c,d). The expression distributions of high-noise clones were broader, and their CV difference from low-noise clones increased with dox concentration ( Fig. 2c and Extended Data Fig. 2d). These expression features of mNF clones were stable and reproducible in cell culture for up to 4 weeks (Extended Data Fig. 3a,b).
To later study how BACH1 expression noise impacts phenotypes, we sought to establish decoupled noise points with different gene expression noise but with similar mean expression 52,55 . Plotting the CV versus mean of expression revealed broad decoupled noise regimes for MB231 and 293 mNF clones (Extended Data Fig. 3c,d). Interestingly, the low-noise and high-noise MB231 mNF-BACH1 clones had significantly different CVs but nearly identical mean expression, making them suitable to test the phenotypic roles of noise independently of the mean.
To verify that the protein-level measurements reflect BACH1 mRNA expression, we examined BACH1 transcript levels via qRT-PCR. Total BACH1 mRNA levels increased in a monotone dox-dependent manner up to 11-fold in both low-noise and high-noise mNF-BACH1 293 and MB231 clones, and the eGFP and BACH1 mRNA levels correlated positively with slopes near 1 ( Fig. 2d and Extended Data Fig. 3e). In mNF-GFP clones, only eGFP transcript levels increased, without significant changes in transcript levels of BACH1 or its direct downstream regulatory target RKIP (Extended Data Fig. 3f-h).
In addition to the mNF-controlled, ectopic BACH1 copy, all clones contain native copies of the BACH1 gene. Using immunofluorescence to understand the dox dose response of overall BACH1 protein levels expressed from both native and ectopic BACH1 genes, we recapitulated monotone increases and noise differences for total BACH1 protein levels in all mNF-BACH1 293 and MB231 clones ( Fig. 2e and Extended Data Fig. 3i). In mNF-GFP cells, total BACH1 protein levels were statistically indistinguishable from those of LP parental cells ( Fig. 2f and Extended Data Fig. 4a), regardless of induction. In mNF-BACH1 cells, the eGFP reporter was an excellent indicator of BACH1 protein levels because its fluorescence correlated strongly with BACH1 immunofluorescence intensity (Extended Data Fig. 4b). Western blots further confirmed the monotone dox-dependent increase of ectopic BACH1 levels in the MB231 mNF-BACH1 clones and no change in mNF-GFP controls (Fig. 2g).
Moreover, we used hemin to test ubiquitin-mediated co-degradation of BACH1 (ref. 56) and eGFP reporter. Hemin caused a substantial reduction of eGFP fluorescence intensity in mNF-BACH1 cells (Extended Data Fig. 4c) but none in mNF-GFP cells. We also confirmed theoretical expectations of percent BACH1 reductions based on hemin-dependent but dox-independent rate constants of BACH1 degradation (Supplementary Notes 1.1 and Extended Data Fig. 4d). Finally, when we translationally separated BACH1 from eGFP in another SHS/ SSR-generated 293 mNF clone, eGFP intensity decreased minimally upon hemin treatment (Extended Data Fig. 4e).
Overall, these findings indicate that dose responses may differ across mNF clones but remain stable and highly reproducible over time within each clone, supporting mNF as a protein expression-controlling device. The eGFP reporter co-exists and co-degrades with BACH1 via protein fusion, so eGFP fluorescence accurately reports BACH1 protein levels in single cells. The precise BACH1 tuning device that we created can interface with and deliver signals into the native BACH1 regulatory network 29,30 (Fig. 1a), enabling quantitative exploration of phenotypic landscapes and network responses to tunable mean and variance of BACH1 levels.

Noise-aware control shows that BACH1 nonmonotonically regulates cell invasion
As a master regulator 57 , BACH1 plays diverse roles in regulating multiple signaling and metabolic pathways, including its cancer metastasis activator function in TNBC. What exactly does the term 'activator' mean about the effect of BACH1 protein levels on metastasis-a highly complex evolutionary process requiring many steps, including cell migration, invasion, intravasation and extravasation, dissemination, colonization and metastatic outgrowth, each of which occurs with poorly measurable, low probabilities in vivo 58 ? Owing to its complexity, quantitative studies of the entire metastatic process are currently unfeasible. Thus, to focus on a key aspect of BACH1ʼs metastasis activator function, we assayed invasion in vitro, which indicates metastatic potential 59 .
If BACH1 promotes invasion, then its reduction should reduce invasion. To confirm this in MB231 cells, we lowered BACH1 levels in mNF clones by various doses of hemin (Extended Data Fig. 4f). Boyden chamber (transwell) invasion assays 59 indicated that hemin reduced invasiveness (the ratio of invading versus originally seeded cells) over two-fold in both uninduced mNF-BACH1 and mNF-GFP MB231 clones ( Fig. 3a and Extended Data Fig. 4g), as seen with anti-BACH1 short hairpin RNA (shRNA) in 1,833 cells 30 . This indicates similar native BACH1 levels and negligible ectopic BACH1 contribution in all uninduced clones, consistent with the immunofluorescence measurements.

Fig. 4 | BACH1 expression profile shifts of invading cells indicate directional
and divergent selection. a, Flow cytometry expression profile changes of invading low-noise BACH1 cells versus seeded cells at multiple points of the invasion landscape (n = 3). b, MFI (top) and CV (bottom) comparison between invading cells and seeded cells (unpaired two-tailed t-test, n = 3, **P < 0.01, ****P < 0.0001). c, Flow cytometry expression profile comparisons for invading low-noise mNF-GFP cells versus seeded cells at three key dox levels. d, MFI (left) and CV (right) of invading versus seeded cells (unpaired two-tailed t-test, n = 3, P > 0.05). e, Cellular invasion landscape (dashed blue line) of low-noise mNF-BACH1 (BL) clone versus log 10 (mean BACH1 expression) inferred from flow cytometry histograms of seeded and invading cells. Stochastic simulation results (cyan) for cells invading according to their BACH1 level dependent position on the landscape are compared to the experimental data (blue). f, Computationally predicted log 10 (BACH1 expression) profile changes in the invaded cell population at distinct landscape ranges due to directional (fitness slope uphill and downhill) and divergent (fitness valley) selection. Expt., experimental; Inf., inferred; Sim., simulated.
BACH1ʼs role as a metastasis activator is generating interest in BACH1 inhibition by hemin for therapy development in TNBC 31 . Such initiatives illustrate the widespread yet simplistic assumptions and naive expectations from protein inhibitors across drug development, pharma and clinical trials. However, the nonmonotone invasion landscape predicts that BACH1 inhibition could unwantedly promote invasiveness on downslopes of the landscape. To examine this possibility, we modeled how hemin treatment lowers BACH1 levels (Extended Data Fig. 6a,b and Supplementary Notes 1.1) and predicted that applying 1.2 μM hemin at a downslope (0.5 ng ml −1 dox) or 3.2 μM hemin at an upslope (10 ng ml −1 dox) should reduce BACH1 levels to resemble 0.3 ng ml −1 dox and 0.5 ng ml −1 dox induction, respectively (Fig. 3g). Flow cytometry confirmed the expected BACH1 level reductions (Extended Data Fig. 6c,d), and transwell assays proved that BACH1 reduction lowers invasiveness at high BACH1 expression, yet it promotes invasion at mid-range BACH1 (Fig. 3h). These reproducible observations confirm the antagonistic effects of BACH1-reducing hemin treatment on cellular invasiveness (Extended Data Fig. 6e), which was unlikely attributable to BACH1ʼs influence on cell proliferation 27,31 , landscape is independent of native BACH1. a, Western blot examination of native BACH1 protein level dose response in low-noise mNF-BACH1 and mNF-GFP MB231 samples. Native BACH1 bands are highlighted in the red rectangle. b, Quantitation of native BACH1 protein level dose response in low-noise mNF-BACH1 and mNF-GFP MB231 samples using grayscale normalization to internal reference β-tubulin in a. c, mRNA level dose responses of truncated BACH1 isoform, BACH1t, in low-noise mNF-BACH1 MB231 clones. BACH1t is transcribed only from the native copy of BACH1 gene. One-way ANOVA with Tukey's multiple comparisons tests between each dose and uninduced controls, n = 3, *P < 0.05, **P < 0.01. d, Invasiveness significantly decreased due to BACH1 knockout (KO) compared to the wild-type parental population. Invasiveness can be rescued by transient overexpression from the induced mNF-BACH1 circuit but not by BACH1t overexpression. Two-tailed t-test between BACH1-KO sample and every other condition, n = 3, *P < 0.05, **P < 0.01. e,f, Dose responses of MFI (e) and CV (f) for a selected MB231 BK monoclone with AAVS1 site-specifically integrated mNF-BACH1 circuit relative to the low-noise mNF-BACH1 (BL) clones (n = 3). g, Invasiveness of a selected MB231 BK + mNF-BACH1 circuit monoclone relative to low-noise mNF-BACH1 (BL) clones over increasing dox concentrations (n = 3). NS, not significant. Article https://doi.org/10.1038/s41589-023-01344-z because BACH1 upregulation curbed cell proliferation monotonically based on growth curves (Extended Data Fig. 7a,d) and doubling time calculations (Extended Data Fig. 7c,f) in mNF-BACH1 cells, without effects in mNF-GFP cells (Extended Data Fig. 7b,e). Thus, proliferation cannot explain the nonmonotone BACH1 invasiveness landscape.
Overall, these results suggest antagonistic, protein-leveldependent effects of BACH1 overexpression on the invasion of MB231 cells. Moreover, BACH1 expression noise can enhance invasion independently of the mean, in a landscape-dependent manner.

BACH1 invasion landscape mediates phenotypic selection
Cellular evolution can occur by selection of nongenetic variants 17,60 , according to the Price equation [61][62][63] . For example, non-genetic cell-cell differences in BACH1 levels could mediate phenotypic adaptation if BACH1ʼs concentration (1) varies from cell to cell; (2) correlates with fitness; and (3) persists between two consecutive observations under selection. Whereas fitness is typically related to cell proliferation, metastasis correlates better with cellular invasiveness, which depends on BACH1 expression according to a metastatic fitness landscape (Fig. 3g and Supplementary Notes 1.3). Cell populations climb that landscape under four main types of selection 64 , depending on the local geography. First, fitness upslopes impose positive directional selection, which enriches for cells with high BACH1 expression, increasing BACH1ʼs mean in invading cells without upregulation. Second, fitness downslopes impose negative directional selection, with effects opposite to positive directional selection. Third and fourth, as we derive by manipulating the Price equation (Supplementary Notes 1.4), fitness peaks and valleys impose stabilizing and disruptive selection, which should reduce or amplify BACH1ʼs variance, respectively. Overall, testing how BACH1 expression mean and variance change in invading cells could validate the nonmonotone BACH1 invasion landscape.
To test nonmonotone phenotypic selection along the landscape, we performed invasion assays at increasing dox concentrations and examined BACH1 expression profiles of invading MB231 cells harvested below the membrane insert of Boyden chambers. Interestingly, expression distributions and means of the invading versus seeded low-noise (Fig. 4a,b) and high-noise (Extended Data Fig. 8a,b) mNF-BACH1 cells shifted differently at various dox doses: upward at 0, 0.1,1, 2 and 10 ng ml −1 dox but downward at 0.3 ng ml −1 and 0.35 ng ml −1 dox. Strikingly, at 0.5 ng ml −1 and 0.6 ng ml −1 dox, BACH1 distribution in invading cells broadened, and their CV increased compared to the seeded cells, as expected from disruptive selection at a fitness valley. Meanwhile, eGFP distributions of invading and seeded cells were indistinguishable over multiple doses in both mNF-GFP clones, corroborating the flat invasion landscape (Fig. 4c,d and Extended Data Fig. 8c,d).
The expression shifts are not due to genetic mutations, because passaging and reinducing invading cells harvested from 10 ng ml −1 dox caused their BACH1 expression to return to its original distribution, supporting selection of phenotypic variant cells (Extended Data Fig. 8e,f).
Previously, we showed that fitness, like other phenotypes, can vary across genetically identical cells 40,52 . Likewise, the invasiveness of individual cells could differ markedly from the cell population's average invasiveness. To gain insight into single-cell invasiveness and confirm BACH1 expression shifts by phenotypic selection, we inferred single-cell invasion landscapes (Fig. 4e and Extended Data Fig. 9a-d) from experimental data (Supplementary Notes 1.5). Stochastic simulations 65 of cells that invade on single-cell invasion landscapes (Extended Data Fig. 9e,f), according to their fluctuating log 10 (BACH1) levels, confirmed that fitness upslopes/downslopes and valleys can cause the experimentally observed shifts in BACH1 expression mean and variance (Fig. 4f), according to theoretical predictions based on the Price equation (Extended Data Fig. 9g,h).
Overall, mathematical and computational models of various selective invasion effects on cellular BACH1 expression explained histogram shifts observed experimentally, validating the nonmonotone invasion landscape of BACH1 expression in MB231 cells.

Native BACH1 does not cause nonmonotonicity
Ectopic BACH1 transcription is under synthetic gene circuit control, practically unaffected by native transcriptional regulation. On the other hand, BACH1 overexpression perturbs BACH1ʼs native regulatory network (Fig. 1a), which contains multiple feedback loops 29 , and other interactions. Thus, seeking clues to the nonmonotonicity, we investigated if ectopic eGFP::BACH1 had a nonmonotone effect on native BACH1 expression, the two proteins being distinguishable by western blotting based on their size (Fig. 5a). With a peak around 0.5 ng ml −1 dox induction, native BACH1 levels changed in a nonmonotone manner opposite to the invasion landscape upon ectopic BACH1 tuning in MB231 mNF-BACH1 cells, without change in mNF-GFP cells (Fig. 5b). We confirmed these trends at the mRNA level, focusing on the C-terminus-truncated mRNA isoform BACH1t 54 co-expressed with native BACH1 (Fig. 5c).
To further investigate how native BACH1 affects nonmonotonicity, we stably deleted the native BACH1 gene by CRISPR-Cas9, creating the MB231 BACH1-knockout cell line ( Supplementary Fig. 4a,b). BACH1ʼs deletion upregulated its transcriptional target HMOX1 and lowered MB231 BACH1-knockout invasiveness compared to parental MB231 cells ( Supplementary Fig. 4c). These phenotypes were rescued by transient reintroduction of ectopic BACH1, but not BACH1t, which lacks the DNA-binding domain 54 (Fig. 5d). Next, we integrated and tested the Fig. 6 | Nonmonotone transcriptional regulations of metastasis-related targets from BACH1 consistent with nonmonotone invasion landscape. a, RKIP mRNA level changes for increasing dox concentrations in both lownoise mNF-BACH1 (BL) and high-noise mNF-BACH1 (BH) MB231 clones with respect to the corresponding uninduced sample. n = 3, one-way ANOVA with Tukey's multiple comparisons tests between each dose and uninduced controls, *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. b, Immunofluorescence quantitation of RKIP protein dose responses in both low-noise mNF-BACH1 (BL) and high-noise mNF-BACH1 (BH) MB231 clones with respect to the corresponding uninduced sample. n = 3, one-way ANOVA with Tukey's multiple comparisons tests between each dose and uninduced controls, *P < 0.05, ***P < 0.001, ****P < 0.0001. c, Dose-responsive RKIP expression correlates negatively with nonmonotonic BACH1 invasion landscape, r = −0.8766, P = 0.0002. Samples were averaged with n = 3 technical replicates and normalized to the mean of corresponding uninduced sample, Pearson correlation. d, RKIP-dependent invasiveness changes at 0, 0.1, 0.5, 10 ng ml −1 dox concentrations. RKIP was overexpressed at 0.1 ng ml −1 and 10 ng ml −1 dox levels where it was significantly more suppressed, and RKIP was knocked down via shRNAmir at 0 and 0.5 ng ml −1 dox levels where it was much less suppressed. Two-tailed t-test between each experimental pair, n = 3, *P < 0.05, **P < 0.01, ***P < 0.001. e, HMOX1, MMP1 and CXCR4 mRNA level changes at increasing dox concentrations in low-noise mNF-BACH1 (BL) MB231 clones with respect to the corresponding uninduced sample. n = 3, one-way ANOVA with Dunnett's multiple comparisons test between each dose and uninduced controls, *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. f, Regulatory network model that illustrates how iFFLs originating from BACH1 and converging on metastasis effectors such as CXCR4 and MMP1 can potentially underlie the nonmonotone invasion landscape and transcriptional regulation. Combining sharp and gradual response functions of opposite effects (activating and inhibitory) results in nonmonotone response functions. Solid lines indicate known regulatory pathways, whereas dashed lines indicate indirect, somewhat hypothetical pathways. Sigmoidal or gradual functions on the connections represent the characteristics of response curves for pairs of regulators. g, Invasion landscape interpolated based on invasiveness data points versus both mean and noise (CV) of log 10 BACH1 expression, using biharmonic spline interpolation. h, Top: comparison between RNA levels of BACH1 and RKIP in TNBC (n = 28) cell lines and non-TNBC (n = 25) breast cancer cell lines. Bottom: BACH1 mRNA expression in major breast cancer cell lines classified into TNBC and non-TNBC categories. Met, metastasis, Ox., oxidative; RPKM, reads per kilobase of transcript per million mapped reads. Article https://doi.org/10.1038/s41589-023-01344-z mNF-BACH1 gene circuit with a silent BACH1 mutation to avoid cutting by Cas9 in MB231 BACH1-knockout cells ( Supplementary Fig. 4a and Fig. 5e,f). Invasion assays revealed a nonmonotone invasion landscape resembling that of the low-noise mNF-BACH1 clone (Fig. 5h). Thus, nonmonotonicity is independent of native BACH1ʼs presence, but BACH1 deletion reshapes somewhat the MB231 BACH1 invasion landscape.

Transcriptional regulation of BACH1 targets consistent with invasion landscape
Higher levels of transcriptional repressors should reduce the levels of their target proteins. BACH1 is a direct transcriptional repressor of metastasis suppressor RKIP 29 (Fig. 1a), so increasing BACH1 should reduce RKIP expression 66 . Indeed, The Cancer Genome Atlas (TCGA) gene sets and biological functions antagonistically correlated with RKIP and BACH1 overlapped largely, being enriched in functions such as cell motility (Extended Data Fig. 10a,b). However, if RKIP contributes to BACH1ʼs invasion effects, then RKIP's response to BACH1 upregulation might be nonmonotone. To test this, we measured RKIP mRNA levels at increasing dox doses using qRT-PCR in both low-noise and high-noise mNF clones from each cell line. Remarkably, RKIP levels responded to BACH1 upregulation nonmonotonically (Fig. 6a). Increasing BACH1 significantly suppressed RKIP at lower (0.1 ng ml −1 ) and higher (5 ng ml −1 and 10 ng ml −1 ) dox doses, as expected. However, at intermediate dox doses (0.5 ng ml −1 and 1 ng ml −1 ), BACH1 upregulated RKIP, all the way to overexpression. Immunofluorescence corroborated this observation at the protein level (Fig. 6b). Consistently, gene expression data from clinical breast tumor samples revealed that, although RKIP correlates inversely with BACH1 overall, the two transcripts cease to correlate when BACH1 surpasses a threshold, as in our cell line experiments (Extended Data Fig. 10c,d).
To confirm that RKIP contributes to BACH1ʼs invasion effects, we uncovered a significant negative correlation by plotting relative invasiveness versus relative RKIP expression at increasing dox concentrations (Fig. 6c). RKIP overexpression or BACH1 reduction by hemin or shRNAmir 67 , an shRNA embedded into a microRNA backbone, lowered the invasiveness of MB231 LP parental or mNF-GFP cells (Extended Data Fig. 10e). Transiently expressing RKIP or RKIP-targeting shRNAmir decreased and increased invasiveness, respectively (Fig. 6d). Overall, the strong correlation between the nonmonotone invasion landscape and nonmonotone RKIP regulation by BACH1 suggests that BACH1ʼs invasion effects might involve RKIP.
Curious if BACH1ʼs nonmonotone effects extend to other transcriptional targets, we tested how CXCR4 and MMP1 (ref. 30) respond to ectopic BACH1 tuning. We found nonmonotone expression changes that matched the invasion landscape. In contrast, HMOX1 expression decreased monotonically ( Fig. 6e and Extended Data Fig. 10f), similarly to CCND1 expression and cell proliferation upon BACH1 tuning (Extended Data Fig. 10g). Previous studies as well as our calculations (Supplementary Notes 1.6 and 1.7) suggest that BACH1-driven incoherent feedforward loops (iFFLs) can generate the nonmonotone responses 68 (Fig. 6f). Overall, the expression of multiple metastasis-related BACH1 targets, including RKIP, CXCR4 and MMP1, corroborates the nonmonotone BACH1 invasion landscape, suggesting that they might mediate BACH1ʼs effects in MB231 cells. On the other hand, some monotone responses to BACH1 tuning indicate at least two distinct modes of BACH1-driven transcriptional regulation: by iFFLs and by simple regulatory links (Fig. 6f).

Discussion
Protein levels are closer to cellular phenotype than most other molecular characteristics of single cells. Protein level-to-phenotype (P2P) mapping is a crucial step in solving the outstanding genotype-to-phenotype problem of modern biology and medicine 69 and could advance prognosis and treatment of diseases, including cancer. Cancer cells develop abnormality both as a cause and consequence of deviant protein levels 17,29,52 . Artificial, precise protein level perturbations could provide important, causal information for the P2P map 19 , requiring technologies that are available 70 but insufficiently used across human cell lines. Consequently, quantitative, noise-aware P2P mapping is lacking for many phenotypes, including metastatic behaviors. To close this technology-knowledge gap, we demonstrate two-step SSR/SHS genome modification for SHS-specific, repeatable, single-copy integration of protein level tuning synthetic gene circuits, which we apply to map TNBC cell invasiveness versus the average and variability of BACH1 levels.
In this study, we combined SSR/SHS with synthetic gene circuit engineering to precisely control both the average and variance of BACH1 levels in a TNBC cell line. Contrary to the assumption that higher mean levels of a metastatic activator should promote more invasion 30 , we reveal a nonmonotone invasion landscape (Fig. 6g), showing that BACH1 can suppress invasion within a certain expression range. Furthermore, BACH1 nonmonotonically regulates the expression of multiple genes, such as RKIP 29,30 , CXCR4, MMP1 and even BACH1 itself, acting alternatively as an activator and a repressor over various expression ranges. We propose that such nonmonotone effects arise from combining opposite (activating and repressing) sharp and gradual responses to BACH1 (Fig. 6f) via iFFLs 68 . The microRNA Let-7 might be part of these iFFLs, because its targets can respond sharply to upstream expression changes 29 . Moreover, both BACH1 knockdown and overexpression repress SNAI2 expression 71 , and BACH1 can both activate and repress its targets 28 , which further imply iFFLs. However, HMOX1 and other regulatory targets have monotone responses to BACH1, indicating at least two distinct modes of gene regulation (Fig. 6f). Breast cancer cell line RNA sequencing (RNA-seq) data 21,72 indicate higher BACH1 and lower RKIP expression in TNBC cell lines, suggesting TNBC-specific alterations of this regulatory network (Fig. 6h). Anticorrelation between BACH1 and RKIP in the TCGA breast cancer dataset up to a BACH1 level echoes our findings in MB231 cells. Identifying the unknown BACH1-RKIP interactors mediating this nonmonotonicity warrants further studies. Studies in bacteria 33,73 , yeast 34,35 and human cells 41 indicate that nonmonotone fitness landscapes are common, including the effects of oncogenes and other drug targets, raising concerns about adverse effects of inhibiting them simplistically. Indeed, lowering BACH1 levels 31,32 around landscape downslopes can adversely promote invasion and possibly metastasis. Counterintuitively, near such downslopes BACH1 upregulation may be desirable to reduce both invasiveness and cell proliferation, to improve clinical benefits. If cell lines represent inter-patient or intra-tumor diversity, then intermediate BACH1 expression in MB231 cells (Fig. 6h) suggests that BACH1 suppression might boost invasiveness in TNBC. Likewise, gene therapies or immunotherapies should consider the adversity of improper therapeutic gene expression by phenotypic landscape mapping.
We recently suggested 19 that gene expression noise can aid or hinder drug resistance or metastatic steps 6,10,17,19,74 . Likewise, BACH1 expression noise can facilitate TNBC cell invasion at high BACH1 levels, whereas, at low BACH1 levels, noise may hinder invasion, as BACH1 noise interplays with the geography of the invasion landscape (Fig. 6g). So, suppressing or enhancing noise accordingly by chemicals 75,76 or gene circuits 52 could diminish metastatic tendencies. However, to avoid unwanted side effects, such noise control should also consider mapping cellular fitness versus the CV.
It will be interesting to similarly define uni-dimensional and multi-dimensional landscapes by recruiting new SHSs 77 for other genes, phenotypes and human cell types in vitro or in vivo, exploring their predictive value for long-term evolution 52,74,78 . Expression shifts by phenotypic selection 64 could confirm the landscapes, including fitness valleys that allow evolutionary branching 79 or possibly phenotypic bifurcations 80 of cancer cells in the proliferation-invasion space. These research strategies should be scalable to many genes and cell types, enabling quantitative phenotypic landscape mapping to unravel disease biology or to improve the accuracy and efficiency of drug development.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, Article https://doi.org/10.1038/s41589-023-01344-z acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41589-023-01344-z.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.

Cell culture
MDA-MB-231 and HEK293 cells were from the American Type Culture Collection. All engineered versions of HEK293 cells (referred to as 293) were cultured in DMEM media with 10% FBS and 1% penicillinstreptomycin. All engineered versions of MDA-MB-231 cells (referred to as MB231) were grown in RPMI 1640 media with 5% FBS and 1% penicillin-streptomycin. Both cell lines were maintained in Panasonic MCO-170AICUVL-PA cellIQ Series CO 2 incubators at 37 °C and 5% CO 2 and passaged regularly every 2-4 d. The cells were used in experiments within 15 passages after their arrival in the laboratory.

Plasmid construction
For the generation of the LPutopia-bearing cell lines (LP clones), the LPutopia-7 genome-targeting vector was constructed based on the earlier version LPutopia-3 assembled from cloning vectors DC-RFP-SH01 (human AAVS1 safe harbor gene knock-in kits and clones, GeneCopoeia) and Addgene plasmid 92078 PB_CMV_GFP_FRT. The LPutopia-7 vector contains HA regions to the human AAVS1 locus, which can site-specifically recombine after double-strand break generation by CRISPR-Cas9. In between the HA regions, LPutopia-7 contains a reporter-selectable cassette consisting of a CMV promoter-driven eGFP reporter and a thymidine kinase (TK) promoter-driven neomycin resistance gene, all flanked by heterotypic FRT sequences (FRT/FRT3). Outside the HA regions, the vector bears RFP as a secondary negative selection marker, which wards against random integration events.
For constructing the RMCE vectors, we first built pUt-NF-BACH1 that contains a TetR-based mNF gene circuit controlling the expression of the GFP::BACH1 fusion based on the BACH1 open reading frame (ORF) from GeneCopoeia (NM_206866, HPRM54453). The mNF cassette was obtained from Addgene plasmid 128253, pDN-D2irTN2AG5kwh. Besides the mNF unit, the vector also included a separate CMV promoter-driven copy of the blasticitin resistance gene BsrS2 from pUNO-CodA::Upp (InvivoGen) as a positive selection marker for successful RMCE events. Heterotypic FRT sequences (FRT/FRT3) oriented as in the LPutopia-7 vector flanked both the mLin gene circuit and the BsrS2 gene. Outside the FRT/FRT3 sequences, a PGK promoter-driven herpes simplex virus (HSV) TK gene obtained from pHR(KRAS.B)-GFP (GeneCopoeia) was also integrated into the vector as a negative selection marker for non-specific integration events. The RMCE vector pUt-NF-BACH1-P2A-GFP was subsequently constructed by adding the P2A sequences between the BACH1 and GFP ORFs. Finally, the RMCE vector pUt-NF-GFP was obtained by deleting the BACH1 ORF from the mNF-BACH1 plasmid, keeping everything else the same as described above.

PCR genotyping and copy number determination
PCR genotyping was performed using 50-100 ng of genomic DNA with OneTaq DNA Polymerase (OneTaq Quick-Load 2× Master Mix with Standard Buffer, New England Biolabs, M0486S) in 25-μl reactions. Primer sequences for constructs in Extended Data Fig. 1a,f are listed in Supplementary Tables 2 and 3.
The relative transgene integration copy number for each monoclonal sample was determined for eGFP with the TaqMan Copy Number Assay, using the TaqMan Fast Advanced Master Mix (Thermo Fisher Scientific, 4444557). For each qPCR reaction, 100 ng of genomic DNA was run using the QuantStudio 3 Real-Time PCR System (Eppendorf, A28137) in standard curve mode. We used equal amounts of purified genomic DNA of each sample and human RNase P (RPPH1) as internal reference (Supplementary Table 4). eGFP copy number in every clone was calculated based on determined copy number of RPPH1 reference in both 293 genomes and MB231 genome using the ΔΔCt method.

Transfection and cell sorting
We applied lipofection to transfect all HEK293-derived cells. Before transfection, cells were plated in six-well plates and grown to ~80% confluence. Then, LPutopia-7/espCas9 or pCAG-Flpo/NF circuit donor vector combinations were co-transfected at a 1-to-1 ratio with a final mass of 2.5 μg per well. The vectors were first incubated with 3.75 μl of Lipofectamine 3000 (Invitrogen, L3000-015) in OPTI-MEM media (Gibco, 31985062) for 15-30 min. The resulting DNA-lipid complex was then pipetted onto the cells and then incubated for at least 24 h before refreshing media. Appropriate drugs for selection were added 72 h after transfection. Drug selection lasted for at least 14 d before fluorescence-activated cell sorting (FACS). We used 1,000 μg ml −1 of G418 for HEK293 LP cell selection and 10 μg ml −1 blasticidin with 10 μg ml −1 ganciclovir for HEK293 mNF cell selection.
For all MDA-MB-231-derived cells, nucleofection was performed based on the manufacturer's instructions and recommendations. In brief, newly thawed MB231 cells were plated in a T-25 flask and subcultured 3-5 d before nucleofection. Next, cells were harvested by adding trypsin and counted using a Cellometer Auto T4 (Nexcelom Bioscience). Around 2 × 10 6 cells were collected and centrifuged at 200g for 10 min at room temperature. Then, the supernatant was removed, and cells were resuspended in 100 μl of room temperature Nucleofector Kit V (Lonza, VCA-1003) solution. The LPutopia-7/espCas9 or pCAG-Flpo/ NF circuit donor vector combinations were co-transfected at a 1-to-1 ratio with a final mass of 2 μg per sample. The cell/DNA suspension was transferred into the certified Nucleofector cuvette, and the X-013 program of Nucleofector 2b Device (Lonza, AAB-1001) was applied. Finally, transfected cells were buffered with fresh media and gently transferred into a freshly prepared six-well plate. Drug selection started 24-48 h after nucleofection and lasted for at least 14 d before FACS. We used 700 μg ml −1 of G418 for MB231 LP cell selection and 5 μg ml −1 blasticidin with 10 μg ml −1 ganciclovir for MB231 NF cell selection.
After verification of the native BACH1 knockout, MB231 BACH1-knockout cells were further co-transfected with LPutopia-7 donor vector and AAVS1 sgRNA to generate the stable MB231 BK-LP parental cells as described above. Meanwhile, the ectopic BACH1 Nature Chemical Biology Article https://doi.org/10.1038/s41589-023-01344-z sequence was single-site mutated at nucleotide 177, changing it from C to T, to disable the PAM site recognition by Cas9, to avoid unwanted cutting of the ectopic BACH1 copy. Later, the silent-mutated pUt-NF-BACH1 circuit was exchanged into the MB231 BACH1-knockout-LPutopia cells through the same RMCE process and selection method as for the other cell lines. We enriched for recombinants and then performed monoclonal screening to minimize the unpredicted side effects of genome instability induced from constitutively expressed Cas9.

Fluorescence microscopy
Microscopy was performed 48 h after induction. Cells were imaged in 24-well plates before flow cytometry using a Nikon Eclipse Ti-E inverted microscope with a DS-Qi2 camera (14-bit) for phase contrast and fluorescence images. A ×10 Ph1 objective (type: CFI Plan Fluor) was used in phase contrast and fluorescence mode imaging.

Flow cytometry
For each sample, newly thawed cells were cultured for one passage before the experiment. Next, around 50,000-80,000 cells harvested from 80% confluent T-25 flasks were plated into 24-well plates, with three technical replicates for each inducer concentration. Dox was added into each well to obtain concentrations ranging from 0.01 ng ml −1 to 100 ng ml −1 . Cells were incubated for 2 d (48 h) and then collected into a 96-well plate at a final volume of 250 μl per well and then read on a BD LSRFortessa flow cytometer with High Throughput Sampler at the Stony Brook Genomics Core Facility. GFP fluorescence signal data from at least 10,000 events for each dox concentration were collected within a predefined FSA/SSA gate in the FITC channel with identical PMT voltage settings across all induction levels of every sample ( Supplementary Fig. 5).
To determine the reproducibility of gene expression histograms, cells were freshly thawed and incubated for the first week before flow cytometry testing. Then, we performed the dose-response measurements for three technical replicates as described above repeatedly over 4 weeks for each sample at several selected dox concentrations and compared the fluorescence intensity means between measurements from three timepoints.

RNA isolation and qRT-PCR
For qRT-PCR, 100,000-300,000 cells were first pre-induced for 48 h with each dox concentration in six-well plates. RNA was then isolated using the RNeasy Plus Mini Kit (Qiagen, 74134). Then, 1 μg of total RNA from each sample was converted to cDNA using iScript Reverse Transcription Supermix (Bio-Rad, 1708841). Next, qPCR reactions were set up using TaqMan Fast Advanced Master Mix (Thermo Fisher Scientific, 4444557) with TaqMan Gene Expression Assay and run using the QuantStudio 3 Real-Time PCR System (Eppendorf, A28137) in standard curve mode, using the TaqMan probes listed in Supplementary Table  4. On the other hand, customized qPCR primers for detecting BACH1t were designed using PrimerBlast (NCBI) and ordered from IDT and were verified for specificity and efficiency using serial-diluted positive and negative control DNA. The GAPDH reference primer pair was a pre-designed product from IDT (Hs.PT.39a.22214836) and was also verified for efficiency using genomic DNA control. Eventually, we quantitated BACH1t mRNA levels using the PowerUp SYBR Green Master Mix (Thermo Fisher Scientific, A25741) with GAPDH level as reference in three independent repeats. BACH1t primer sequences are listed in Supplementary Table 4.

Immunofluorescence
In total, 300,000-500,000 cells were first pre-induced over 48 h with each dox concentration in six-well plates or T-25 flasks. Cells were next harvested by trypsin and neutralized with fresh media, followed by centrifugation at 500g for 5 min and vacuum aspiration of the supernatant. Cells were then fixed with 750-1,000 μl of freshly made 4% paraformaldehyde at room temperature for 15 min, followed by washing with 750-1,000 μl of PBS and centrifugation at 500g for 5 min. The supernatant was discarded thoroughly, and cells were resuspended for 30 min in 750-1,000 μl of ice-cold methanol at −20 °C. Then, cells were washed with 750-1,000 μl of PBS again and centrifuged at 500g for 5 min, followed by vacuum aspiration of the supernatant. Cell pellets were resuspended in 100 μl 1:50 BACH1 antibody (Santa Cruz Biotechnology, sc-271211 AF647) or 100 μl 1:50 RKIP antibody (Santa Cruz Biotechnology, sc-376925 AF647) diluted in incubation buffer (1× PBS and 0.5 g of BSA) and incubated for 1 h at room temperature protected from light. After incubation, cells were washed again with 500 μl of excess incubation buffer and centrifuged at 500g for 5 min, with the supernatant discarded. Finally, cells were fully resuspended in 500 μl of PBS and run on a BD LSRFortessa flow cytometer, collecting about 10,000 events per sample. Fluorescence readouts were collected from the red APC channel with the PMT voltage set to 350 V, and the readout of a few samples was normalized to 350 V based on voltage reference samples.

Hemin preparation and treatment
Hemin (Sigma-Aldrich, H9039-1G) was prepared in 10 mM NaOH solution and further diluted in culturing media to the desired concentrations for cell treatment. For expression level measurement, hemin was added into cell culturing media after 48-h dox induction and maintained for 48 h together with dox before flow cytometry. For invasion measurements, hemin was added into cell culturing media after 48-h dox induction and maintained for 48 h together with dox before the Boyden chamber assays.

Boyden chamber invasion assays
The 24-well invasion assay plates were purchased from Thermo Fisher Scientific (353097). Each Boyden chamber membrane was coated with a thin layer of 1× Basal Membrane Extract (BME) solution (diluted from 5× Basal Membrane Extract solution, Thermo Fisher Scientific, 3455-096-02) and incubated overnight at 37 °C. Cells were pre-induced at each dox concentration for 48 h and serum-starved for another 24 h while maintaining the dox concentration constant. Then, the cells were trypsinized and centrifuged at 500g for 5 min, followed by two rounds of PBS washes to remove any remaining serum-containing media. Then, the cells were resuspended and roughly diluted to a 0.5 × 10 6 concentration (three replicate measurements using the Nexcelom Cellometer). Then, we seeded 100 μl of serum-free media from each suspension with 45,000-60,000 cells for each Boyden chamber, setting up three replicates in separate chambers for each dox concentration. We used 10% serum as the chemoattractant in these assays. After 16-24 h, to stain for live cells, we applied NucBlue Live ReadyProbes (Thermo Fisher Scientific, R37605) to the membranes for 2 h at 37 °C in the dark. Cells in the top chamber were removed from the membrane with a wet cotton swab. Next, cells in the bottom chamber were imaged in the DAPI/GFP/BF channels using the microscopy setup described above. We imaged five random fields within the insert using a ×10 Ph1 objective (type: CFI Plan Fluor). To calculate invasiveness, we estimated the total under-membrane area from the imaged area, using an area factor of 21.54, because each frame was 1.18 × 1.18 mm 2 , and the total area was 0.3 cm 2 . Then, we multiplied the area factor with the average cell count from five random fields to estimate the total invading cell number, which we divided by the total seeding cell number to obtain invasiveness. We estimated invasiveness for three replicates in each dox condition and presented the results as mean ± s.d.

Proliferation assays
For proliferation assays, cells were dox induced 48 h before seeding, and 3,000-5,000 cells (depending on the cell line) were plated in 96-well plates with 12 replicates in each dox condition. Around 6 h after seeding, the first three replicates were assayed using alamarBlue HS Cell Viability Reagent (Invitrogen, A50100) for viable cells. Cells were incubated in alamarBlue reagent for 4 h, and then absorbance measurements were taken at wavelengths of 570 nm and 600 nm, with media blank control using a Tecan Infinite Pro 200 spectrophotometer. Each of the remaining three replicates was then successively measured every 24 h until 72 h endpoint. Each absorbance value was adjusted by subtracting the media blank absorbance at the same wavelength. Cell proliferation was measured as the alamarBlue reduction score (S) calculated as: where O 1 and O 2 are the molar extinction coefficients of oxidized alamarBlue at 570 nm and 600 nm, respectively, and A 570 and A 600 are the absorbances of test wells at 570 nm and 600 nm, respectively. Relative proliferation was then calculated as the fold change between the average scores of the induced wells to uninduced control wells. Cell doubling times T d were calculated from the average relative proliferation fold change between timepoints 0 h and 48 h, as follows: (2) ln( where t 2 and t 1 are the times of measurement (48 h and 0 h, respectively), and f 2 and f 1 are the average relative proliferation fold changes measured at times t 2 and t 1 , respectively.

Immunoblotting (western blotting)
The tissues were lysed in RIPA buffer supplemented with protease and phosphatase inhibitor at 5 mg ml −1 concentration. The supernatant containing proteins was collected after centrifuging tissue lysates at 12,000 r.p.m. at 4 °C. Protein concentration was determined by the BCA protein assay kit, and 20 μg of protein samples was boiled and loaded onto SDS-PAGE gels. The gels were transferred to 0.22-μm nitrocellulose membranes and blocked with 5% non-fat milk in 1× Tris-Buffered Saline containing 0.1% Tween 20 (TBST). The membranes were incubated with primary antibodies against BACH1 (Santa Cruz Biotechnology, sc-271211) and β-tubulin (Santa Cruz Biotechnology, sc-55529) at 4 °C overnight. After three washes (15 min, 5 min and 5 min) with 1× PBS containing 0.1% Tween 20 (PBST), the membranes were incubated with a rabbit secondary antibody conjugated with horseradish peroxidase (1:2,000) for 1 h, followed by three washes (15 min, 5 min and 5 min) with PBST. A chemiluminescence reagent kit was used to visualize protein bands with horseradish peroxidase secondary antibodies.

Breast cancer cell line expression analysis
Raw RNA-seq data of BACH1 and RKIP were directly acquired from the cBioPortal database with its source from the Cancer Cell Line Encyclopedia. The original cell line annotation from the encyclopedia did not contain TNBC status information. So, we assigned TNBC versus non-TNBC status to the breast cancer cell lines based on the existence of ER, PR and HER2 markers with the reference to previous report. The expression distributions of both BACH1 and RKIP were arranged from high to low level with TNBC and non-TNBC subtypes separately marked. MDA-MB-231 cell status was particularly labeled in both distributions.

TCGA and gene set enrichment analysis
We normalized RNA-seq results from TCGA BRCA samples (provisional, n = 1,100) directly downloaded from the cBioPortal database (https://www.cbioportal.org). Likewise, we downloaded gene lists correlated with BACH1 and RKIP, as cBioPortal already has such correlation matrices generated for the TCGA BRAC provisional set. BACH1 expression baseline was defined to be the average expression of samples in which BACH1 was diploid, and BACH1 lower and higher expression groups were classified based on the z-score relative to the baseline. Functional gene set enrichment analysis (biological process enrichment analysis and molecular function enrichment) of the gene sets that correlate with BACH1 and RKIP was performed using the web-based interface of PANTHER (http://pantherdb.org). To identify processes and functions enriched in the input gene lists, we used Gene Ontology annotation categories.

Data processing and statistical analysis
Flow cytometry data were analyzed with FlowJo software version 10 (Becton Dickinson). Forward-scatter and side-scatter gates were predefined for each cell type or assay based on the reference sample pre-tests to exclude debris. Also, a fluorescence-based gate was imposed for FACS for desired target cells. Imaging data were collected and mainly analyzed using Nikon Elements AR version 4.40.00 (Build 1084). Fiji (ImageJ 1.52a) and the Image Processing Toolbox from MATLAB (MathWorks) were also used for image processing and analysis. Most of the data plots as well as statistical analysis were generated and performed using MATLAB or GraphPad Prism 8 (GraphPad Software). Statistical details are in the figure legends, including the statistical tests used. In all figures, results are presented as mean ± s.d. unless otherwise noted in the figure legend. *P < 0.05 was considered statistically significant, as indicated by an asterisk in the figure legend.

Computational modeling and mathematical derivations
We used MATLAB (R2020b) for computational analyses and simulations. We converted single-cell expression data to the log space by taking their log 10 values. The flow cytometry histograms became approximately Gaussian. Following previous work 13 , we developed exact simulations of Ornstein-Uhlenbeck processes 65 according with the means and standard deviations matching those of the log-transformed data. We simulated cell invasion by a standard Monte Carlo approach, allowing each cell to invade if a random number pulled from a standard uniform distribution was lower than the landscape value of that cell's simulated log 10 (BACH1) levels. The number of such invading cells versus the original cell number defined the simulated invasiveness. Simulated histograms of invaded and control cells were generated by binning log 10 (BACH1) levels. For details on inferring the landscape and performing the simulations, see Supplementary Notes 1.3 and 1.5.
We used standard algebra and properties of moments for stochastic variables to derive the shifts in the mean and variance based on the Price equation. For details, see Supplementary Notes 1.4.

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
The authors declare that all data supporting the findings of this study are available in the article and its supplementary files. Data for the main figures and extended data figures are provided in the source data files and supplementary information files for the supplementary figures. Raw data can be accessed at https://openwetware.org/wiki/CHIP:Data. Source data are provided with this paper.

Code availability
Custom MATLAB code files are available at https://github.com/ gaborbalazsi/BACH1Landscape. Applying the same hemin treatment levels on the unfused BACH1 and GFP target revealed direct connection between BACH1 protein and GFP fluorescence intensity when they are fused, n = 3. (f) Relative changes of fluorescence intensity at 10, 20, 30, 50 and 80 μM hemin concentration compared to no treatment control in both mNF-BACH1 (top) and mNF-GFP (bottom) MB231 clones, n = 3. (g) Invasiveness universally drops over 50% in all selected MB231 clones upon 48 hours of 50 μM hemin treatment. Mean ± S.D. of n = 3 independent replicates. One-way ANOVA within each condition group, P > 0.05. Unpaired two-tailed t-test for each comparison between control and hemin treated sample, **P < 0.01, ***P < 0.001, ****P < 0.0001. from a mathematical model of dose-response of hemin-promoted BACH1 degradation using best-fit parameters to experimental data with autofluorescence background subtraction. BL: low-noise clone; BH: high-noise clone. Modeling details can be found in Supplementary Notes 1.1. (c)(d) Experimental identification of hemin concentrations needed to achieve BACH1 degradation to specific levels in both low-noise and high-noise clones. Starting from 0.5 ng ml −1 and 10 ng ml −1 Dox-induction, we added hemin to reduce BACH1 levels equivalent to 0.3 ng ml −1 and 0.5 ng ml −1 Dox induction, respectively. n = 3, unpaired two-tailed t-test, P > 0.05. (e) Reproducibility of invasiveness in two independent experiments for both mNF-BACH1 clones at multiple Dox doses (Left) and both mNF-GFP clones without induction (Right).