4th generation nonsteroidal aromatase inhibitors: An iterative SAR-guided design, synthesis, and biological evaluation towards picomolar dual binding inhibitors

One in every eight women will be diagnosed with breast cancer during their lifetime and approximately 70% of all patients are oestrogen receptor (ER) positive depending upon oestrogen for their growth accounting for third generation aromatase (CYP19A1) inhibitors being the mainstay in the treatment of ER-positive breast cancer. Despite the success of current aromatase inhibitors, acquired resistance occurs after prolonged therapy. Although the precise mechanisms of resistance are not known, lack of cross resistance among aromatase inhibitors drives the need for a newer generation of inhibitors to overcome this resistance alongside minimising toxicity and adverse effects. Novel triazole-based inhibitors were designed based on previously published parent compound 5a , making use of the now available crystal structure of CYP19A1 (PDB 3S79), to make modifications at specific sites to explore the potential of dual binding at both the active site and the access channel. Modifications included adding long chain substituents e.g. but-2-ynyloxy and pent-2-ynyloxy at different positions including the most active compound 13h with IC 50 value in the low picomolar range (0.09 nM). Aromatase inhibition results paired with molecular dynamics studies provided a clear structure activity relationship and favourable dual binding mode was verified. Toxicity assays and CYP selectivity profile studies for some example compounds were performed to assess the safety profile of the prepared inhibitors providing the basis for the 4th generation nonsteroidal aromatase inhibitors.


Introduction
Breast cancer is a complex collection of heterogenous neoplastic, often recurrent, diseases affecting 1 in 8 women [1][2][3][4]. The number of women diagnosed with breast cancer has risen from 1.68 million to 2.1 million between 2012 and 2018 ranking first place of highest incidence malignances among women with 24% and only second to lung cancer among all populations [3][4][5]. Different subclasses of breast cancer are responsible of 14.3% of all female cancer related deaths [5,6], approximately 70% of which show dependence on oestrogen/ER signaling for their growth, thus called ER positive [7].
As early as the 1890s, hormonal targeted therapy gained a crucial role over the years in oestrogen deprivation and control of breast cancer [8,9]. ER-α is often directly involved in tumour growth providing the basis for two different classes of antihormonal therapy; interference with binding of oestrogen to the receptor, which is divided into selective oestrogen receptor modulators (SERMs) that are competitive inhibitors of the ER e.g. tamoxifen, and selective oestrogen receptor degraders (SERDs) e.g. fulvestrant ( Fig. 1) [10]. The second strategy is oestrogen deprivation via inhibition of aromatase (CYP19A1) responsible for oestrogen synthesis e.g. anastrozole, letrozole and exemestane ( Fig. 1) [9,10]. Despite the success, high efficacy and selectivity of the currently available third generation aromatase inhibitors (AIs), acquired resistance eventually occurs after prolonged therapy along with some cross-activity to other cytochrome P450 (CYP) family members (e.g. anastrozole's inhibition of CYP1A2, letrozole's inhibition of CYP2A6) and some androgenic and/or weak ERα agonistic activity [11][12][13]. Considerable efforts have been made to date in relation to designing further compounds, some with improved IC 50 values compared with the clinically-approved reference compounds and so with promising AI activity. This is the case either for steroidal or nonsteroidal AIs especially after the crystal structure of aromatase (PDB 3EQM) was published [14]. Therefore the design and synthesis of a new generation of inhibitors is needed to widen the therapeutic options available owing to the risk of resistance towards available drugs, and also to minimise toxicity and reduce the non-specific and adverse effects by increasing aromatase selectivity [7,11,15]. This led to a continuous exploration of alternative strategies with allosteric inhibition rising as a possibility especially after letrozole and one of the metabolites of tamoxifen; namely endoxifen, were reported to have the potential of non-competitive/mixed inhibition for aromatase [16,17]. This type of inhibition indicates the presence and identification of allosteric sites in aromatase. Three potential allosteric sites were identified through computational studies including the haem proximal site along with two access channels connected to the active site [18][19][20]. The front door access channel gated by Asp309 was identified in the crystal structure published in 2009 (PDB 3EQM) [14]. On the other side of the active site, a backdoor channel gated with Ser314 was discovered through the crystal structures published in 2018 (PDB 5JKV, 5JKW, 5JL6, 5JL7 and 5JL9), which is postulated to be involved in the passage of catalytic water [14,21].
Allosteric non-competitive inhibition of aromatase offers major advantages over conventional inhibitors in terms of reaching maximum inhibition without the complete blockage of oestrogen production, which would reduce side effects and delay or avoid the onset of resistance along with better selectivity owing to allosteric sites being less conserved across other enzymes. Also, they are not outcompeted by high concentrations of androgens as natural substrate, making this approach appealing for rational drug discovery [18]. Even though there are no reported effective allosteric inhibitors of aromatase, some novel steroidal compounds with a dual-binding ability for the access channel and the active site were reported in 2012 [22].
These potent inhibitors were characterised by a long alkyne side chain at C6 of the steroidal scaffold that could fit through the narrow hydrophobic access channel, which is the main transport route for steroids, water molecules and proton sources, giving rise to the concept of 4th generation steroidal aromatase inhibitors with dual binding capacity ( Fig. 2A). The alkyne side chains are proposed to sit in the access channel preventing entry of other molecules and to displace water molecules believed to act catalytically, by proton relay through the Arg192-Glu483 salt bridge pair at the channel entrance and Asp309 within the channel, in the enolization of androstenedione [22]. This is extendable to the non-steroidal AI class as some nonsteroidal xanthone compounds with the same pent-2-yne arm (Fig. 2B) were reported to have aromatase inhibitory activity in low micromolar range in 2020 and so this concept is gaining more popularity and may present the molecular basis for the design of 4th generation non-steroidal AIs [9,18,22].
As an early part of this project, 1-[(6-methoxybenzofuran-2-yl)phenylmethyl]triazoles (Fig. 3, R 1 = H, R 2 = OCH 3 , R 3 = varying functional groups) were previously reported as potent aromatase inhibitors, presenting a suitable scaffold for incorporation of the hydrophobic tail required for filling the access channel in a dual orthosteric/allosteric inhibition [23]. In this paper, three main variants were studied, the first of which was the position of the hydrophobic tail (Fig. 3, R 1 , R 2 or R 3 ) to provide an answer to the preferable binding mode. Then the length of the hydrophobic tail was investigated as it was not guaranteed that a five-carbon chain length would be the optimal length for activity in a non-steroidal compound. Along with the nature of the secondary substitution, a more comprehensive SAR was developed with toxicity assessment and CYP selectivity of the compounds with promising aromatase inhibitory activity. These questions were addressed through an iterative process of a design/synthesis/inhibitory evaluation cycle to guide the rational drug design in three distinct comparisons (Fig. 3).

Chemistry
The synthetic pathways for the previously reported parent AIs (5) and the newly synthesized modified compounds (6 and 7) are outlined Fig. 1. Clinically used selective oestrogen receptor modulators (SERMs, e.g. tamoxifen), selective oestrogen receptor degraders (SERDs e.g. fulvestrant) and aromatase inhibitors (AIs, e.g. steroidal exemestane, non-steroidal anastrozole and letrozole). in Scheme 1. Initially, the parent methoxy substituted compounds were prepared using the reported synthetic pathway as indicated in Scheme 1 with modifications to the methods made where necessary to optimize the reactions [23].
The Rap-Stoermer formation of the benzofurans was performed using the original method with NaH as the base, however it was replaced by K 2 CO 3 as reported by Mahboobi et al., to improve the yield (80-96%) [24]. Also, the equivalents of SOCl 2 in the third step were increased from equimolar to 1.6 equivalents to optimize the method by reducing the reaction time from 4 days to only 16 h with yields ranging from 31 to 63%.
The new molecules with extended substitutions on the benzofuran ring, ranging from ethoxy to pent-2-ynyloxy, were prepared via a divergent pathway (Scheme 2) starting with the hydroxy salicylaldehyde derivative after being pyran protected (8) [23,25]. Following the methods described in Scheme 1, the ketones (9) were obtained in good yields (52-95%) and the alcohols (10) obtained in quantitative yield. The introduction of the triazole proceeded with loss of the pyran protection to give the triazole phenol compounds (11 and 12, 44-86%). A final nucleophilic substitution step was required at the end of the synthetic pathway to add the longer alkyloxy chains through Williamson ether synthesis to give the 6-O-alkyl/alkyne (13) and 5-O-pent-2-ynyloxy (14) derivatives. The fluoro and chloro derivatives were obtained in good yields (50-90%), however the nitrile derivatives (13g and 13h) were obtained in low yields (7 and 17% respectively) owing to complex reaction mixtures.
The 4 ′ -pent-2-ynyloxy compound (19) with the extended substitution on the phenyl side of the compound required a slightly modified pathway (Scheme 3) with the first step being a demethylation of 3c to form 15, which was then pyran protected (16) and proceeded as described in Scheme 2 from the reduction step (17).
The final triazole products were confirmed by 1 H and 13 C NMR with  purity determined by HPLC with all compounds ≥95% pure. The two characteristic singlets of the triazole group were observed at ~8.2 and 8.0 ppm in 1 H NMR and the CH-triazole as either a singlet or finely split triplet (J = 0.9 Hz) at ~ 6.5 ppm in 1 H NMR and at ~ 61.5 ppm in 13 C NMR.
inhibitors (0.001pM-100pM). Aromatase activity results were determined as a concentration of product formed per mg of protein per hour. Each data point was measured in triplicates and the error in the IC 50 calculations represented as 95% confidence interval.
The effect of the secondary substitution, i.e. substitution of the phenyl ring, on the 6-but-2-ynyloxy and 6-pent-2-ynyloxy derivatives was then evaluated. The choice of secondary substituents, Cl, F and CN, was determined from our previous published research [23] as the most promising substitutions for aromatase inhibition.

Toxicity (BrdU) assays
Examples of the compounds, namely 5b, 11b, 13e, and 13h, were tested at 1 μM over 48 h along with doxorubicin (Dox) as positive control by bromodeoxyuridine (BrdU) proliferation assay to evaluate the toxicity against non-cancerous breast epithelial cells (MCF-10A) and non-oestrogen dependent breast cancer cells (MDA-MB-231). Statistics using one-way ANOVA followed by a Tukey's Multiple Comparison test comparing all compounds against control showed no significant difference between the tested compounds and the negative control indicating that the compounds had no impact on MCF-10A or MDA-MB-231 growth (Fig. 4). These results suggest possible limited toxicity in normal breast tissue and little off-target effects.

Computational studies
Using the crystal structure of human placental aromatase (CYP19A1) refined at 2.75 Å (PDB 3S79) [21], the compounds (R-and S-enantiomers) were docked using molecular operating environment software (MOE) [28], and the best poses were selected based upon the 3D visual inspection and the score value of the ligand-protein complex (Table S1), which were all subjected to 150 ns MD simulations with longer 400 ns simulations run for exemplar compounds, using the Desmond programme of Schrödinger software [29,30] to equilibrate and establish an optimal complex. Regarding the position of the substituent, the 6-substituted benzofurans (13) and the 4-substituted phenyl derivatives (19) formed more stable complexes than the 5-substituted analogues (14) (Fig. S1).
Studying the binding profile through the simulation time (Fig. S2) and the ligand interactions of the final frame of the MD simulation  ( Fig. S3) indicated that both enantiomers of the simpler methoxy derivative (5a) can comfortably fit in the haem binding pocket. Introduction of the extended pentynyloxy group restricted fit within the enzyme with the 6-pentynyloxy S-13a more optimally positioned for the access channel compared with R-13a, although both bind with the haem bind through the N4 of triazole. For the 5-pentynyloxy benzofurans (e.g. 14) the S-enantiomer also showed a more favourable binding profile, while the R-enantiomer of 14 binds to the haem through the N2 of triazole. In contrast, the R-enantiomers of the phenyl substituted derivatives (e.g. 19) bound with the haem through the N4 of the triazole ring, while the S-enantiomers bound with the haem through the N2 of the triazole ring. For optimal haem binding the azole should be perpendicular (~90 • ) to the plane of the haem group, which is normally achieved through binding with the N4 of the triazole, however binding via the N2 of the triazole does not allow a perpendicular interaction, which would result in a less favourable conformation for binding.
Varying the length of the alkyl chain substituent at the 6-position of the benzofuran group did not show a significant difference in the stability of the complex and all compounds showed good 3D fitting, with the methoxy derivatives fitting in the haem active site pocket e.g. 5a (Fig. 5A), and the longer chain substituted derivatives fitting both the haem active site and the access channel gated by Arg192, Asp309, His480 and Glu483 e.g. 13b (Fig. 5B) with the chloro substituent forming a binding interaction with the key amino acid Met374.
The S-enantiomer of the lead compounds, 6-pent-2-ynyloxy 4 ′ fluorosubstituted (13e) and 6-but-2-ynyloxy 4' nitrile-substituted derivative (13h) were optimally positioned within the haem and access channel binding sites and formed a direct N4-triazole haem binding interaction with a distance of 2.37 and 2.38 Å respectively (Fig. 6C and D). The pent-2-ynyloxy chain of R-13e was not optimally positioned in the access channel and, although the triazole N4 was close to the haem iron (2.43 Å) a direct bond was not observed (Fig. 6A). For both enantiomers of 13h the nitrile group formed binding interactions with Met374 and directly or indirectly with Arg115 ( Fig. 6B and D) and the but-2-ynyloxy chain was optimally positioned within the access channel.
From computational studies it is proposed that the improved aromatase inhibitory activity of the nitrile derivative (13h) may be owing to the ability of both R-and S-enantiomers to fit optimally within both binding pockets and interact with the haem through the N4 of the triazole ring with anchoring of the ligands through a H-bonding interaction between CN and Met374/Arg115 (Fig. 7), and future experimental studies will involve experimental studies to validate the computational findings. The excellent toxicity and selectivity profile of these compounds also supports the basis and rationale for continuing research, subject to funding, for these 4th generation nonsteroidal aromatase inhibitors.

Materials and instrumentation for the chemical synthesis
All commercially available starting materials and solvents were of general purpose or analytical grade and used without further purification. Melting points were determined using a Gallenkamp melting point apparatus and are uncorrected. 1 H and 13 C NMR spectra were recorded on a Bruker Advance DP500 spectrometer operating at 500 MHz and 125 MHz, respectively. Chemical shifts are given in parts per million (ppm) relative to the internal standard tetramethylsilane (Me 4 Si). Analytical thin layer chromatography (TLC) was carried out on precoated silica plates (ALUGRAM® SIL G/UV254) with visualisation via UV light (254 nm). HPLC were either performed by the Department of Pharmacy & Pharmacology, University of Bath, Bath, UK on a Zorbax Eclipse plus C18 Rapid resolution 2.1 × 50 mm, 1.8 μm particle size using gradient (methanol: H 2 O) with 0.1% formic acid (method A) or in house on a Shimadzu LC-2030C Plus C18 Rapid resolution 250 × 4.6 mm, 5 μm particle size using isocratic 80:20 (methanol: H 2 O) (method B). All biologically evaluated compounds are ≥95% pure by HPLC analysis.

Computational studies 4.4.1. Docking
The crystal structure of human placental aromatase (CYP19A1) refined at 2.75 Å (PDB 3S79) [21], was downloaded from the protein data bank (https://www.rcsb.org). Missing hydrogens were added, and the charge and geometry of the iron atom were adjusted as previously described [31]. Using the site finder tool in molecular operating environment (MOE) 2015-10 software [28], the active site was chosen to contain the main amino acid residues and the haem molecule. The amino acids constituting the wall of the active site contained Arg115, Ile133, Phe134, Phe221, Trp224, Ile305, Ala306, Asp309, Thr310, Val370, Leu372, Val373, Met374, Leu477, Ser478. The 3D structures of the ligands (R-and S-enantiomers) were generated using MOE builder, energy minimised and saved in a dataset ready for docking studies. The complexes for molecular dynamics (MD) studies were prepared by docking the compounds using MOE.

Molecular dynamics
Molecular dynamics simulations were performed using Schrödinger 2020-1 Desmond programme [29,30] as previously described [32]. Briefly, using the pdb files containing the selected docking poses, the structures were optimised with protein preparation wizard. The volume of space in which the simulation takes place, the global cell, is built up by regular 3D simulation boxes. The orthorhombic water box allowed for a 10 Å buffer region between protein atoms and box sides. Overlapping water molecules were deleted, and the systems were neutralised with Na + ions and salt concentration 0.15 M. Molecular dynamics (150 ns and 400 ns simulations) were performed using OPLS_2005 forcefield at 300 K and constant pressure (1 bar).

Cell culture
JEG-3 cells were purchased from ATCC and grown in Eagle's Minimal Essential Medium (EMEM) supplemented with 10% fetal calf serum (FCS). MCF-10A cells were a gift from Prof. Christopher McCabe (University of Birmingham) and were grown in Dulbecco's Modified Eagle Medium (DMEM) supplemented with 20 ng/ml epidermal growth factor (EGF), 100 ng/ml cholera toxin, 0.01 mg/ml insulin, 500 ng/ml hydrocortisone, and 5% horse serum (Sigma). MDA-MB-231 cells were purchased from ATCC and grown in Roswell park Memorial Institute medium (RPMI1690) supplemented with 10% FCS. All cells were cultured at 37 • C under 5% CO 2 in a humidified incubator.

Aromatase activity assay
Aromatase activity was assayed using a modified titrated water assay as previously reported [28]. JEG-3 cells were grown in 1 mL EMEM to approximately 80% confluence in six-well cell culture plates. Androst-4-ene-3,17-dione[1β-3 H] was dissolved in serum-free cell culture medium and added into each well. Aromatase activity was measured in the absence and presence of inhibitor (0.001pM-100pM). After a 1 h incubation at 37 • C followed by a 5-min incubation on ice, 500 μL of culture medium was taken from each well. Medium was vortexed with 2% dextran-treated charcoal (Sigma-Aldrich) in PBS and centrifuged at 4000 rpm. The supernatant containing the product, [ 3 H] H 2 O, was quantified by scintillation counting. Cell protein concentrations were determined using Pierce BCA assay kit (Thermo Fisher Scientific). Aromatase activity results were determined as a concentration of product formed per mg of protein per hour (pmol/mg/h). Results were shown as a % change in activity compared to control. Each data point was measured in triplicates and the error in the IC 50 calculations represented as 95% confidence interval.

4.7.
BrdU-based cell proliferation assay to assess drug toxicity MCF-10A and MDA-MB-231 cells were plated onto 96-well microtiter tissue culture plates in RPMI1690 medium at a density of 8 × 10 3 cells/well (for MCF-10A) or 5 × 10 3 cells/well (for MDA-MB-231. Groups were treated with either DMSO alone (at no greater than 0.01%) as a vehicle control, or at a dose of 1 μM of inhibitor or doxorubicin control, for 48 h. Effects of drug treatment on cell growth were detected using the BrdU cell proliferation assay (Roche) according to the manufacturer's recommendations. The BrdU colorimetric immunoassay is a quantitative cell proliferation assay based on the measurement of BrdU incorporation during DNA synthesis. After treatments 20 μL/well of BrdU were added to each well, followed by an incubation of 2 h at 37 • C. The cells were subsequently fixed, and the DNA denatured. Anti-BrdU-peroxidase immune complexes were detected by substrate reaction and quantified in an ELISA reader at 370 nm.

Notes
The authors declare no competing financial interest.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.