Discrepancy in interactions and conformational dynamics of pregnane X receptor (PXR) bound to an agonist and a novel competitive antagonist

Graphical abstract


Introduction
Pregnane X receptor (PXR), also known as nuclear receptor (NR) subfamily 1 group I member 2, encoded by the gene NR1I2, is a ligand-dependent transcriptional factor that is activated by a structurally diverse set of small molecules [1]. PXR binds various xenobiotic compounds, such as endocrine-disrupting chemicals and pharmaceutical drugs, and endogenous ligands, such as hormones. Ligand-bound PXR regulates the transcription of genes encoding phase I and phase II drug metabolizing enzymes [2] as well as uptake and efflux transporters [3,4]. PXR activation has an impor-tant role in drug-drug interactions (DDIs) [5], adverse drug reactions [6] and drug treatment efficacy [4,7,8]. In this regard, regulatory agencies, including the European Medicines Agency (EMA) [9] and the United States Food and Drug Administration (FDA) [10], have introduced in vitro assays for PXR activation and in vivo CYP expression levels in their pipelines for evaluation of drug safety. In addition to its role in small molecule metabolism, PXR is involved in regulation of diverse cellular processes including energy homeostasis, cell proliferation and inflammation [11,12]. PXR expression adapts to (patho) physiological [13] and environmental stimuli [14].
PXR structure comprises three domains: DNA binding domain (DBD), hinge region and ligand-binding domain (LBD). PXR interacts with the gene promoter region of target genes in DNA via its N-terminal DBD [15]. The hinge region is reported to be target of post-translational modifications, which affect PXR mediated gene regulation. For instance, acetylation and deacetylation of K109 modulates PXR transcription activity [16]. LBD comprises eleven a-helices, in addition to the aAF-2 helix (also known as activation function-2), and five-stranded b-sheet (Fig. 1A). The eleven ahelices form three aligned groups: a1/a3, a4/a5/a8/a9 and a7/ a10/a11. PXR lacks the typical stable a2 and a6 helices that are found in other NRs [17]. With other NRs the ordered a6 helix results in more tightly packed smaller LBD [17,18]. Moreover, while many NRs exhibit a three-stranded b-sheet in their LBD, PXR comes with two additional strands. These extra b-sheets, together with the additional a2 0 , contribute to a larger and more flexible LBD, when compared to other NR-LBDs [19][20][21]. This increased flexibility allows accommodation of diverse set of ligands to the ligand binding pocket (LBP) (Fig. 1B), which is a unique characteristic of PXR.
PXR heterodimerizes with retinoid X receptor alpha (RXRa) in cell cytoplasm and this complex (PXR-RXRa) is transported to the nucleus [23]. Binding an activator ligand to PXR-RXRa heterodimer in nucleus allows the exchange of cofactor (release of a corepressor and recruitment of a coactivator) [24]. Subsequently, this activated PXR complex regulates expression of the target gene.
While the knowledge of PXR ligand-induced activation is well established, the mechanism of action for ligand-mediated inhibitory/antagonistic effects on PXR is less understood [25]. Despite of the numerous PXR agonists that have been reported, a very limited number of antagonists exist. Antagonists binding to the LBP include sulforaphane [26], and SPA70 [27] as shown by in vitro ligand-binding competition assays. Other PXR antagonists, such as ketoconazole or FLB-12 [28] can reduce the endogenous PXR activation without directly or exclusively binding to the LBP, whereas coumestrol [29,30] shown to bind to the LBP and AF-2 domain. What renders a PXR ligand an agonist or an antagonist, and their respective structural triggers, remains poorly understood. For instance, Lin et al. discovered that a close analogue of SPA70, SJB7, is a PXR agonist [27]. Structural differences between SPA70 and SJB7 are minimal, it only exists in their terminal aromatic ring substituents (SI Fig. S1). Lin et al. hypothesized that SJB7 interacts through its p-methoxy group with a hydrophobic spot on aAF-2 (residues L428 and F429) and stabilizes the aAF-2 to enable interaction with a coactivator, while SPA70 fails in this due to the lack of this group. More recently, Li et al. reported a set of SPA70 analogues, revealing diverse biological activities of these ligands, ranging from agonists to antagonists and partial agonists [31]. This exemplifies how subtle structural changes may completely shift PXR ligand function and it highlights the promiscuity of PXR-LBD.
No co-crystal structures of PXR and antagonist are currently available to elucidate the details of the PXR-antagonist interactions. Also, for instance, docking approaches are limited as they are unable to capture PXR's characteristic flexibility [21,[32][33][34][35][36] and the effect of water [37]. Therefore, molecular dynamics (MD) simulations have been utilized for better understanding of transition state of active to inactive in nuclear receptors [38]. Previous PXR-related MD simulations have mainly focused on PXR agonists. Chandran et al. studied the dynamic behavior of PXR-LBD apo  [22]). The regions of interest are highlighted with the following colour scheme that is used throughout this article: a1-a2 0 loop (residues 177-198), dark grey; b-b1 0 (residues 211-225), pink; b1 0 -a3 loop (residues 226-234), dark green; a3-helix (residues 240-260), cyan; b4-a6 loop (residues 309-314), a6 (residues 315-318) and a6-a7 loop (residues 319-323), light green; a10/a11 (residues 389-417) light brown; aAF-2 (residues 423-434), dark brown; coactivator peptide (SRC-1), blue. The dashed rectangular area denotes the location of the ligand binding pocket. (B) Ligand-binding pocket (LBP) of PXR. The main residues forming the LBP and participating in ligand binding are depicted in stick model with transparent molecular surface. Residues are coloured according to their respective regions (see A). (C) 2D structures of PXR agonist SR12813 (SRL), our in-house kinase inhibitor (compound 100), which was found to act also as a competitive PXR antagonist [42], and P2X4 antagonist BAY-1797 that is a PXR agonist [43]. Structure of compound 100 includes a benzosuberone moiety, a fluorophenyl ring and a benzyl group (structural moieties that are discussed in the results). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) structure in comparison to the agonist-bound state, using short MD simulations of 100 ns [39]. They could identify several conformational states for apo PXR-LBD with different volume through MD simulation. Their study revealed that the SR12813 agonist binding events restrict the LBD in a conformation relevant to the size and shape of the ligand. Further, Motta et al. employed the MD-binding method to simulate the SR12813 entry into the LBP. Their result suggested that the ligand would enter to the LBP via a channel between a2 and a6 helices [40]. They also implemented scaled MD (SMD) simulations to extend the sampling of the bound conformations of SR12813 within PXR-LBD (with a total of 2 ls).
They predicted that the binding mode of SR12813 observed in the crystal structure PDB ID: 1NRL [22] is the most stable one among other available PXR-SR12813 complex structures. In addition, Huber et al. performed 200 ns MD simulations of wild type (WT) PXR-LBD and W299A mutant, without ligand and with T090131713 (agonist), SPA70, and SJB7 [41]. They suggested that the extra space conferred by the W299A is the reason for the observed antagonist-to-agonist switch with this mutant for SPA70. This extra space let SPA70 to reside deeper in the pocket, preventing the aAF-2 dislocation and maintaining PXR active.
We recently discovered a novel competitive PXR antagonist from the Tübingen kinase inhibitor collection (TüKIC) compound library [42]. This competitive antagonist, compound 100 ( Fig. 1C), suppresses both rifampicin-and SR12813-induced PXR activation, and it does not induce recruitment of SRC-1 to PXR in coactivator recruitment mammalian 2-hybrid assay [42]. Here we aimed to disclose why this kinase inhibitor acts also as an antagonist when in complex with PXR and how it differs from a typical agonist (SR18213). To this end, we applied microsecond timescale MD simulations to understand how these ligands influence PXR's conformational dynamics. Furthermore, we experimentally tested a selection of PXR-LBD mutants and their influence on PXR activity. Finally, we simulated PXR-LBD in complex with BAY-1797 [43], a weak PXR agonist, that shares structural similarity to our competitive antagonist. Our results highlight ligand-specific dynamical behaviour of PXR-LBD and suggest key-changes in ligand-induced antagonism.

Microsecond timescale molecular dynamics simulations reveal discrepancy in motions of compound 100 and SR12813 bound PXR-LBD
To investigate and to compare the interactions and conformational dynamics of the novel competitive antagonist and the classical agonist, we conducted a total of 60 ls unbiased all-atom MD simulations. These simulations comprised three systems: PXR-LBD in complex with compound 100 (C-100; total simulation time of 30 ls), SR12813 (SRL; 20 ls) and SR12813 together with SRC-1 coactivator peptide (SRL + Co; 10 ls) (SI Fig. S13).
First, to gain a better understanding of the PXR-LBD dynamics, we conducted principal component analysis (PCA) to identify the most essential motions of the protein. The first and the second principal components (PCs) describe together 56% of the data (PC1: 33% and PC2: 23%), while other PCs exhibit individual contributions below 10% (SI Table S1); thus, we focused our analysis on PC1 and PC2 (Fig. 2). The most extensive motions of PC1 occur in a1-a2 0 loop ( Fig. 2B; SI Movie M1). In addition, minor movement is observed in the b4-a6 loop (N-terminus of a6 region) and b1 0 -a3 loop. Of note, these regions, excluding the b4-a6 loop, are part of the novel insert of PXR-LBD that is not found in other NRs. This feature enables PXR-LBD to bind to a wide range of ligands [19].
Next, we shifted our focus on the differences in the PC scores among the systems. PC1 scores of the compound 100 and SR12813 bound systems are distributed in a wide range ( Fig. 2A). However, higher values of PC1 are observed with C-100 ( Fig. 2A, SI Fig. S2). Even greater difference is seen with PC2, where C-100 displays clearly higher values. Furthermore, a joint comparison of the PC scores between C-100 and combined agonist systems exemplifies the observed differences (Fig. 2D). Generally, higher PC1 scores are observed for the C-100 compared to the agonist systems (medians of 1.56 Å and À1.77 Å for C-100 and agonists, respectively). Differences in PC2 are more evident (medians of 1.75 Å and À2.77 Å for C-100 and agonists, respectively). Interestingly, PC2 represents the dislocation of aAF-2 from a3-helix ( Fig. 2B; SI Movie M2), a movement which is associated with PXR antagonism [41], and b4-a6 loop association to a2 0 ( Fig. 2B; SI Movie M2).
Overall, PCA exemplifies a clear ligand-dependent conformational behaviour of PXR-LBD.

Conformational behaviour of the a6 region is ligand-dependent
We next pursued for a more detailed analysis of these PCAhighlighted dynamic regions of PXR-LBD. First, we focused on the a6 region (residues 309-323), comprising b4-a6 loop (Nterminus of the region), a6 helix and a6-a7 loop (C-terminus of the region) (Fig. 3A). Movement of b4-a6 loop was associated to both PC1 and PC2, appearing even more extensive with PC2 ( Fig. 2B, SI Movie M2). To inspect the conformation of this loop, we calculated the distance between A312 (located on b4-a6 loop) and C207 (located on C-terminus of a2 0 ) (Fig. 3A). Clearly smaller distances are observed in the presence of compound 100 (median of 8.2 Å), whereas both agonist-bound systems display significantly longer distances between these two residues (medians of 17.8 Å and 18.6 Å for SRL + Co and SRL, respectively) (Fig. 3A). This indicates that the b4-a6 loop favours an open configuration with SR12813, where this loop resides far from a2 0 (Fig. 3B). Conversely, a closed conformation is preferred with compound 100, where the b4-a6 loop is close to a2 0 . This closed conformation appears to be stabilized via a H-bond between A312 and C207, which is not observed with the agonist (Fig. 3C). Furthermore, we noted that the closed conformation of the b4-a6 loop appears to stabilize the secondary structure of a6 helix (Fig. 3D). While the helical configuration is dominated with compound 100, in agonist-bound systems this secondary structure is more unstable.
Motta et al. predicted that the water channel between a2 and a6 would be an entry pathway for SR12813 [40]. Therefore, we next investigated water-mediated interactions in these regions. Again, notable differences among the systems with these interactions appeared ( Fig. 3B; SI Fig. S4). For instance, in C-terminus of a6 region, a water-bridged interaction between E321 and H407 (located on a10/11) is frequent in both agonist systems (60-74%), while it is relatively absent in C-100 (below 6%). Moreover, H407 forms a water-bridged interaction with M323 with agonist (56% and 71%, SRL and SRL + Co, respectively), which is again almost inexistent with the compound 100 (Fig. 3B). N-terminus of a6 region is connected to the C-terminus of a2 0 helix (residue K210) with system specific water-bridged interactions (SRL and SRL + Co, E309; C-100, E309 and D310) (SI Fig. S4). Regardless of the differences observed for water bridged and H-bond interactions in a6 region, a comparable interaction profile among systems appears between D205 (located on a2 0 ) and R410 or R413 (located on a10/11) (SI Fig. S4). Overall, the a6 region exhibits a liganddependent configuration, which is associated with specific intramolecular interactions within the PXR-LBD.

Different polar interactions contribute to stabilization of compound 100 and SR12813 in PXR-LBP
The analysis of a6 region interactions revealed that in the presence of compound 100, unlike with SR12813, H407 is not involved in water-mediated interactions between a6-a7 loop and a10/11. As H407 is one of the key residues of PXR-LBP commonly participating in interactions with ligands [44,45], we next shifted our attention to protein-ligand interactions (Fig. 4). Based on ligand RMSF values, both ligands are relatively stable throughout the simulations (SI Fig. S5); thus, the data is suitable for comparing interactions between the ligands and among systems. While H407 displays a stable H-bond with SR12813 ($80-100%), only a water-mediated interaction occurs with compound 100 ($15%) ( Fig. 4A-B). With compound 100, H407 prefers a conformation where it forms a water bridge ($46%) to N404 (located on a10/11). This conformational preference shifts H407 away from the a6 region and beyond the reach of the compound 100 ( Fig. 4C). Interestingly, our simulations display a stable H-bond interaction between N404 and G278 (on a4) with SR12813 ($93-95%) and this interaction appears only with 23% frequency in C-100. This discrepancy could be the result of the N404 involve-ment in the water mediated interaction with H407. Compound 100 displays H-bond and water mediated interactions to S247 and Q285 from its amide (15-20%). From these residues, SR12813 exhibits a direct H-bond interaction only with S247. The missing Hbond to Q285 is compensated with water-mediated interactions. Similarly, increased frequency of water-mediated interaction (58%) appears in SRL, where diminished H-bond interaction to S247 exists (Fig. 4B). Furthermore, compound 100 displays an additional water-mediated interaction to T248 -albeit with low frequency ($10%) -from its NH linker between benzosuberone and fluorophenyl. Overall, limited polar interactions are observed for both ligands, with a clear difference in their H407 interactions. The most important interaction region for different PXR ligands is the hydrophobic subpocket, which is formed by a triad of hydrophobic residues: F288, W299 and Y306 [45]. Here, we observed a similar protein-ligand interaction pattern in the presence of SR12813 (with and without coactivator) or compound 100 (Fig. 4D). SR12813 forms hydrophobic interactions to F288 and Y306 for about 30-40% and to W299 for $40-50% in both systems. Similarly, compound 100 displays hydrophobic interactions between benzyl group and F288, W299 (60%) and Y306 (approximately 45%). The slightly increased frequency of hydrophobic interactions for compound 100 could be attributed to the structural characteristics of the compounds. Compound 100 does not contain any polar atoms in its terminal group that binds to the hydrophobic subpocket, while SR12813 contains an aromatic hydroxyl group in its structure binding to this region which can explain the observed shift of the Y306 and F288 side chain in C-100 compared to agonist systems ( Fig. 4D; SI Fig. S6).
Collectively, both ligands show similar hydrophobic interaction patterns within the PXR-LBP, with the exceptions of F251 (located on a5) and aAF-2 residues F420 and F429. Since F420 is located on the loop connecting a11 to aAF-2 and this loop has a critical role in the aAF-2 localization [46,47], we explored F420 interactions to its neighbouring residues (SI Fig. S7B). With SR12813, F420 displays hydrophobic interactions to L411 (located on a11) for roughly 28-40% in both agonist systems. This interaction is infrequent with compound 100 ($8%). F420 also interacts with I414 for 18-30% in both agonist systems, while diminished interactions (only $3%) appear with compound 100. Interestingly, RMSF of F420 is higher (2.4 Å) with C-100 than with SRL (1.2 Å), demonstrating a high flexibility of this amino acid residue with compound 100.
2.5. aAF-2 is stabilized with SRL12813 and destabilized with compound 100 Compound 100 exhibited diminished interactions with aAF-2. All known ligand-dependent nuclear receptors require aAF-2 domain for an effective interaction with a coactivator [48]. This domain plays a crucial role in the formation of a suitable platform for the coactivator binding on the LBD surface. Therefore, we next investigated more closely the behaviour of aAF-2 in different systems.
First, we monitored the distance between aAF-2 and a3-helix ( Fig. 5A-B). As a3-helix (residues 240-260) is stable in all simula-tions (RMSF < 1 Å), this distance enables the assessment of the relative position of aAF-2 to the LBD. In the presence of compound 100, this distance is increased (median of 13.6 Å) compared to what is observed for SR12813 (medians $11 Å). Furthermore, we noticed that the unfolded loop-like conformation of aAF-2 is slightly preferred with compound 100, in comparison to SR12813 (Fig. 5C). The presence of the coactivator further stabilizes the alpha-helical secondary structure of aAF-2 based on SRL + Co.
Based on the crystal structure of PXR-LBD-SR12813 with coactivator, a H-bond interaction of T248 and T422 stabilizes aAF-2 closer to a3-helix (Fig. 5D). In the simulations, distance between the hydroxyl-oxygens of these residues is increased with compound 100 (median of 7.2 Å) (Fig. 5D), decreasing the H-bond frequency between these residues to 5%. In SRL + Co system, the median value of this distance is 2.9 Å and H-bond contact appears with approximately 80% frequency. Without the coactivator, SR12813 displays values between these two systems (median of 5.4 Å, H-bond frequency 30%). Finally, we analysed the spatial orientation of aAF-2 relative to LBP using angle calculations (Fig. 5E). To this end, we selected F281 (located on a5) as the apex (see details of angle selection in methods). The angle between N404 and F429 vectors total simulation time of 10 ls). Also in this system, the configuration of aAF-2 is destabilized by compound 100 and its behaviour reflects that of C-100 (SI Fig. S8). The observed distance between aAF-2 and a3-helix with C-100 + Co (median of 13.7 Å) is not far from C-100. Furthermore, the distance between the hydroxyloxygens of T248 and T422 (median of 6.9 Å) is close to the C-100 than that of SRL and SRL-Co. Regarding the spatial orientation of the aAF-2 relative to LBP, the angles h and x of C-100 + Co fall in somewhere between C-100 and SRL systems. Finally, increased RMSF values of SRC-1 in C-100 + Co compared to SRL + Co, demonstrate the coactivator instability with compound 100. Therefore, even in the presence of SRC-1, compound 100 appears to destabilize the PXR-LBD surface on the aAF-2 region, encumbering the stable binding of SRC-1. This MD simulation data agrees with the experimental data that demonstrates the failure of compound 100 to recruit SRC-1 to PXR-LBD in coactivator recruitment mammalian 2-hybrid assay [42].

Markov state modelling reveals compound 100 specific PXR-LBD conformations
We next aimed for a deeper understanding of PXR-LBD conformational dynamics when in complex with compound 100. To this end, we conducted Markov state modelling (MSM) approach that enables the study of long timescale statistical dynamics of a protein by identifying relevant kinetic states (metastable states) and the probability distribution among these states [49,50]. MSM identified five metastable states (S I-V ) for the PXR-LBD bound to compound 100 ( Fig. 6; SI Fig. S9). The two most dominant metastable states, S IV and S V , appear with $29% and $30% equilibrium probabilities, respectively. Other states (S I , S II , S III ) display lower probabilities in the range of 10-17%. Overall, conformations of LBD subregions in metastable state derived structures are distinct from the agonist associated. Moreover, specific conformations are preferred in individual metastable states. For the flexible a1-a2 0 loop, which flexibility is reduced by compound 100 compared to SR12813 (Fig. 2C), a state-specific conformation appears in S I and S IV , while in S II, S III and S V there exist no clear configuration for this loop (Fig. 6). Of note, in the agonist-bound reference crystal structure this loop is disordered (SI Fig. S10). The b1-b1 0 loop in S I , S II , S IV (altogether $53%) adopts a clearly defined folded conformation, where it is tightly packed on the LBD (Fig. 6). Configuration of this region is more ambiguous in S III and S V . In the agonist bound crystal structure, b1-b1 0 loop appears in an extended conformation, more distant from the LBD ( Fig. 6; SI Fig. S10). Indeed, the distance between D219 (apex of b1-b1 0 loop) and a3-helix is smaller with compound 100 than with SR12813 bound systems (SI Fig. S11A). For the b4-a6 loop, almost identical conformation is represented in all states, where it resides close to the a2 0 (residue C207) (Fig. 6). Conversely, an open configuration for this loop is observed in the agonist-bound crystal structure (Fig. 6, SI Fig. S10). Again, distance between a3-helix and A312 is smaller in C-100 compared to SRL and SRL + Co (SI Fig. S11B). The b1 0 -a3 loop in S II , S IV and S V (altogether $73%) deviates the most from the agonist-associated conformation, while in S I and S III this deviation is not that evident (Fig. 6). Interestingly, the aAF-2 region appears in a quite welldefined conformation in S I , S II , S III and S V , (altogether $71%), where it is shifted away from the a3-helix (Fig. 6). Moreover, in S IV aAF-2 appears with a more disordered conformation. These aAF-2 configurations agree with the calculated aAF-2 related distances and angles (Fig. 5). Altogether, MSM revealed unique conformations for the PXR-LBD bound to compound 100 in four regions, b1 0 -a3 loop, b1-b'loop, b4-a6 loop and aAF-2, that are distinct from agonist-associated conformations.

Mutations provide insights to the PXR activation
We next evaluated experimentally the effect of eight different mutations on PXR activity and its ligand-induced activation (Fig. 7A). The alanine mutations involved selected LBD key residues with diverse locations around PXR-LBD. The competitive antagonist compound 100 induces moderately the reporter gene CYP3A4 expression with WT PXR at the tested concentration, while with the full agonists, rifampicin and SR12813, the inducement of the gene expression is manyfold (for more detailed biological characterisation of compound 100 see [42]). Mutations of the hydrophobic subpocket forming residues W299 and Y306 resulted in distinct outcomes. While W299A retains the inducibility by rifampicin and behaves similarly as wild type, Y306A renders PXR inactive without any ligand-inducibility (Fig. 7B). From the mutations of the polar residues participating in H-bond interactions (Q285, S247 and H407), S247A and H407A increased the basal level of PXR activity, transforming it into a constitutive active form. These mutations appear still inducible by the PXR agonist rifampicin, and compound 100 suppresses the activity of S247A. MD simulations displayed hydrophobic interaction between F281 and both ligands (Fig. S5). Nevertheless, F281A did not alter PXR activity or inducibility by the ligands. F429A, located in aAF-2, rendered PXR inactive, and was not inducible by the ligands. Finally, W223A mutation in the putative PXR homodimerization interface [51], also resulted in the loss of PXR activity. Overall, mutation analysis revealed the important role of the key residues in modulating PXR activity and activation.

Hydrophobic subpocket binding moiety does not explain PXR conformational behaviour associated to compound 100
Werner et al. reported a P2X4 inhibitor BAY-1797, which also activated PXR with a minimum efficacious concentration of 1.7 lM [43]. We got interested in BAY-1797, as it shares an identical phenylacetamide moiety with compound 100 (Fig. 1C). To investigate and compare the behaviour of BAY-1797 with compound 100, we carried out 10 ls MD simulations for PXR-LBD-BAY-1797. As a starting configuration for these simulations, we utilized a crystal structure of a close analogue of BAY-1797, which exhibits one additional methyl group compared to BAY-1797 (PDB ID: 6HTY [43]). The overall dynamical behaviour of the protein in the simulations (RMSF) follows a similar trend with B-factors of the crystal structure. (SI Fig. S3C). BAY-1797 is well accommodated in the PXR-LBP and its aromatic benzyl group is oriented into the hydrophobic subpocket (Fig. 8A). A comparable profile with compound 100 is observed for BAY-1797 in p À p interactions to W299, F288 and Y306 ( Fig. 8A and B, Fig. 4D). In contrast to compound 100, however, BAY-1797 displays a stable H-bond to Q285 and interaction to H407 (43%), which is closer to the SR12813 interaction profile (Fig. 8C). The secondary structure stability of a6-helix with BAY-1797 resembles compound 100 (Fig. 8D). Nevertheless, based on the distance of A312-C207 (located on b4-a6 loop and on C-terminus of a2 0 , respectively) BAY-1797 falls somewhere in between of SRL12813 and compound 100 (Fig. 8E), indicating unique conformation for this region. In addition, distance between aAF-2 and a3-helix in PXR-LBD-BAY-1797 appears in between the SR12813 and compound 100 systems (Fig. 8F). However, the distance between hydroxyl-oxygens of T248 and T422 suggests that with BAY-1797, aAF-2 can acquire a stable active configuration that is required for PXR activation (Fig. 8G). This leads hydrophobic interaction with F420 for 30% comparable to SR12813 but no interaction was observed with F251 and F429 in presence of BAY-1797 (SI Fig. S12). The angle between N404 and F429 vectors (h) is larger (median of $80°) compared to that of compound 100 and close to what is observed for SR12813 ( Fig. 8H; Fig. 5). This trend applies with the other angle between N404 and T422 vectors (x), with an angle (median of 114°) close to that of SR12813. Overall, the configuration of aAF-2 appears to be stabilized on LBD surface with BAY-1797.

Discussion
Since the mechanism of PXR antagonism has not been elucidated and X-ray crystallography is unable to capture the ligandinduced conformational dynamics of PXR-LBD [13], we utilized here in silico approach to disclose the putative ligand-dependent differences in conformational dynamics of PXR-LBD. Our MD data suggest ligand-dependent discrepancy in the conformational preference on different LBD regions. This discrepancy is observed in a6 region, aAF-2, a1-a2 0 , b1 0 -a3 and b1-b1 0 loop.
Earlier studies reported that a6 is less folded or very dynamic in PXR [52,53], enabling a flexible cavity to accommodate ligands of different sizes [19]. We found that compound 100 induced a tightly packed and folded conformation in this region, while a looser con- the water channel between a2 and a6 would be an entry pathway for SR12813 [40]. This region appears as a favourable ligand pathway among nuclear receptors [54]. Interestingly, our simulations suggest that water bridged interactions mediated by E321 and M323 on a6 and H407 on a10/11 appear on this site with SR12813, where no such interaction exists with compound 100. A common feature of a PXR agonist is to engage in direct interactions with H407 of the a10/11 [25,44,45,55], and H407A mutation renders PXR constitutively active. Our simulations suggest that compound 100 does not rely on interactions with H407. This appears to be related to the engagement of H407 in water mediated interaction to N404. Anami et al. [46] proposed a model for vitamin D receptor (VDR) activation or repression called ''folding-door model" to explain VDR-LBD activity. In this model, a11 cooperates with aAF-2 in a way that in the presence of an agonist aAF-2 is stabilized close to a3-helix, forming internal interactions with the a11 kink. These interactions close the door (a11). Meanwhile this kink is open in the presence of an antagonist and the a11-aAF-2 loop plays an important role in the unsuitable aAF-2 positioning for receptor activation. Here, our simulations display a stable H-bond interaction between N404 (on a11) and G278 (on a4) with SR12813, which is an infrequent interaction with compound 100. Hence, we hypothesize that with PXR the ligand-dependent N404 conformation -and the lack of water bridge interaction between H407 and a6 in presence of compound 100 -may result in a more flexible a11 with compound 100. This would result in the rearrangement of the flexible a11-aAF-2 loop. We also observed a liganddependent discrepancy in F420 and a11 interactions, and diminished hydrophobic interaction between compound 100 and this loop. Our findings agree with earlier results reported by Shizu et al. that revealed the important role of F420 (on a11-aAF-2 loop) in PXR aAF-2 stabilization [47]. Moreover, mutation of F429A, located in aAF-2, rendered PXR inactive, and was not inducible by the ligands highlighting its relevance in the binding. Altogether, we conclude that in presence of compound 100 these motifs play a role in dislocation of aAF-2 from the vicinity of a3-helix, which is demonstrated by the distance and angle calculations of aAF-2 as well as by the MSM. This dislocation of aAF-2 was also observed in the simulations of compound 100 in the presence of SRC-1 (C-100 + Co). With SR12813 a stable conformation of aAF-2 is maintained, providing a suitable platform for the co-activator binding and subsequent PXR activation. Of note, not only is compound 100 impairing binding of coactivator SRC-1 but also binding of corepressor SMRT [42]. This is in contrast to what is observed with the full antagonist SPA70, which enables recruitment of the corepressor [27]. Therefore, the induced conformational changes by compound 100 enable competitive PXR antagonism but not full antagonism, which may require that the ligand induces suitable conformations for the corepressor binding.
The adaptability of the hydrophobic subpocket in PXR is emphasized by the different conformations of Y306 and F288 that exist with SR12813 and compound 100. Meanwhile, mutation of Y306 renders PXR inactive with both ligands. This could be attributed to the loss of PXR-LBD integrity, as in multiple PXR crystal structures there exists a H-bond interaction between Y306 and H242 (located on a3-helix), highlighting the important role of this residue in PXR structural stability.
The highest flexibility of PXR-LBD was observed in a1-a2 0 loop with both compound 100 and SR12813. This flexibility is well exemplified by structural data, as this region is disordered in all publicly available PXR structures. Earlier studies show also the high degree of flexibility in this region in the presence of agonist [56,57]. Our MD data displayed somewhat lower RMSF for this region with compound 100. MSM suggested that in some meta- DMSO, 10 lM rifampicin or 1 lM SR12813 (C), or 10 lM compound 100. Data is presented as mean relative activity ± SD to DMSO-treated WT PXR from five independent experiments with technical triplicates. yyyp < 0.001 compared to DMSO-treated WT PXR analysed with two-way anova with Dunnett's multiple comparisons test. *p < 0.05, **p < 0.01, ***p < 0.001, compared to DMSO-treated respective mutant analysed with two-way anova with Tukey's multiple comparison test. stable states there exists rather state-specific conformation for this loop, while in others states more disordered configuration appears.
It is worth noting that this loop (in proximity of a2 0 ) is part of one of the proposed water channels in PXR [40] that stretches along b1 0 -a3 loop [58]. For this b1 0 -a3 loop, MSM displayed a clear deviation from the agonist-associated conformation. Further study is required to disclose the role of different conformations of the ambiguous a1-a2 0 loop together with b1 0 -a3 loop, to better understand their interrelation on the ligand binding and specificity.
Our simulations suggest that the b1-b1 0 loop is tightly packed on the LBD with compound 100. Conversely, with SR12813 this substructural element resides farther from the a3-helix. Interestingly, the W223A mutation on this outer interface renders PXR inactive with both SR12813 and compound 100. The functional relevance of this residue was also revealed by Nobel et al. [51]. Of note, this residue is part of PXR-LBD insertion [58].
For BAY-1797, that shares an identical phenylacetamide moiety with compound 100, we observed an intermediate profile in between SR12813 and compound 100. It must be noted that SR12813 is a more potent PXR agonist than BAY-1797. In the hydrophobic subpocket BAY-1797 shows a similar behaviour as compound 100. In addition, in the presence of BAY-1797 the distance between aAF-2 and a3-helix is increased compared to SR12813, but not to the extent of compound 100. Nevertheless, the distance between T248 and T422 with BAY-1797 is close to what is observed with SR12813. It was noted by Werner et al. [43] that changing the 3-chlorophenoxy with a larger and more polar substituent alleviated PXR agonism, which could occur either by diminishing PXR binding or by disrupting the aAF-2. In this regard, we suggest that along with the classical mechanism of action of a nuclear receptor, the more specific distance of T248 and T422 may be useful when analysing potential risk for PXR agonism by MD simulations.
Overall, our results revealed the adaptability and ligandspecificity in PXR conformational behaviour. More folded b1-b1 0 loop, compact a6 region, lower flexible a1-a2 0 loop, dislocated b1 0 -a3 loop and aAF-2 are associated with compound 100. Our study provides more in depth understanding of ligand-induced changes in PXR-LBD substructures. Although our compound is a competitive antagonist [42] and not a full antagonist as SPA70 [27], these results still provide a putative template for designing PXR antagonists. Finally, the identified structural key-regions provide guidance how to potentially avoid PXR agonism.

MD simulations
Modelling was conducted with Maestro (Schrödinger Release 2019-4: Maestro, Schrödinger, LLC, New York, NY, 2019), and OPLS3e force field [59,60], unless otherwise stated. For the simulations of C-100 and C-100 + SRC, we used the PXR crystal structure PDB ID: 4J5W (chain A) [61] and for the simulation of SRL and SRL + Co, we used the PXR crystal structure PDB ID: 1NRL (chain A) [22]. For BAY-1797 simulations, we applied its close analogue PXR co-crystal structure PDB ID: 6HTY (chain B) [43], and the redundant methyl group of the analogue was deleted. The residues missing in the C-terminal (G433 and/or S434) were added to the structures with Maestro tools. For SRL and C-100 systems, SRC-1 was removed. The proteins were prepared using Protein Preparation Wizard (Schrödinger LLC, New York, NY, 2019) [62]. Missing hydrogen atoms were added, bond orders were assigned using CCD database, missing side chains and loops (a1-a2 loop; residues 178-192 for 4J5W and a1-a2 loop; residues 178-191 for 1NRL and 6HTY) were filled to the structure using Prime [63], protonation states of amino acids were optimized with PROPKA (Schrödinger, LLC, New York, NY, 2019), the coactivator peptide was capped in both termini and the structures were minimized. To obtain the starting configuration for compound 100, Glide docking was conducted (Glide v. 7.7) [64,65]. Before docking, compound 100, was prepared with LigPrep (Schrödinger, LLC, New York, NY, 2019) to assign the protonation state (Epik; at pH 7.0 +/-2.0) and the partial charges. For the docking of compound 100, default settings were applied, with residues H407, Q285, S247, H327 and F429 selected to define the active site, and the docking was conducted using the standard-precision (SP) and extra-precision (XP) level of accuracy [66]. The docking resulted in an U-shape pose (mainly accommodated in hydrophobic subpocket) from the SP (docking score: À10.542; glide emodel: À86.922) and an extended pose (oriented from hydrophobic subpocket with benzyl moiety, while benzosuberone oriented towards aAF-2 region) from XP (docking score: À13.647, emodel: À98.455). We evaluated these two distinguishable poses in short MD simulations (data not shown) and the pose with extended conformation displayed better stability during the simulation, which was selected as a starting configuration for compound 100 in the production simulations. Of note, based on our later evaluation by the QM Conformer predictor tool, this extended conformation is also proposed for compound 100 as the lowest energy conformation in water [42]. The same pose was also used in C-100 + Co simulation where the SRC-1 peptide was maintained.
For the simulations, we used Desmond MD simulation engine [67]. The prepared systems were solvated in a cubic box with the size of the box set as 15 Å minimum distance from the box edges to any atom of the protein. TIP3P water model [68] was used to describe the solvent and the net charge was neutralized using K + ion with final salt concentration of 150 mM. RESPA integrator timesteps of 2 fs for bonded and near, and 6 fs for far were applied. The short-range coulombic interactions were treated using a cutoff value of 9.0 Å. Before the production simulations the systems were relaxed using the default Desmond relaxation protocol. Simulations were run in NPT ensemble, with temperature of 310 K (Nosé-Hoover thermostat) and pressure of 1.01325 bar (Martyna-Tobias-Klein barostat). For each system, simulations of five replicas with different lengths (2-4 ls) were carried out, resulting in total of 30 ls simulation data for C-100, 20 ls for SRL, 10 ls for SRL + Co, 10 ls for C-100 + SRC and 10 ls for BAY-1797 (SI Table 3; SI   Fig. S13). With C-100, additional independent replicas derived from the original simulations were run to obtain sufficient sampling for MSM.

Analysis of MD simulation data
Principal component analysis. PCA was conducted for the backbone atoms using GROMACS tools (version 2019) (gmx covar and gmx anaeig) [69]. For GROMACS analysis, the Desmond trajectories were aligned and transformed to xtc-format, keeping only backbone atoms. Figures describing the extreme motions were generated and visualized using PyMOL-script Modevectors [70].
RMSD, RMSF, protein secondary structure elements (SSE) and interaction analysis. Maestro simulation interaction analysis tool (Schrödinger, LLC) was used for these analyses. For interaction criteria, default values were used. H-bonds: cut-off of 2.5 Å for donor and acceptor atoms, donor angle of 120°and acceptor angle of 90°. Hydrophobic interactions: cut-off of 3.6 Å between ligand's aromatic or aliphatic carbons and a hydrophobic side chain, p-p interaction was defined as two aromatic groups stacked face-to-face or face-to-edge. Water bridge interactions: default cut-off of 2.8 Å for donor and acceptor atoms, donor angle of 110°and acceptor angle of 90°.
Angle and distance calculations. Maestro event analysis tool (Schrödinger, LLC) was used. Distances between specific secondary structure elements were calculated using their centers of mass with the Maestro script trj_asl_distance.py (Schrödinger LLC). For a3-helix residues 240-260 and aAF-2 residues 423-430 were used. For the distance calculation Ca atom of each residue was used. The angles were calculated using the Ca atom of N404, F281, F429 for h and Ca atom of N404, F281, T422 for x with the Maestro script event_analysis.py and analyze_simulation.py (Schrödinger LLC).
Markov state modelling. Bayesian MSM was generated with PyEMMA 2 following the general recommendations [71]. As an input, we used full protein backbone torsion angles. Time-lagged independent component analysis (TICA) was used for dimension reduction [72], using 10 ns as a lag time, and two dimensions. The output of TICA was discretized to microstates using the kmeans clustering (number of clusters set as p N), and the microstates were assigned in five macrostates (metastable states) by the Perron-cluster cluster analysis (PCCA++) method [73]. Implied timescales and Chapman-Kolmogorov test suggest a valid model (SI Fig. S14). Structure and data visualization. Structure visualization was conducted with PyMOL v.2.4 (Schrödinger LLC, New York, NY, USA). Data visualization was completed by Python 3.7, seaborn [74], matplotlib [75] and GraphPad prism (v. 8.0.0 for Windows, GraphPad Software, San Diego, CA, USA).

Cell culture
HepG2 cells (HB-8065, lot number 58341723, ATCC, Manassas, VA) were cultivated at 37°C, 5% CO 2 in MEM, which was supplemented with 10% FBS, 2 mM glutamine, 100 U/ml penicillin and 100 lg/ml streptomycin. HepG2 cells were originally obtained at passage 74, propagated and used in the experiments between passages 93 and 104. In chemical treatments, regular FBS was replaced by dextran-coated charcoal-treated FBS. Cells were routinely checked for contamination with mycoplasma by PCR (VenorGeM Classic, Minerva Biolabs, Berlin, Germany).

Transient transfections
Transient batch transfection with HepG2 was conducted using 0.6 ll JetPEI transfection reagent per well in a final volume of 25 ll (Polyplus, Illkirch, France). Per well, 0.27 lg pGL4-CYP3A4 (-7830D7208-364) luciferase reporter gene plasmid, 0.01 lg Metridia luciferase plasmid pMetLuc2-control and 0.03 lg of expression plasmids encoding human PXR or PXR mutants, were diluted in 150 mM NaCl to a final volume of 25 ll. After at least 24 h incubation, cells were treated for 24 h with 0.1% DMSO, 10 lM rifampicin, 1 lM SR12813 or 10 lM test compounds. Metridia luciferase was measured directly from 10 ll of medium with 100 ll Renilla luciferase assay solution [78] using EnSpire 2300 multimode plate reader (PerkinElmer, Rodgau, Germany) for 0.1 s. For firefly luciferase measurement, cells were lysed with passive lysis buffer (Promega, Madison, WI, USA). 10 ll of lysate was combined with 150 ll firefly luciferase assay solution [76] and activity measured with EnSpire 2300 multimode plate reader for 0.1 s. Results were normalized by dividing firefly luciferase activity by Metridia luciferase activity measured in the same well. Expression of mutants was assessed with Western blot (Supplementary Methods Fig. 1).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.