Recognition Site Modifiable Macrocycle: Synthesis, Functional Group Variation and Structural Inspection

Traditional macrocyclic molecules encode recognition sites in their structural backbones, which limits the variation of the recognition sites and thus restricts the adjustment of recognition properties. Here, we report a new oligoamide-based macrocycle whose recognition functional groups can be varied by post-synthesis modification of its structural backbone. Through six steps of common reactions, the parent macrocycle (9) can be produced on a gram scale with an overall yield of 31%. The post-synthesis modification of 9 to vary the recognition sites is demonstrated by producing four different macrocycles (10–13) with distinct functional groups: 2-methoxyethoxyl (10), hydroxyl (11), carboxyl (12) and amide (13). The 1H NMR study suggests that the structure of these macrocycles is consistent with our design, i.e., a hydrogen bonding network forms at both rims of the macrocyclic backbone. The 1H-1H NOESY NMR study indicates that the recognition functional groups are located inside the cavity of the macrocycles. Finally, a preliminary molecular recognition study shows that 10 can recognize n-octyl-β-D-glucopyranoside (14) in chloroform.


Introduction
In nature, the variety of amino acid side chains is the key for proteins to recognize small molecules and thereby perform the corresponding bio-functions, as in carbohydrate-binding proteins [1-3] and sortase transpeptidases [4][5][6]. Inspired by nature, a large number of artificial molecules and complexes have been synthesized to structurally or functionally mimic functional proteins, for instance, foldamers [7][8][9][10][11][12][13][14][15], macrocycles [16][17][18][19][20][21][22], cages [23][24][25][26][27], hemicagepodates [28][29][30][31][32], etc. Among these, macrocycles are one of the most popular mimics. Thanks to their enhanced structural rigidity, macrocycles can exhibit improved metabolic stability and pharmaceutical activity compared to acyclic biologics [33,34]. For example, biaryl-bridged macrocycles can provide high binding affinity and good target selectivity [35], and these macrocycles can be accessed by intramolecular Suzuki reaction [36,37], McMurry reaction [38], solid-supported synthesis [39] and oxidative coupling reactions [40]. In addition to pharmaceutical applications, there are even more applications in other bio-related fields, such as β-glucopyranose recognition [41], maltodextrin recognition [42], purine and pyrimidine base recognition [43], oligopeptide recognition [44], water transmembrane transport [45], dye recognition for bio-imaging [46], drug delivery [47], etc. However, in most cases, the recognition sites of macrocycles, and of most other host molecules, cannot be varied because they are already encoded in the structural backbones of these molecules. This leads to a common defect of traditional macrocycles: their binding strength and selectivity are not adjustable.
In addition, the limited ability to vary recognition sites also restricts the exploration of the recognition characteristics of many unexploited functional groups and the fair comparison of the binding properties of different functional groups in the same backbone. Herein, we propose a new macrocyclic backbone that allows post-synthesis modification at its inner rim to vary the recognition sites (Figure 1). The parent macrocycle (9) can be produced through convenient reactions with an overall yield of 31%. The inner-rim phenolic hydroxyl groups of the parent macrocycle allow the efficient installation of diverse functional groups inside the macrocycle, as evidenced by the exemplified macrocycles 10-13, bearing 2-methoxyethoxyl (10), hydroxyl (11), carboxyl (12) and amide (13) functional groups, respectively. Finally, NMR studies gave insight into the structure of the macrocycles and the spatial location of the recognition functional groups. The demonstrated ability to install diverse functional groups inside a macrocycle could open up rich possibilities for mimicking protein functions, such as molecular recognition and enzymatic catalysis.

Results and Discussion
The design of macrocyclic molecules often encodes recognition sites in the structural backbone, which restricts the variation of recognition functional groups. Incorporating recognition sites into side chains is a feasible strategy to address this problem. For example, zinc porphyrin hosts can change their side chains by using different aromatic aldehydes or by post-synthesis modification, thus endowing these hosts with tunable binding affinity to nitrogen-containing bases [48,49]. We also used this strategy to access macrocycles with variable recognition sites. The backbone of the target macrocycles is designed as the combination of two tethers (for facilitating the post-synthesis modification of functional groups inside the cavity) and two spacers (for varying the size of the cavity). Through the inner-rim hydrogen bonding network, the tether (aromatic oligoamide section, Figure 1, in black) can be endowed with structural rigidity and an appropriate curvature, which increases the likelihood of forming the desired macrocycle. Moreover, the tether can be produced through many combinations of small fractional units, such as the pyridine-benzene-pyridine (py-ben-py) combination shown here. Each fractional unit can be easily modified independently with the desired side chains on the outer rim for various purposes, such as solubility in organic or aqueous solvent systems. More importantly, with the proposed aromatic oligoamide sequence, py-ben-py, various recognition functional groups can be easily installed on the inner rim of the parent macrocycle through simple substitution reactions (Figures 1 and 2). On the other hand, the size of the cavity of the macrocycle depends on the spacer (here, a biphenyl unit, Figure 1, in plum). The spacer can provide π-π and dispersion interactions, as well as hydrophobic interactions in aqueous media. By varying the length of the spacer, the size of the cavity could also be adjusted to accommodate guests of different sizes. Furthermore, by increasing the number of connection sites of the spacer (by using 3,3′-diaminobenzidine, for example), the macrocyclic molecules could be readily transformed into cage molecules to access higher binding affinity and better selectivity, mimicking the protein recognition pocket.
Figure 2. Variation of functional groups on the parent macrocycle (9). Here, the synthesis of compounds 10-13 is shown as an example.
The synthetic routes towards the parent macrocycle could be flexible; one synthetic approach is shown here as an example (Scheme 1). First, the precursor of the middle unit of the tether, 2-methoxy-2-oxoethyl 3,5-diamino-4-hydroxybenzoate (3), was prepared from the commercial compound 1. By applying mild reaction conditions and a limited amount of methyl 2-bromoacetate, selective substitution on the carboxyl group could be achieved with a high yield of 90%. Afterwards, the nitro groups of the resulting compound 2 were quantitatively converted to amino groups by hydrogenation under the catalysis of 10% Pd/C to give the target precursor, compound 3. It should be noted that diamino-4-hydroxybenzoate is readily oxidized and we experienced many failures before arriving at the present version with the methyl ester side chain. The pyridine unit of the tether was prepared in the mono-ester form, compound 5, through a saponification reaction. Then, the carboxylic acid group of compound 5 was converted to the acid chloride by oxalyl chloride, followed by the addition of the spacer precursor 6 to afford the diester compound 7. Afterwards, saponification of compound 7 was performed to unmask the carboxylic acid groups, giving compound 8, which was ready for amide coupling (cyclization) with the previously prepared diamino compound 3. The final [2+2] cyclization step with the coupling reagent PyBop was very efficient (57% yield) thanks to the encoded hydrogen bonding network, which can guide the fragments to assemble in the desired way. Thereby, the present synthetic approach allows the production of the parent macrocycle 9 in a very efficient way: it requires only one column chromatography, takes less than 10 days, and can be performed on a gram scale with an overall yield of 31%.

Scheme 1. Synthesis of the parent macrocycle (9).
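As a quick consistency check on the stated overall yield, the isolated yields reported in the text and in the experimental section can simply be multiplied. The sketch below is only illustrative: it treats the "quantitative" hydrogenation as 100% and multiplies the yields of both branches of the convergent route, which is an assumption about how the 31% figure was obtained; the dictionary keys are descriptive labels introduced here.

```python
from math import prod

# Isolated yields taken from the text and the experimental section.
# Assumption: "quantitative" hydrogenation is counted as 1.00, and the
# overall yield is the plain product over both branches of the route.
step_yields = {
    "1 -> 2 (esterification)": 0.90,
    "2 -> 3 (hydrogenation, quantitative)": 1.00,
    "4 -> 5 -> 7 (two steps, based on spacer 6)": 0.68,
    "7 -> 8 (saponification)": 0.90,
    "3 + 8 -> 9 ([2+2] cyclization)": 0.57,
}

overall_yield = prod(step_yields.values())
print(f"overall yield = {overall_yield:.1%}")  # overall yield = 31.4%
```

Under these assumptions the product of the step yields agrees with the reported overall yield of 31%.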
The variation of the recognition functional groups inside the parent macrocycle 9 can be easily achieved through substitution reactions. Here, we show the installation of four different functional groups as an example (Figure 2): the 2-methoxyethoxyl group (10) to represent ethylene glycol functional groups, and hydroxyl (11), carboxyl (12) and amide (13) groups to mimic the serine, aspartic acid and asparagine side chains, respectively. The installation of the 2-methoxyethoxyl and hydroxyl groups is very straightforward. Through a one-step substitution reaction on the phenolic hydroxyl groups of 9 with the corresponding bromide, the target macrocycles 10 and 11 can be produced in good yields, 47% and 41%, respectively. The production of macrocycle 12 with carboxyl groups required two steps: a similar substitution reaction with tert-butyl bromoacetate, followed by the deprotection of the tert-butyl group with trifluoroacetic acid. Although it required one more step, the two-step overall yield was even higher, reaching 63%. Starting from macrocycle 12, the conversion to the amide group can be easily achieved. Through acylation with oxalyl chloride, the carboxyl group was converted to the acid chloride. Afterwards, by the addition of ammonia, the macrocycle with the amide groups (13) can be produced with an overall yield of 28%, starting from the parent macrocycle 9. Additionally, we also attempted to install the amide group directly onto the parent macrocycle by using the corresponding bromide. However, the yield was much lower and the separation was difficult.
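The yields quoted above also allow a rough estimate of the efficiency of the final conversion of 12 into 13, which the text does not state directly. The sketch below assumes that the 28% overall yield of 13 was obtained via isolated 12 at the reported 63% two-step yield; the variable names are illustrative only.

```python
# Reported yields, all relative to the parent macrocycle 9.
yield_9_to_12 = 0.63  # two steps: alkylation + tert-butyl deprotection
yield_9_to_13 = 0.28  # overall, via the acid chloride of 12

# Assumption: 13 was made from isolated 12, so the yields multiply.
implied_yield_12_to_13 = yield_9_to_13 / yield_9_to_12
print(f"implied yield for 12 -> 13: {implied_yield_12_to_13:.0%}")  # 44%
```

On this reading, the acid chloride/amidation sequence proceeds with a yield comparable to the one-step alkylations that give 10 (47%) and 11 (41%).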
Afterwards, 1H NMR studies were conducted to provide evidence for the formation of the designed hydrogen bonding network. Here, we take macrocycle 10 as an example. The 1H NMR spectra of compounds 2 and 10 in CDCl3 show the benzene protons (Hben, Figure 3, in green) at 9.00 and 9.17 ppm, respectively. It is reasonable for the Hben of 2 to show a significant downfield shift since there are two nitro groups nearby. However, the Hben of 10 are shifted even further downfield. This is probably due to the rather rigid structure of compound 10, which favors stronger hydrogen bonding between the Hben of 10 and the nearby oxygen atoms. Similar observations can also be made for the pyridine protons (Hpy, Figure 3, in blue). These observations in turn imply that the hydrogen bonding network is formed in the designed manner. Moreover, the presence of two distinct amide signals (HNH), one at 10.34 ppm and the other at 9.35 ppm, in the 1H NMR spectrum of 10 (Figure 3a, in red) is also in good agreement with the designed hydrogen bonding network, in which one HNH can form two hydrogen bonds with the nearby nitrogen and oxygen atoms while the other HNH can only form one hydrogen bond with the nearby nitrogen atom. Thereby, the backbone structure of the macrocycle is shown to form as proposed, which in turn suggests that the installed functional groups should be oriented towards the cavity of the macrocycle.
In order to gain further information on the structure, especially the location of the installed functional groups, 2D NMR studies were performed. Here, we again take macrocycle 10 as an example. First, the 1H-1H NOESY spectrum of 10 in CDCl3 clearly shows the spatial correlation between HNH1 and the methyl protons of the spacer (HNH1↔HM, Figure S3 in Supplementary Materials), yet the correlation HNH2↔HM is not observed, which indicates that HNH2 is the HNH more remote from HM. This is consistent with the above conclusion drawn from the 1H NMR chemical shifts. Similarly, other relevant correlations, such as HNH2↔Hben (Figure S3 in Supplementary Materials), can also be observed. Then, the protons of the installed 2-(2-methoxyethoxy)ethyl group can be assigned with the help of the 1H-1H NOESY and 1H-1H COSY spectra (Figures S4 and S5 in Supplementary Materials). For instance, the correlations HNH2↔H5, H5↔H4, H4↔H3 and H3↔H2 can be observed sequentially, indicating that H5 is the one closest to the backbone while H2 is remote from the backbone. With the identification of these key protons, the spatial correlations between the protons of the installed functional group and the protons of the parent macrocycle can be rationalized. For example, the presence of the correlations HM↔H5, HM↔H4 and HM↔H3, together with the absence of the correlations HM↔H2 and HM↔H1 (Figure 4), indicates that part of the 2-methoxyethoxyl group is located in the cavity while the methoxy terminal part is located outside of the cavity. This observation is in good agreement with the size of the 2-methoxyethoxyl group.
It would cause severe steric hindrance between the two 2-methoxyethoxyl groups if the whole chains were located inside the cavity. Similarly, other relevant correlations, such as HNH2↔H5, HNH2↔H4, HNH1↔H5, HNH1↔H4 and HNH1↔H3, can be observed as well (Figure 4). Since the functional groups of macrocycles 11-13 are much shorter (i.e., hydroxyl, carboxyl and amide groups, respectively), it is reasonable to deduce that these functional groups should also be located inside the cavity of the parent macrocycle. Therefore, all NMR results support the formation of the designed structure.
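The reasoning from the NOESY data can be summarized as a simple lookup. The sketch below encodes only the HM correlations reported for macrocycle 10 and applies a deliberately simplified, hypothetical rule (a cross-peak with the spacer methyl protons HM is read as "inside the cavity"); in practice, the absence of a NOE cross-peak is weaker evidence than its presence, so this is an illustration of the argument rather than an analysis method.

```python
# Side-chain protons ordered from the backbone (H5) to the methoxy
# terminus (H1), following the assignment described in the text.
chain = ["H5", "H4", "H3", "H2", "H1"]

# NOESY cross-peaks with the spacer methyl protons (HM) reported for 10.
observed_with_HM = {"H5", "H4", "H3"}


def locate(proton: str) -> str:
    """Hypothetical rule: an HM cross-peak is read as 'inside cavity'."""
    return "inside cavity" if proton in observed_with_HM else "outside cavity"


assignment = {proton: locate(proton) for proton in chain}
for proton, where in assignment.items():
    print(f"{proton}: {where}")
```

With the reported correlations, H5, H4 and H3 are classified as inside the cavity and H2 and H1 as outside, matching the conclusion drawn in the text.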
Molecules 2023, 28, x FOR PEER REVIEW
Finally, a preliminary molecular recognition study was conducted. Upon adding n-octyl-β-D-glucopyranoside (14) to the CDCl3 solution of 10, changes in the chemical shifts of the HNH1 and biphenyl (HBP) protons of 10 were observed (Figure 5, in red and blue), indicating that macrocycle 10 can recognize compound 14. In addition, chemical shift changes of some protons at the outer rim of the backbone, such as Hben (Figure 5, in green), were also observed, implying that the macrocycle twists its backbone to adapt to the shape of the guest. However, due to the steric hindrance between the 2-methoxyethoxyl side chains of 10 and the octyl side chain of 14, the binding between 10 and 14 is weak. In the future, we will screen diverse guest molecules to achieve higher binding affinity and selective recognition with the macrocycle.

General Information
All commercially available starting materials and reagents were used without further purification. Anhydrous THF and DCM were obtained from commercial sources. Analytical thin layer chromatography (TLC) was performed on silica gel plates (Merck 60F254) visualized with a UV lamp (254 nm). Column chromatography was performed with commercial glass columns using silica gel 200-300 mesh (particle size 0.045-0.075 mm). High resolution electrospray ionization time-of-flight (HRESI-TOF) mass spectra were measured in the positive ion mode on an Agilent 6230 mass spectrometer.
The 1H NMR spectra were recorded on a Bruker Avance III HD 400 in CDCl3 or DMSO-d6. Chemical shifts are reported in ppm relative to the residual solvent signal of CDCl3 (δ = 7.26 ppm) or DMSO-d6 (δ = 2.50 ppm). Abbreviations used for signal multiplicity are: s = singlet, d = doublet, t = triplet, q = quartet, m = multiplet or overlap of nonequivalent resonances, br = broad. Coupling constants, J, are reported in Hertz (Hz). 13C{1H} NMR spectra were recorded on a Bruker AVANCE III HD 400 in CDCl3 or DMSO-d6.
(2) 4-Hydroxy-3,5-dinitrobenzoic acid (compound 1, 11.4 g, 50 mmol) and K2CO3 (10.36 g, 75 mmol) were suspended in 200 mL DMF, followed by the addition of methyl 2-bromoacetate (7.1 mL, 75 mmol) at 50 °C. The mixture was stirred at this temperature for 12 h and cooled to room temperature. The reaction mixture was concentrated under vacuum and the residue was treated with water to precipitate the crude product. The solid was filtered, dried and washed with methanol to afford the title compound as a yellow solid (13.5 g, 90%); 1H NMR (400 MHz, CDCl3) δ (ppm) = 11.82 (br, 1H), 9.00 (s, 2H), 4.93 (s, 2H), 3.82 (s, 3H); 13C NMR …
(3) Compound 2 (5.1 g, 17 mmol) and 10% Pd/C (0.51 g) were suspended in 100 mL anhydrous THF under N2. Then, N2 was exchanged with H2 and the mixture was stirred at room temperature for 24 h. Afterwards, Pd/C was filtered off and the solution was concentrated under vacuum to afford the title compound as a dark green sticky solid (4.1 g, quantitative). Note: compound 3 was readily oxidized and, thus, it was used immediately in the next step without further purification and characterization.
(7) Dimethyl 4-(octyloxy)pyridine-2,6-dicarboxylate (compound 4, 8.1 g, 25 mmol) was dissolved in 50 mL THF, followed by the addition of 5 M NaOH aqueous solution (5 mL, 25 mmol). The mixture was stirred at room temperature for 12 h. THF was removed under vacuum and the reaction mixture was acidified with 1 M HCl. The resulting precipitate was collected and dried to give the crude intermediate 5. Then, intermediate 5 was dissolved in 100 mL anhydrous DCM, followed by the addition of oxalyl chloride (4.23 mL, 50 mmol) under N2. The solution was stirred at room temperature for four hours. The solvent and excess oxalyl chloride were removed under vacuum and the residue was re-dissolved in 50 mL anhydrous THF. Compound 6 (2.4 g, 10 mmol) and DIPEA (8.3 mL, 50 mmol) were dissolved in another 50 mL anhydrous THF. The two THF solutions were then mixed immediately and the mixture was stirred at room temperature under N2 for 1 h. The solvent was removed under vacuum and the crude product was washed with methanol and ethyl acetate to give the title compound as a white solid (5.6 g, 68%). 1H NMR (400 MHz, …
(8) Compound 7 (4.1 g, 5 mmol) was dissolved in 20 mL THF, followed by the addition of 5 M LiOH aqueous solution (3 mL, 15 mmol). The mixture was stirred at room temperature for one hour and THF was removed under vacuum. The residue was acidified with 1 M HCl and the resulting solid was filtered and dried to give the title compound as a white solid (3.6 g, 90%); 1H NMR (400 MHz, DMSO-d6 …
(9) Compound 3 (0.6 g, 2.5 mmol), compound 8 (2.0 g, 2.5 mmol), PyBop (3.9 g, 7.5 mmol), and DIPEA (1.7 mL, 10 mmol) were dissolved in 200 mL DMF. The mixture was stirred at 80 °C for 12 h. The reaction mixture was concentrated under vacuum, followed by the addition of 1 M HCl to adjust the pH to 1-2. The resulting solid was collected by filtration and purified by column chromatography (eluent: DCM/MeOH = 10:1 v/v).
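For readers who want to rescale the macrocyclization, the reagent equivalents and the substrate concentration follow directly from the amounts given above. The short sketch below uses only the quantities stated in the procedure; the variable names are illustrative.

```python
# Amounts used in the [2+2] cyclization, as stated in the procedure (mmol).
amounts_mmol = {
    "compound 3": 2.5,
    "compound 8": 2.5,
    "PyBop": 7.5,
    "DIPEA": 10.0,
}
volume_mL = 200.0  # DMF

# Both coupling partners are used in equal amounts; take that as 1 equiv.
limiting = min(amounts_mmol["compound 3"], amounts_mmol["compound 8"])
equivalents = {name: n / limiting for name, n in amounts_mmol.items()}

# Concentration of each coupling partner: mmol/mL converted to mmol/L (mM).
concentration_mM = limiting / volume_mL * 1000

for name, eq in equivalents.items():
    print(f"{name}: {eq:.1f} equiv")
print(f"substrate concentration: {concentration_mM:.1f} mM")
```

This gives a 1:1:3:4 ratio of compound 3, compound 8, PyBop and DIPEA at a substrate concentration of 12.5 mM.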