Coding and decoding libraries of sequence-defined functional copolymers synthesized via photoligation

Designing artificial macromolecules with absolute sequence order represents a considerable challenge. Here we report an advanced light-induced avenue to monodisperse sequence-defined functional linear macromolecules up to decamers via a unique photochemical approach. The versatility of the synthetic strategy—combining sequential and modular concepts—enables the synthesis of perfect macromolecules varying in chemical constitution and topology. Specific functions are placed at arbitrary positions along the chain via the successive addition of monomer units and blocks, leading to a library of functional homopolymers, alternating copolymers and block copolymers. The in-depth characterization of each sequence-defined chain confirms the precision nature of the macromolecules. Decoding of the functional information contained in the molecular structure is achieved via tandem mass spectrometry without recourse to their synthetic history, showing that the sequence information can be read. We submit that the presented photochemical strategy is a viable and advanced concept for coding individual monomer units along a macromolecular chain.

The exact mass of the measured sample matches the assigned species (see Supplementary Table 19); however, there is likely an additional signal from a doubly charged cluster of two molecules.

The exact mass of the measured sample matches the assigned species (see Supplementary Table 22); however, there is likely an additional signal from a doubly charged cluster of two molecules.

The exact mass of the measured sample matches the assigned species (see Supplementary Table 25); however, there is likely an additional species ionized with one additional proton.

The exact mass of the measured sample matches the assigned species (see Supplementary Table 27); however, there is likely an additional signal from a doubly charged cluster of two molecules.

Table 29. Summary of the identified ions from the MALDI-ToF-ToF spectrum for compound 7c (noted (M1)3-X-(M1)3). Ions terminated with maleimide end-groups are denoted "Mal" and those with a furan-protected maleimide end-group "Fm". The theoretical m/z value is given first as the exact mass, followed by the nominal mass in brackets (unless otherwise indicated, the exact and nominal values are identical). The identified structures are depicted in Supplementary Fig. 150.

Table 30. Summary of the identified ions from the MALDI-ToF-ToF spectrum for compound 8a (noted (M2)(M1)2-X-(M1)2(M2)). Ions terminated with maleimide end-groups are denoted "Mal" and those with a furan-protected maleimide end-group "Fm". The theoretical m/z value is given first as the exact mass, followed by the nominal mass in brackets (unless otherwise indicated, the exact and nominal values are identical). The identified structures are depicted in Supplementary Fig. 153.

Table 31. Summary of the identified ions from the MALDI-ToF-ToF spectrum for compound 9b (noted (M1M2M1)-X-(M1M2M1)). Ions terminated with maleimide end-groups are denoted "Mal" and those with a furan-protected maleimide end-group "Fm". The theoretical m/z value is given first as the exact mass, followed by the nominal mass in brackets (unless otherwise indicated, the exact and nominal values are identical). The identified structures are depicted in Supplementary Fig. 156.

(1), 3a,4,7,7a-tetrahydro-4,7-epoxyisobenzofuran-1,3-dione (2), and 1,6-hexanebismaleimide (3) were synthesized according to the literature. The synthesis of the Boc-protected compound 1a was carried out as reported and adapted to 3a, (4) as was the transformation of 1b and 3b into furan-protected maleimide functional molecules. (5) This method was further adapted for the synthesis of 2b, 4b, 5b and 6b. The Boc deprotection was conducted following the protocol reported by Burkart et al. and applied to all monomer intermediates (1c-6c). (6) The lysine-Fmoc-NBoc esterification and Fmoc deprotection were reported (7) for compounds 2a and 2b and adapted to the intermediates 4a, 4b, 5a, 5b, 6a and 6b.

The mass-spectrometric data of the current study are presented in several spectra and tables, first to demonstrate the monodisperse character of the synthesized macromolecules and second because different fragments and counter ions were identified. For the MALDI-ToF-ToF experiments, a separate numbering is introduced for clarity at the end of the Supporting Information Section; it should not be confused with the numbering used in the rest of the document. In the MALDI-ToF data collation tables, the theoretical and experimental exact masses of the identified molecules are compared. However, for higher m/z values, only the nominal mass (provided in brackets) is experimentally accessible and is compared with the calculated theoretical nominal mass (also in brackets).
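To make the distinction between exact and nominal mass concrete, the following minimal Python sketch computes both for an ion formula. It is an illustration only and not the authors' data-processing routine: the function names are hypothetical, the parser handles only simple formulas without brackets, and the isotope masses are standard monoisotopic values. The formula used is the sodium adduct C37H36FN3NaO7⁺ quoted in the ESI-MS entry at the end of this section.

```python
import re

# Monoisotopic mass (u) and mass number of the most abundant isotope.
ISOTOPES = {
    "C": (12.0, 12),
    "H": (1.00782503, 1),
    "N": (14.00307400, 14),
    "O": (15.99491462, 16),
    "F": (18.99840316, 19),
    "Na": (22.98976928, 23),
    "K": (38.96370649, 39),
}
ELECTRON_MASS = 0.00054858  # u


def parse_formula(formula):
    """Return {element: count} for a simple formula such as 'C37H36FN3NaO7'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def exact_mass(formula, charge=0):
    """Monoisotopic mass; electrons are removed for a positive charge."""
    counts = parse_formula(formula)
    mass = sum(ISOTOPES[el][0] * n for el, n in counts.items())
    return mass - charge * ELECTRON_MASS


def nominal_mass(formula):
    """Sum of the integer mass numbers of the most abundant isotopes."""
    counts = parse_formula(formula)
    return sum(ISOTOPES[el][1] * n for el, n in counts.items())


ion = "C37H36FN3NaO7"
print(f"exact:   {exact_mass(ion, charge=1):.3f}")
print(f"nominal: {nominal_mass(ion)}")
```

For a singly charged ion the m/z value equals the ion mass, so the exact value can be compared directly with the tabulated entry and the nominal value with the entry in brackets.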

Principle of Photo-caged Diene (Photoenol) Ligation
The benzaldehyde (photo-caged diene) entity absorbs UV light and, after intersystem crossing (ISC), reaches an excited triplet state (unpaired electrons with parallel spins). A biradical is formed and is subsequently stabilized in the form of an ortho-quinodimethane. The reaction of this diene with an ene generates an aromatic ring, which is the driving force of the cycloaddition.

Reactions Performed in the Flow Reactor
For the sequential synthesis of molecules of the first and second sequence order, the photoreaction was performed in a flow reactor designed by the authors (refer to Supplementary Fig. 38). The reactor consists of a quartz spiral (length 2 m, inner diameter 7 mm), and the reaction mixture was pumped by a Heidolph 5201 peristaltic pump. The solution was pumped at 14 rpm via a 0.7 mm Viton tube, corresponding to a flow of 5 mL min⁻¹ and resulting in a residence time of approx. 45 minutes. The irradiation was performed with a PL-L lamp (36 W) with a total emission in the UV-A range of 91.8 W m⁻² (1.4 W m⁻² in the UV-B range, 1.8 W m⁻² in the UV-C range) at a distance of ca. 5 cm from the spiral. The emission spectrum can be found in Supplementary Fig. 39. Purging with inert atmosphere and filling the reactor with dry DCM was a prerequisite prior to irradiating the actual reaction mixture. The reaction mixture was prepared beforehand from dried compounds, stored under inert atmosphere and dissolved in dry DCM at a concentration of 5 mmol L⁻¹ of 1,6-hexanebismaleimide (first sequence), 2.5 mmol L⁻¹ of the previously synthesized dimer (second sequence), and 1.25 mmol L⁻¹ of the previously synthesized tetramer (third sequence).
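The relation between the amount of core molecule, the target concentration and the volume of dry DCM can be sketched as follows. This is a minimal illustration, not part of the reported procedure: the function and dictionary names are hypothetical, and the worked value uses the 0.413 mmol of 7a quoted in the synthesis of 7b further below.

```python
# Target concentrations (mmol/L) of the core molecule per sequence order,
# as stated for the flow reactions.
TARGET_CONCENTRATION = {1: 5.0, 2: 2.5, 3: 1.25}


def solvent_volume_ml(amount_mmol, concentration_mmol_per_l):
    """Volume of solvent (mL) needed to reach the target concentration."""
    if concentration_mmol_per_l <= 0:
        raise ValueError("concentration must be positive")
    return amount_mmol / concentration_mmol_per_l * 1000.0


# Second sequence order: 0.413 mmol of the dimer at 2.5 mmol/L.
volume = solvent_volume_ml(0.413, TARGET_CONCENTRATION[2])
print(f"{volume:.1f} mL of dry DCM")
```

The result is close to the 165 mL of dry DCM quoted for that step.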

Reactions Performed in Batch
For smaller amounts, the reaction mixture was irradiated in Pyrex vials filled with 5.0 mL of solution after purging with inert gas and exposed to irradiation for 45 minutes with the same PL-L lamp (36 W, emission spectrum available in Supplementary Fig. 39) at a distance of ca. 5.0 cm. The concentrations were similar to those of the photoreactions performed under flow conditions, i.e. 5.0 mmol L⁻¹ for the first sequence order, 2.5 mmol L⁻¹ for the second, 1.25 mmol L⁻¹ for the third and 0.325 mmol L⁻¹ for the fifth. The reaction was performed with the vials placed on the rotating support, without stirring.

NMR Characterization of the Copolymers
Due to the complexity of the synthesized molecules at high sequence orders, and to make the NMR spectra more accessible to the reader, the proton chemical shift assignments are given in detail for the dimer as an exemplary analysis. The NMR spectra of the homopolymer series 7a-7 from monomer 1 constitute reference spectra for the polymer chain backbone. For the other copolymers, only characteristic peaks from the introduced side-functions or chain termini are presented. Due to the symmetrical geometry of the 1,6-bismaleimide-based polymers, only one arm of the molecule is represented, the other arm being strictly identical.

MALDI-ToF-ToF Mass-Spectrometry
Each hexamer 7c, 8a, 9b and 10c was analyzed via MALDI-ToF-ToF mass spectrometry in order to decode the sequence order. Due to the complexity of the obtained spectra, the identified ionic species are numbered differently from the rest of the Supporting Information Section. Considering the first identified peak (noted M Na in the spectrum) with the highest m/z value, one can identify the molecule of interest after the loss of two furan moieties (the end-groups are denoted in that case "-Mal") and with different counter ions (K⁺, Na⁺ and H⁺). The molecule undergoes successive fragmentation steps, each corresponding to the loss of one monomer unit, as classified in the corresponding table. In some cases, the fragmentation route can divide into different pathways and generate specific ions which either did not undergo the loss of a furan end-group (noted in that case "-Fm") or result from a partial fragmentation of the monomer unit. For clarity, a mechanism for the different fragmentation pathways is given with the spectrum, as well as the identified structures for ions generated from the third to the sixth fragmentation. For clarity, only species ionized with a H3O⁺ counter ion are given a specific name and are represented in the spectra and the fragmentation pathway. Thus, identical molecules ionized with water molecules are only indicated in the analytic tables to complete the characterization of the MALDI-ToF-ToF spectra.
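The read-out principle described above, in which each fragmentation step corresponds to the loss of one monomer unit, can be sketched in a few lines of Python. This is a simplified illustration and not the authors' assignment procedure: the function name is hypothetical, the monomer masses and the peak list are placeholder values chosen for readability (the real values are those of the Supplementary Tables), and side pathways such as "-Fm" ions or partial monomer fragmentation are ignored.

```python
# HYPOTHETICAL monomer residue masses (u), placeholders for illustration only.
MONOMER_MASSES = {"M1": 400.0, "M2": 450.0}


def read_sequence(ladder_mz, monomer_masses, tolerance=0.5):
    """Assign each gap between consecutive fragment peaks to one monomer.

    ladder_mz: m/z values of singly charged ions that differ by the
    successive loss of one monomer unit (any order; sorted internally).
    Returns the monomers in the order in which they are lost.
    """
    peaks = sorted(ladder_mz, reverse=True)
    sequence = []
    for heavier, lighter in zip(peaks, peaks[1:]):
        loss = heavier - lighter
        matches = [
            name
            for name, mass in monomer_masses.items()
            if abs(loss - mass) <= tolerance
        ]
        if len(matches) != 1:
            raise ValueError(f"mass difference {loss:.3f} cannot be assigned")
        sequence.append(matches[0])
    return sequence


# Placeholder ladder mimicking one arm of an (M2)(M1)2-X-... chain.
ladder = [3000.0, 2550.0, 2150.0, 1750.0]
print(read_sequence(ladder, MONOMER_MASSES))
```

Because the monomers are identified from mass differences alone, the read-out does not depend on the synthetic history of the chain. An unassignable difference raises an error rather than being guessed, which mirrors the need to treat side pathways separately.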

Synthesis of
All operations were conducted under argon atmosphere. 1b (1.82 g, 5 mmol, 1.00 eq.) was dissolved in dry DCM (225 mL) and cooled to 0 °C. TFA (11.55 mL, 149.92 mmol, 32 eq.) was slowly added and the reaction mixture was stirred for 2.5 h until thin layer chromatography (TLC) showed the absence of starting material. The solvent and TFA were subsequently removed in vacuo at 20 °C by threefold dissolution of the residue in DCM and subsequent evaporation. The resulting residue was weighed in order to quantify the excess amount of residual TFA to determine the excess amount of Et3N necessary for the next synthetic step. (Yield: 100 %, 2.05 g). 1 N-(6-(1,3-Dioxo-1,3,3a,4

Synthesis of Methyl 2-((((9H-fluoren-9-yl)methoxy)carbonyl)amino)-6-((tert-butoxycarbonyl)amino)hexanoate 2a
All operations were conducted under argon atmosphere. Lysine-Fmoc-NBoc (5.8 g, 12.4 mmol, 1.00 eq.) and HOBt (2.5 g, 18.5 mmol, 1.50 eq.) were dissolved in dry MeCN (60 mL) and cooled to 0 °C. Dry MeOH (50 mL) and Et3N (2.6 mL, 20.5 mmol, 1.65 eq.) were added dropwise to the mixture. EDC·HCl (2.7 g, 14.1 mmol, 1.15 eq.) was dissolved in dry MeCN (40 mL) and dry MeOH (10 mL) and was subsequently added dropwise to the mixture. The reaction mixture was stirred for 3 h at 0 °C. Subsequently, the reaction mixture was allowed to warm to ambient temperature and was diluted with DCM (100 mL). The mixture was acidified with 0.25 N HCl (50 mL) and dispersed in saturated NaHCO3 (50 mL) in order to precipitate a white solid. After filtration, the filtrate was washed once with saturated NaHCO3 (50 mL) and the aqueous layer was extracted twice with DCM (50 mL). The combined organic layers were washed twice with brine (50 mL) and dried over MgSO4. After filtration, the solvent was removed in vacuo to provide the product as a yellow oil.

Synthesis of Methyl 2-amino-6-((tert-butoxycarbonyl)amino)hexanoate 2b
All operations were conducted under argon atmosphere. 2a (5.81 g, 12.0 mmol, 1.00 eq.) was dissolved in dry MeCN (100 mL) under inert atmosphere. Piperidine (10 mL, 8.62 g, 101 mmol, 8.40 eq.) was added dropwise to the solution and the mixture was stirred for 3 h at ambient temperature. After removal of the solvent under reduced pressure, the crude product was purified by flash chromatography after solid deposition from DCM employing different eluents to remove side-products (Hex:EA 4:1, EA, DCM). The target compound was collected employing a gradient of DCM:MeOH from 98:2 to 95:5 (stained via KMnO4). After evaporation of the solvent, the product was obtained as a yellow oil and was employed for the next reaction step without any further purification.
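The quantities quoted for a liquid reagent (volume, mass, amount and equivalents) are linked by its density and molar mass. The sketch below retraces that arithmetic for the piperidine used here. It is an illustration only: the function names are hypothetical, and the density (0.862 g mL⁻¹) and molar mass (85.15 g mol⁻¹) of piperidine are standard literature values that are not stated in the procedure itself.

```python
def reagent_mmol(volume_ml, density_g_per_ml, molar_mass_g_per_mol):
    """Amount of a liquid reagent (mmol) from its volume."""
    mass_g = volume_ml * density_g_per_ml
    return mass_g / molar_mass_g_per_mol * 1000.0


def equivalents(reagent_amount_mmol, limiting_amount_mmol):
    """Molar equivalents relative to the limiting reagent."""
    return reagent_amount_mmol / limiting_amount_mmol


# Literature values for piperidine (assumed, not from the procedure).
PIPERIDINE_DENSITY = 0.862      # g/mL
PIPERIDINE_MOLAR_MASS = 85.15   # g/mol

piperidine_mmol = reagent_mmol(10.0, PIPERIDINE_DENSITY, PIPERIDINE_MOLAR_MASS)
piperidine_eq = equivalents(piperidine_mmol, 12.0)  # 12.0 mmol of 2a

print(f"{piperidine_mmol:.0f} mmol, {piperidine_eq:.1f} eq.")
```

Within rounding, this reproduces the 8.62 g, 101 mmol and 8.40 eq. given in the procedure.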

Synthesis of Synthon 4 (Monomer M4)
The synthesis of 4 was conducted in a similar fashion as for 2, except for the first amide reaction where 6-hydroxyhexylamine was employed instead of methanol.

Synthesis of
The synthesis of 5a was performed similarly to 2a. All operations were conducted under argon atmosphere. Lysine-Fmoc-NBoc (2.30 g, 4.9 mmol, 1.05 eq.), 6-(((-adamantan-2-yl)methyl)amine (725 mg, 4.67 mmol, 1.00 eq.), HOAt (840 mg, 5.60 mmol, 1.10 eq.) and N,N-diisopropylethylamine (0.119 mL, 0.06 g, 0.47 mmol, 0.10 eq.) were dissolved in dry DMF (75 mL) and cooled to 0 °C. EDC·HCl (980 mg, 5.13 mmol, 1.10 eq.) was added and the reaction mixture was stirred for 2 h at 0 °C and then overnight at ambient temperature. The reaction mixture was diluted with EA (250 mL), washed twice with 1 N HCl (50 mL), twice with saturated NaHCO3 solution (50 mL) and once with brine (80 mL). The organic layer was dried over MgSO4 and the solvent was removed in vacuo to provide the target compound as a white solid. The compound was employed for the next reaction step without any further purification.

Synthesis of Synthon 6 (Monomer M6)
The synthesis of 6 was conducted under the same conditions as for 2, except that the first amide reaction was carried out with (4-fluorobenzyl)amine instead of methanol.

Synthesis of N-(6-((Adamantan-1-ylmethyl)amino)-5-(2,5-dioxo-2,5-dihydro-1H-pyrrol-1-yl)-6-oxohexyl)-4-((2-(dimethoxymethyl)-3-methylphenoxy)methyl)benzamide 5e
The reaction was performed in a similar fashion as for 1d with 5 (58.7 mg, 84.6 µmol, 1.00 eq.), TosOH (1.3 mg, 6.8 µmol, 0.08 eq.), TMOF (35 µL, 0.34 mmol, 4.00 eq.) and anhydrous MeOH (1.0 mL) to provide the product as a pink oil. Due to the complexity of the NMR spectrum, the assignment of the protons to the different chemical shifts is visualized in Supplementary Fig. 36, where the resonances of protons 1, 20 and 22 are expected as observed for the related compounds.

1 (1.03 g, 2.0 mmol, 2.00 eq.) and 1,6-hexanebismaleimide (345.0 mg, 1.25 mmol, 1.25 eq.) were evacuated and purged with inert atmosphere prior to dissolution in dry DCM (500 mL) and irradiation under flow conditions. The conversion was quantitative (as determined via NMR spectroscopy) with respect to monomer 1 (characterized by the disappearance of the aldehyde signal at low field). Due to side reactions of the 1,6-hexanebismaleimide, the reaction mixture was precipitated in Et2O and purified via normal-phase flash chromatography. The material was deposited on a dry short precolumn and separated over a SNAP Ultra column (25 g) from DCM with a MeOH gradient (25 mL min⁻¹

Synthesis of 7b (M1)2-X-(M1)2
7a (540.5 mg, 0.413 mmol, 1.00 eq.) was thermally treated under vacuum to induce the furan deprotection of the maleimide end-groups. The reaction was carried out at 115 °C under vacuum overnight, protected from light. Monomer 1 (265.7 mg, 0.867 mmol, 2.10 eq.) was added, the solids were dried and the reaction flask was purged with nitrogen prior to dissolution in dry DCM (165 mL). The reaction was performed under flow conditions and the conversion was determined via NMR spectroscopy. 7b was precipitated in Et2O and purified via preparative chromatography. A gradient was performed with eluent A (MQ water:THF 75:25, 20 mL min⁻¹) and eluent B (THF). From the initial isocratic condition (100 % A, 3 CV), the gradient was increased stepwise by 10 % B over 1 CV. After an isocratic step with 10 % B for 3 CV, the gradient was increased to 20 % B over 1 CV. 7b was collected for the next gradient
Supplementary Table 8.

Synthesis of 7d (M1)2
All operations were conducted under argon atmosphere. To a solution of the furan-deprotected and acetal-protected monomer 1d (47.1 mg, 95.3 µmol, 1.00 eq.) in anhydrous DCM (15 mL), monomer 1 (51.7 mg, 0.1 mmol, 1.05 eq.) dissolved in anhydrous DCM (25 mL) was added (5.0 mmol L⁻¹). The monomer solution was subsequently filled into sealed and previously degassed headspace vials (5 mL), purged for 5 minutes with argon and finally exposed to UV light (PL-L, 355 nm) for 45 minutes (batch conditions). After irradiation, the individual fractions were combined and the solvent was removed under reduced pressure. The purification was performed via reverse-phase flash chromatography (SNAP C18 12 g, 12 mL min⁻¹) with MQ water and a THF gradient of 1.0 % per CV. Isocratic phases of constant THF content were applied at 20, 25, 30 and 35 % (5 CV each) and at 40 % (35 CV). The product eluted as the THF content reached 40 % in the mobile phase. An acidic workup was not performed since the aldehyde functionality was regenerated after the purification. (Yield: 58.2 %, 53.6 mg).

Synthesis of 12b (M3)3
All operations were conducted under argon atmosphere. The reaction was conducted under batch conditions similarly to the synthesis of dimer 12a. The furan-deprotected and acetal-protected monomer 3d (36.0 mg, 65.6 µmol, 1.50 eq.) and dimer 12a (45.0 mg, 43.7 µmol, 1.00 eq.) were mixed and dissolved in anhydrous DCM (25 mL). The purification was carried out via reverse-phase flash chromatography (SNAP C18 12 g column, 12 mL min⁻¹) with water and a THF gradient of 1.0 % per CV. Isocratic phases of constant THF content were applied at 20 and 25 % (9 CV each), 30 % (15 CV), 35 % (24 CV) and 40 % (12 CV). The product eluted as the THF content reached 40 % in the mobile phase. The fractions were combined, the solvent was removed under reduced pressure and the residue was precipitated in Hex at ambient temperature. An acidic workup was not performed since the aldehyde functionality was regenerated after the purification. (Yield: 61 %, 28 mg). Supplementary Table 27.
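The length of such a stepped gradient can be estimated by adding the isocratic holds to the ramps between them. The sketch below does this for the program quoted for 12b. It is a rough illustration with a hypothetical function name, and it counts only from the start of the first hold at 20 % THF, because the starting composition of the gradient is not stated.

```python
def gradient_column_volumes(holds, ramp_percent_per_cv):
    """Total column volumes (CV) from the first hold to the end of the last.

    holds: list of (thf_percent, hold_cv) tuples in ascending THF content.
    ramp_percent_per_cv: slope of the linear ramps between holds (% per CV).
    """
    total = 0.0
    previous = None
    for percent, hold_cv in holds:
        if previous is not None:
            if percent < previous:
                raise ValueError("holds must be in ascending order")
            total += (percent - previous) / ramp_percent_per_cv
        total += hold_cv
        previous = percent
    return total


# Hold program quoted for the purification of 12b (THF %, CV).
HOLDS_12B = [(20, 9), (25, 9), (30, 15), (35, 24), (40, 12)]

total_cv = gradient_column_volumes(HOLDS_12B, ramp_percent_per_cv=1.0)
print(f"{total_cv:.0f} CV")
```

Converting this figure into a solvent volume would require the column volume of the cartridge, which is not given here.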

Synthesis of the Dimer Acetal 12c
All operations were conducted under argon atmosphere and protected from light. The synthesis of 12c was performed similarly to 1d. Dimer 12a (25.6 mg, 16.6 µmol, 1.00 eq.) was heated at 115 °C for 8 h under vacuum and was kept under vacuum at ambient temperature overnight, for two cycles, to quantitatively remove the furan protecting group. The product was employed without any further purification. Furan-deprotected dimer 12a (24.0 mg, 23.3 µmol, 1.00 eq.), TosOH (0.4 mg, 1.9 µmol, 0.08 eq.) and TMOF (10 µL, 93.3 µmol, 4.00 eq.) were suspended in anhydrous MeOH (1.0 mL). The reaction suspension was heated to 40 °C and stirred for 24 h. Afterwards, the crude reaction mixture was filtered over silica (anhydrous DCM/MeOH 4:1, 1.0 % Et3N) and the solvent was removed under reduced pressure to provide the target compound as an orange oil. 12c was immediately employed for the synthesis of 12.

Characterization of 12 (M3)5
The reaction was conducted under batch conditions similarly to the synthesis of dimer 12a. Under inert atmosphere, furan-deprotected and acetal-protected dimer 12c (24.0 mg, 23.3 µmol, 1.50 eq.) and trimer 12b (23.5 mg, 15.5 µmol, 1.00 eq.) were mixed and dissolved in anhydrous DCM (10 mL). Purification was carried out via reverse-phase flash chromatography (SNAP C18 12 g column, 12 mL min⁻¹) with MQ water and a THF gradient of 1.0 % per CV. Isocratic phases of constant THF content were applied at 25 % (9 CV), 30 % (15 CV), 35 % (21 CV) and 40 % (18 CV). The product eluted as the THF content reached 40 % in the mobile phase. The fractions were combined, the solvent was removed under reduced pressure and the residue was precipitated in Hex at ambient temperature. An acidic workup was not performed since the aldehyde functionality was regenerated after the purification. (Yield: 20 %, 7 mg). ESI-MS (C37H36FN3NaO7⁺): calculated 676.243, found 676.243.