Remediation of pharmaceuticals from wastewater via computationally selected molecularly imprinted polymers †

Pharmaceuticals are vital components of our daily life; however, as micropollutants, they also pose a significant wastewater treatment challenge. As the complete avoidance of pharmaceuticals is not a desired or viable solution, a targeted wastewater treatment must be implemented. Molecularly imprinted polymers can remove these contaminants from wastewater; however, determining their optimal constituents is a costly and lengthy experimental process. Here, we present a computational protocol used to design imprinted polymers for the targeted removal of fluoxetine, leveraging rigorous molecular models and simulation methods to study the crucial complexation step during the pre-polymerisation mixtures. Our molecular dynamics results validated with available experimental measurements correlate calculated radial distribution functions and predicted hydrogen bonding with experimental imprinting factors for various functional monomers. The simulated results are also analysed via Kirkwood – Buff integrals for the study of the role of functional monomers in the whole pre-polymerisation mixture. Marked dependencies of the functional monomer's carbonyl, hydroxyl and esters interactions are described, and functional monomer selection criteria using hydrogen bonding time and KBI initial slope and limiting value are proposed. This analysis offers further insights into why itaconic acid, amongst methacrylic acid, 2 (hydroxyethyl) methacrylate, acrylamide and acrylonitrile, is the optimal monomer for imprinting fluoxetine when ethylene glycol dimethacrylate is used as crosslinker and dimethylsulfoxide as solvent. Our computational protocol compiles off-the-shelf open-source software with well-established simulation methodologies to offer a viable alternative to the resource and time-consuming experimental task of choosing the best functional monomer for a given target molecule.


Introduction
4][5] One such drug, investigated here, is the anti-depressant fluoxetine (FLX), more commonly known as Prozac.FLX is a selective serotonin reuptake inhibitor approved for use by the FDA since 1987, 6 that by 2002 reached $22 billion in sales.FLX modern-day prescriptions are still very high, with over 25 million in the USA during 2018. 7The NHS monthly prescription cost analysis for England reports over 630 K prescription of FLX on March 2021. 8With such high quantities deployed, it is not surprising that FLX is found in surface waters, with typical concentrations in the microgram per litre level. 9,10Its bioaccumulation is also a problem, with trace levels moving through the food chain, eventually ingested by humans. 11,12he complete avoidance of pharmaceuticals to protect water systems is not an attainable solution to the problem.While traditional wastewater treatment plants offer some reduction of these harmful micropollutants, their complete removal is currently not achievable via conventional methods. 13Therefore, our long-term goal is to offer a technology specifically designed to capture drugs at point sources e.g., at hospital wastewater systems, 14 while our current focus is the fundamental understanding of this systems at the molecular level.With the foresee understanding, we envisage such technology to be based on molecularly imprinted polymers (MIPs) as core specific adsorbent layer in ultrafiltration membrane units for small organic molecules, such as FLX.A comprehensive review of MIP suitability in wastewater treatment can be found elsewhere. 15IPs are a class of polymer that possess cavities with high affinities towards a chosen template molecule (TM) and have been shown to have a wide variety of applications, as biosensors, 16 in the detection of small drugs molecules, [17][18][19][20][21] antibiotics, 22 and pesticides. 23,24Detection of small molecules lends itself well to wastewater treatment, 15 and environmental analysis-based applications. 25,26Targeted design of adsorbents with optimal surface functional monomers (FM) is achievable if sound comprehension of the complex molecular interactions is gained.The idealised imprinting method is outlined in Fig. 1  What is referred to as the pre-polymerisation mixture contains all MIP's components, namely TM, FM, CL and solvent molecules.How the entire system interacts before polymerisation has been one of the most important factors to consider for the generation of specific cavity sites. 27However, with so many possible compositions, the optimum system's determination is both a time-consuming and expensive experimental process.Computational modelling can leverage our understanding of molecular interactions in the prepolymerisation mixtures efficiently, leading to narrow the choices for the MIP's constituents alongside their optimal composition.
A wide variety of computational techniques are applicable for MIPs, a comprehensive appraisal of computational technics in the field has been recently compiled by Nicholls et al., 28 and specifically for the detection of antibiotics in environmental and food samples by us and coworkers. 29One prevalent approach is functional monomer selection through screening via molecular mechanics, where the TM and FM interactions in vacuum are characterised by binding energies scores. 30Quantum mechanics methods, such as density functional theory (DFT), have become more widespread, as they offer a transferable approach to determine the TM and FM energy interactions at the cost of higher resource utilisation compared to classical methods. 31,32Mostly, DFT is limited to only one TM-FM interactions since larger systems are still computationally expensive.Classical methods are less transferable but much cheaper to use allowing for the implementation of advanced sampling techniques like molecular dynamics (MD) on larger systems. 33,34MD allows for the full pre-polymerisation mixture to be simulated.Similarly to MD, Monte Carlo (MC) methods also allow for these systems to be simulated using random sampling. 35The choice of MC over MD depends on the MIP's properties of interest. 28he tendency to use computational approaches to aid the design of MIPs has increased since 2001, however it is still marginal when compared to the total number of publications in the field, as shown in Fig. 2. It is our ambition that the compiled protocol presented here will assist the computational tasks, boosting this trend, as currently less than 5% of the experimental MIP synthesis are supported by insights gained from simulations. 28,29

Setup molecular models/simulation cells
The optimised potentials for liquid simulationsall atom (OPLS-AA) 41 force field was implemented for all MIPs components.The pre-polymerisation mixture is synthesised in the liquid phase; therefore, this force field offers a suitable starting point.The topology files for all molecules were obtained from the ATB repository, 42 the specific MolID are given in the ESI.† Here, the protocol was initiated with MAA as it is the most commonly used FM for MIP synthesis.The prepolymerisation mixture simulations cells were setup following the experimental ratio of 1 : 2 : 50 : 583 of FLX : MAA : EGDMA : DMSO. 40As FLX offers two main hydrogen bonding sites (see Fig. 4), two FM per TM were chosen as an ideal start.
To simulate the pre-polymerisation mixture an initial simulation cubic cell was populated with 10 molecules of FLX, 20 FM, 500 CL and 5830 solvent molecules.All components were randomly inserted in the simulation cell.This cell undergoes a sequence of energy minimisation, NVT and NpT ensemble simulations.The simulation sequence details, i.e., order, ensemble, time steps, cut-off, thermostat and barostat, can be found in Fig. S3 and Table S3 in the ESI.†

MD simulations
All simulations were conducted using open-source software GROMACS, 43 version 2019.Force field validations are described in detail in the ESI, † including liquid densities of pure components, water surface tensions and water/octanol partition coefficients (K OW ) calculations for FLX.The effect of different FM was investigated while all other components and ratios remained the same.
A g(r) is defined as the ratio between the average number density at any given distance, r, from any atom and the density at the same distance, r, from an atom in an ideal gas, with the same overall density. 44By definition g(r) = 1 for an ideal gas, for all r.Any deviation from this value is due to intermolecular interactions. 44he Kirkwood-Buff solution theory relates molecular interactions to macroscopic properties. 45,46This theory describes structural thermodynamics over the complete range of compositions for solvents using RDFs.The KBI (eqn (1)) can be related to many physical properties, including the interaction/binding energies of atoms.KBI represents the volume per number of atoms and allows a quantitative comparison between the RDFs for the various pair interactions.
Eqn (1) shows the relation between the KBI and RDF where r is the distance between the atoms i and j in an open system, which can be approximated and applied to closed systems with R being the cut-off distance.
The specific atom pairs used here are those that are likely to give rise to HB, thus important for the FLX-FM complex formation in the pre-polymerisation step.The MIP's component chemical structures, the chosen atom pairs, and their labels are defined in Fig. 4 and 5.It is worth noticing that the calculations involving HB complexes with the FLX fluorine atoms (PF1-3, in Fig. 4) as the (trifluoromethyl) benzene HB has been proved to be too weak as measured  The GROMACS subroutine hbond 43 was used to calculate the HBs formed between the two specific atom pairs, namely the TM and FM atom pairs.HBs are determined by a specific cut-off distance and angle between donor hydrogen and acceptor, the common cut-off distance of 0.35 nm and angle of 30°, for energetically significant HBs were used here.HBs were determined at each frame of the simulation 48 giving a percentage time spent by the FLX and FM as a complex (i.e., hydrogen bonding), allowing a comparisons between the systems.

From unitary cell to production cell
We used seven different supercell sizes to determine an appropriate simulation box size (keeping the same ratios and density of elements).Out of all the simulations, Fig. 6a and b illustrates the RDFs between TM-CL (PH-EO1) and FM-CL (MH-EO1) in the ten-TM box and the 100-TM box (maximum size), both systems' peak positions and intensities are similar (disregarding the expected noise in the ten-TM case).This indicates that the smaller system is still representative of the physics generated in the larger one but at lower computing time/power for the simulations (see Table 1).
The smaller system shows more noise than that of the larger one: this is expected with only 10% the number of molecules.To improve the statistics of the ten-TM system, 18 duplicates were ran with the same parameters however each with a newly randomly generated starting configuration.The number of duplicates used was determined through time tests produced for the large system (Tables 1 and 2).
The wall time taken for a 10 ns simulation of the ten-TM cell using one node was 18 h.Based on this, running 18 duplicates of the ten-TM system is equivalent to the time/ computing power of a single 100-TM box ran for the same 10 ns on 12 computer nodes (26 hthe most efficient for the 100-TM system).This then offers a larger overall simulation time and number of molecules/configurations analysed.
The pair interactions between FLX-FM are the result of considerably fewer molecules available for sampling during the simulation (compared to that of CL), therefore the variation between duplicates is more predominant.We used  simulated 49 to solve this problem in a timely manner, the results for which can be seen in Fig. 7.The annealing temperature range was from 298-1000 K in 500 ps, this temperature maintained for 18 ns, then reduced back to 298 K at the same rate.
On Fig. 7a, without any special treatment, the two average RDFs happen to be much further away from each other than with simulated annealing, shown on Fig. 7b.The high temperature annealing step actually increases the likelihood for all molecules to interact with each other, therefore reducing variability in each RDF.
In passing, using duplicates (or here, duplicates of duplicates) improves configuration sampling: without them, we would have had no way to tell whether any of those RDFs were converged as smoothness is never a good convergence indicator for RDFs; only such repeats are.This suggests that using multiple smaller systems is better than using a single larger one.It is worth noticing that a FM-TM complex identified at higher annealing temperatures will persist for longer at the lower experimental synthesis temperature.
Considering all of the above, the annealing step was included in our protocol for all subsequent simulations, as presented in Fig. 3.

The effect of different FM
All FMs, their atom pair complex with the highest HB time and corresponding KBI limiting value are compared against the experimental imprinting factors (EIF) 40 in Table 3.All HB time and corresponding KBI limiting values can be found in the ESI † (Table S4 and Fig. S9).
The EIFs show the efficiency of the imprinting procedure, 40 determined by the ratio between the rebinding capacity of the synthesised MIP and a NIP (non-imprinted polymer) made at the same conditions without the TM.Hence a NIP contains no specific cavity sites for the chosen TM; therefore it measures only generic surface binding, if any.
By comparing the HB times and EIFs, it can be seen that ACN offers the worse EIF at 1.1 and, significantly, the shortest HB time.MAA follows as the second-lowest EIF at 2.2, with the second short HB time.Both HEMA and ACA have similar HB times along with equivalent EIFs.Lastly, IA has the best EIF while producing the longest HB time as each carbonyl group contributes synergistically to the formation of HBs.These results indicate that the FLX complex with the longest HB time will convey the more favourable selfassembly pre-polymerisation system.Moreover, all systems' preferred atom pair for HB time consisted of FLX as the hydrogen donor and varying acceptors depending on the FM (see Table S4 †).The carbonyl, as acceptor group, in IA, ACA and MAA can be ranked considering the number of carbonyl groups and substituents size attached to the carbonyl (see Fig. 4 and 5) as IA > ACA > MAA, which correlates with their  EIFs.It is worth noticing that in HEMA, the preferred acceptor was the ester oxygen, even with a carbonyl group available.Finally, ACN only offers one acceptor group, a primary amine that ranked last in respect to EIF.While the predicted HB time is a fingerprint of the quality of the TM-FM complex, the KBI analysis conveys information about the whole pre-polymerisation mixture, i.e. not only the TM-FM complex interactions.To facilitate the analysis, the full KBI curves as a function of r, the distance between the relevant atom pairs (from 0 to the maximum length of the simulation box), are plotted in Fig. 8a.The KBI value reported in Table 3 corresponds to the limiting value at r max .
For an open system, the KBI can be split into two main regions; 46 the first is the excluded volume, V ex , which is defined as the region from the centre of the atoms in which the g(r) ≅ 0, hence KBI is always negative, for a spherical particle with hard-core diameter σ, V ex = 4πσ 3 /3. 46The second region, the remaining of the KBI excluding V ex , fluctuates between negative or positive values depending on the strength of the molecular interactions.The values of the KBIs for pure Lennard-Jones (LJ) particles 46 can be used as references for the analysis of Fig. 8a.In general, the KBI limiting value for a LJ fluid with gas-like density and low interactions will be positive, while highly dense interactive fluids will be negative.The liquid densities of our studied MIP systems are similar; as such, the positive KBI limiting value for ACA implies the lowest level of interactions away from the complexation region.An ideal FM will have a high HB time, while its KBI limiting value is negative, as time will be expending exploring the complexation region rather than interacting with other molecules in the pre-polymerisation mixture, increasing the probability of an efficient TM-FM complex, this two factors once again points to IA as the best FM for this system.
Finally, focusing our attention into the KBI complexation region from 0 to 1 nm (Fig. 8b), where the attractive forces should overcome the repulsion of the atoms core in a good complex, we can notice that the KBIs' slopes, calculated between 0 to 0.5 nm are ranked as IA (a+b) = −1.36;IA a = −0.69;IA b = −0.67;ACN = −0.65;MAA = −0.64;HEMA = −0.62 and ACA = −0.60.This further characteristic is added to our FMs' computational selection criteria.

Conclusions
Fluoxetine molecularly imprinted polymers prepolymerisation mixtures with variations on functional monomer were simulated using molecular dynamics to analyse self-assembly characteristics of the template molecule/functional monomer complex.The specific functional monomers used were methacrylic acid, itaconic acid, 2 (hydroxyethyl)methacrylate, acrylamide and acrylonitrile.Analysis on the simulated systems was completed through RDFs and hydrogen bonding time, as well as comparing KBI initial slopes and limiting values with the performance of the experimental systems.The computational analysis is validated by the previous experimental work in the same system, making itaconic acid the best functional monomer for fluoxetine removal from wastewater via molecularly imprinted polymers.Moreover, these results show that this computational method offers a general approach for the rational design of MIPs prior to experimental production, specifically with the difficult choice of functional monomer.
with a) FM, b) crosslinker (CL) and c) TM, these compounds are homogenised in a given porogen solvent portrayed as a yellow background.Fig. 1 also shows the different stages of the MIP synthesis, 1) the TM and FM complexation step in the pre-polymerisation mixture, 2) the polymerisation of the CL molecules around the stabilised complex, 3) the extraction of the TM and solvent, and 4) the rebinding of the TM into the pristine imprinted cavity.All these steps are driven by intermolecular interactions, which are promoted or inhibited by the selection of the solvent, that also aid in the production of pores for the transport of the TM.

Fig. 1 A
Fig. 1 A general imprinting process for the production of MIPs.See text for details.

Fig. 4
Fig. 4 2D schematic showing the chemical structure of all MIP's components, used here, including the atom (red) and names (in brackets) used for RDF and KBI analysis.

Fig. 5
Fig. 5 2D schematic showing the chemical structure for all remaining FM studied here.

Fig. 6
Fig. 6 RDFs for the ten-TM cell (blue) and the larger 100-TM cell (red), atom pairs between both a) FLX-CL and b) FM-CL.

Fig. 7
Fig. 7 RDFs between FLX and FM, comparing the impact of a) no annealing and b) annealing simulation run.With the first average 18 individual runs (blue) and the second (red), individual runs 1 to 18 for are shown in blue and red respectively using thinner lines.

Fig. 8 a
Fig. 8 a) The full KBI curves for the FM atom pairs in Table 3. b) Zoom into the KBI complexation region from 0 to 1 nm.

Table 1
Time vs. system size tests (ran using 4 nodes × 16 CPUs)

Table 2
Time vs. number of nodes (16 CPUs per node) for the 100-TM large simulation cell

Table 3
40, KBI and EIF for all FMs studied.The specific atom pairs used for the RDF and KBI analysis are given in the second column.Both carboxylic groups within a molecule of IA atom pair contributions to HB (and KBI) were added upFMAtom pair HB time (%) KBI limiting value (nm 3 ) EIF40