Multiscale modeling of nerve agent hydrolysis mechanisms: a tale of two Nobel Prizes

The 2013 Nobel Prize in Chemistry was awarded for the development of multiscale models for complex chemical systems, whereas the 2013 Peace Prize was given to the Organisation for the Prohibition of Chemical Weapons for their efforts to eliminate chemical warfare agents. This review relates the two by introducing the field of multiscale modeling and highlighting its application to the study of the biological mechanisms by which selected chemical weapon agents exert their effects at an atomic level.


Introduction
The first electronic computers were developed in the 1940s and 1950s. Since then they have changed dramatically with an exponential increase in computational power coupled to an exponential decrease in size. These advances have led to them becoming ubiquitous in everyday life, including many areas of scientific research.
One use of computers that is nowadays an essential part of the scientific endeavor is modeling and simulation. This involves the creation of a software model of the system being investigated and its manipulation using explicitly defined sets of rules. These rules can be arbitrary but are normally designed to mimic how the system would behave in real life, and so are based on biological, chemical or physical laws.
The importance of computation in the physical and chemical sciences has been recognized by the award of two Nobel Prizes in Chemistry. The first, in 1998, was given to Walter Kohn and John Pople for a mixture of theoretical and computational work, the former for his development of density functional theory, and the latter for his development of computational methods in quantum chemistry [1]. By contrast, the 2013 Chemistry Prize was given to three recipients, Martin Karplus, Michael Levitt and Arieh Warshel, all of whom are primarily computationalists. They were cited for their development of multiscale models for complex chemical systems in the late 1960s and 1970s [2].

Scientific models
Scientific modeling is a delicate balancing act. A computational or theoretical model of a physical system is a representation that describes some aspect of the systemʼs behavior. Important properties that an 'ideal' model should aspire to include:

Reliability. A model should be able to reproduce or predict the properties of a system, under a given set of conditions, to a specified precision. Collaboration with experimentalists who can verify the modelʼs predictions is normally crucial in determining how reliable a model is, and how useful it is likely to be.

Robustness. A variety of parameters occur in the formulation of most models. The best models will be insensitive to the exact values of these parameters and will not require specific parameter sets to make them work. Of course, the quantitative results obtained from the models may change if the parameter sets change, but their qualitative behavior should remain consistent.

Simplicity. A model should be comprehensible, and not so complicated that dealing with and understanding it overshadows the study of the system actually being modeled.

Tractability. It should be possible to obtain significant results from the model without an inordinate amount of effort. The most tractable models can be solved analytically (i.e. by using pencil and paper) but most will need numerical handling or simulation on a computer. We note that it is actually surprisingly easy to develop intractable models that cannot be treated with even the most powerful available computer resources.
Many, and probably most, scientific models fall short of the ideal. Nevertheless, the above points serve as useful guidelines in their conception and subsequent refinement.

Length and time scales
Consider the following randomly-picked natural phenomena: the climate within a given region; the spread of a disease within a population or within an organism; the establishment and growth of a honey bee colony. A momentʼs thought shows the complexity of the processes that contribute to these phenomena. For example, the progress of an epidemic will depend, amongst other factors, upon how the disease enters the population and is transmitted between individuals, and the mechanisms by which it infects an individual and exerts its deleterious effects.
The same complexity is also apparent in many man-made systems. Take as a relatively simple case a battery. Its functioning depends upon the nature of its principal components (electrodes and electrolyte), the type and efficiency of the chemical reactions that occur at the electrode/electrolyte interface when charging and discharging, and the unwanted side-processes that can lead to degradations in its performance.
A common feature of these and other complex processes is that their understanding requires an examination of mechanisms that occur on different length and time scales. Thus, the molecular mechanisms responsible for battery activity and (human) disease manifest themselves at the microscopic scale (distances <1 μm, times <1 μs), whereas their effects are observable on the macroscopic scale (distances ∼1 m, times ∼1 d).

Classical and quantum mechanics
The theoretical description and modeling of complex processes is very challenging because there is no single theoretical or modeling approach that is suitable for examining the full range of length and time scales. If subatomic processes are excluded, then quantum mechanics (QM) is the appropriate physical theory to employ for atomic systems. However, QM becomes increasingly costly and difficult to apply as the number of bodies or particles in a system increases, and so it is normal to switch to classical mechanical descriptions at larger scales. Figure 1 illustrates the length and time scales at which different theoretical and modeling approaches are commonly used. We emphasize that the regions shown in the figure are approximate for two reasons. First, it is always possible to do challenging, non-standard 'benchmark' calculations on systems larger than would normally be handled with a particular algorithm, and, second, methodological advances and, in particular, increases in computer power mean that the methods' domains of applicability are constantly being extended to larger scales.
At the smallest scales, QM methods treat atoms and molecules as collections of nuclei and electrons, and can provide direct information about chemical bonding and reactivity. The most accurate ab initio QM methods are applicable to systems containing a few tens to, at most, a few hundred atoms (≲10²), but these must be replaced by more approximate, semi-empirical QM techniques for larger systems, normally those containing ≳10²–10³ atoms or so. QM methods give way to classical mechanical techniques when a detailed picture of the electrons and nuclei in a system is not required, or for systems ≳10³ atoms in size. At these scales, atoms are usually represented individually as point particles that obey the classical, Newtonian laws of motion, and with interactions that are treated by empirical functions. The most common of these are the force field or molecular mechanical (MM) functions that typically use a simple ball-and-spring model for the strong covalent bonding interactions between atoms, supplemented with additional terms to account for various non-bonding contributions, such as electrostatic and van der Waals forces. It is interesting to note that force fields were first developed by spectroscopists for the interpretation of infrared and Raman spectra before being adapted for molecular simulation [3,4].
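The ball-and-spring picture can be made concrete with a short sketch. The Python functions below evaluate illustrative MM-style energy terms, a harmonic bond plus Lennard-Jones and Coulomb non-bonding contributions; all parameter values are invented for illustration and are not taken from any real force field.

```python
# Toy molecular mechanical (MM) energy terms of the kind described above.
# All parameters are illustrative, not drawn from a real force field.

def bond_energy(r, r0, k):
    """Harmonic 'ball-and-spring' term for a covalent bond: E = k (r - r0)^2."""
    return k * (r - r0) ** 2

def lennard_jones(r, epsilon, sigma):
    """Non-bonding van der Waals term (Lennard-Jones 12-6 potential)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def coulomb(r, qi, qj, ke=138.935):
    """Non-bonding electrostatic term between point charges qi and qj;
    ke is Coulomb's constant in MD-style units (kJ mol^-1 nm e^-2)."""
    return ke * qi * qj / r

# A bond at its equilibrium length contributes zero energy; the
# Lennard-Jones term is repulsive at separations below sigma.
print(bond_energy(0.10, 0.10, 250000.0))    # 0.0 at equilibrium length
print(lennard_jones(0.30, 0.65, 0.32) > 0)  # True: repulsive below sigma
```

A full MM force field sums many such terms (plus angle and torsion contributions) over all bonded and non-bonded atom pairs, which is what makes the cost of classical simulations scale with system size rather than with electron count.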
Atomistic classical mechanical simulations of systems containing millions of atoms are now feasible [5]. Often, though, such a level of detail is unnecessary and a more coarse-grained representation of a system can be employed. This is conveniently done by grouping atoms together into larger, composite particles that interact via specially-adapted coarse-grained force fields, and can be simulated using standard classical mechanical algorithms. Coarse-graining can range from the slight to the substantial. Examples include the merging of molecular CH n groups into single particles, or the replacement of the amino acids in a protein or the bases in a nucleic acid by one-or two-particle representations.
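As a minimal illustration of the grouping idea, the sketch below merges atoms into composite beads placed at each group's center of mass, using the CHn example mentioned above; the masses and coordinates are illustrative.

```python
# A minimal sketch of coarse-graining: atoms are merged into composite
# 'bead' particles placed at the center of mass of each group.

def coarse_grain(atoms, groups):
    """atoms: dict name -> (mass, (x, y, z)); groups: list of lists of atom
    names. Returns one bead (total mass, center of mass) per group."""
    beads = []
    for group in groups:
        total_mass = sum(atoms[name][0] for name in group)
        com = tuple(
            sum(atoms[name][0] * atoms[name][1][i] for name in group) / total_mass
            for i in range(3)
        )
        beads.append((total_mass, com))
    return beads

# Merge a CH2 group (one carbon, two hydrogens) into a single particle.
atoms = {
    "C":  (12.011, (0.0, 0.0, 0.0)),
    "H1": (1.008,  (0.1, 0.0, 0.0)),
    "H2": (1.008,  (-0.1, 0.0, 0.0)),
}
beads = coarse_grain(atoms, [["C", "H1", "H2"]])
print(beads)  # one bead of mass ~14.03 at the origin
```

The resulting beads can then be propagated with the same classical dynamics algorithms as the atoms they replace, only with a coarse-grained force field describing the bead-bead interactions.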
Coarse-grained methods, because they provide less detail, enable simulations of larger systems than atomistic approaches. Eventually, though, even these techniques reach their computational limits and alternative simplification strategies must be found. One way that is employed in cellular automata and lattice models is to restrict particle movement so that it only occurs in certain well-defined directions, rather than arbitrarily in space. Another is to use continuum models that dispense with a particle description altogether and represent the components of a system and their properties by continuous functions. Techniques of the latter type are common in various branches of engineering (e.g. aeronautic, automotive and structural) and so are appropriate for systems of almost arbitrary size.
In the above discussion, we have implicitly assumed that going to a model with less detail permits larger systems or longer times to be simulated. This is, in general, true. For example, the optimum particle simulation algorithms have a computational cost that scales linearly with the number of particles. Thus, a reduction in the number of particles through coarse-graining allows either longer simulations of the original system, or the treatment of larger models. A second consequence of removing detail is that in doing so one normally also suppresses the highest frequency motions that the system undergoes. This is important because for most dynamical simulation algorithms, both quantum and classical, the length of time that can be simulated is limited by the highest frequencies. As a result, simulations of a coarse-grained system can be of longer duration than those of an atomistic system of equivalent size due to the increased efficiency of the dynamical algorithms.
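The effect of suppressing the fastest motions can be illustrated with some back-of-the-envelope arithmetic. The sketch below assumes, purely for illustration, that a stable integration time step is about one tenth of the fastest oscillation period in the model.

```python
# Why removing the fastest motions lengthens accessible simulation times:
# the stable time step of a typical integrator must resolve the
# highest-frequency motion, so dt is roughly proportional to that
# motion's period. All numbers are illustrative.

def steps_needed(total_time_ps, fastest_period_fs, steps_per_period=10):
    """Integration steps required to cover total_time_ps when the time
    step is a small fraction of the fastest oscillation period."""
    dt_fs = fastest_period_fs / steps_per_period
    return int(total_time_ps * 1000.0 / dt_fs)

# Atomistic model: an X-H bond stretch has a period of roughly 10 fs.
atomistic = steps_needed(total_time_ps=1000.0, fastest_period_fs=10.0)
# Coarse-grained model: the fastest remaining motion is ~10x slower.
coarse = steps_needed(total_time_ps=1000.0, fastest_period_fs=100.0)
print(atomistic, coarse)  # 1000000 100000: 10x fewer steps when coarse-grained
```

The tenfold reduction in step count compounds with the smaller particle count of the coarse-grained model, which is why such models reach time and length scales far beyond atomistic simulation.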

Multiscale algorithms
The atomistic simulation of molecular systems using both MM and QM, or, more precisely, quantum chemical (QC), techniques started to develop in the 1950s and 1960s. An especially challenging problem at the time (indeed it remains so!) was the application of these methods to understanding the behavior of proteins, and it is here that the recipients of the 2013 Chemistry Nobel Prize made their mark [6]. Thus, Warshel and Levitt, in collaboration with their co-worker Shneior Lifson, were the first to adapt atomistic MM methods to the study of proteins [7,8], whereas Karplus and coworkers were the first to use these approaches to simulate protein dynamics [9]. In addition, Levitt and Warshel devised a coarse-grained MM model to investigate how a protein folds [10].
Perhaps their most significant contribution, however, was the development of a hybrid multiscale simulation approach that combined the distinct atomistic QM, MM and continuum methods shown in figure 1 into a single algorithm. This work was started by Warshel and Karplus [11], but achieved its realization in Warshel and Levittʼs landmark study of the reaction in the enzyme lysozyme [12]. Nowadays there is a large variety of multiscale algorithms in the spirit of the one conceived by Warshel and Levitt, and they are used in many different areas of scientific computing-not just for the simulation of enzyme reactions. Nevertheless, in what follows, we shall focus on algorithms of the original Warshel-Levitt type that combine atomistic QC and MM methods.
Modern QC/MM approaches [13] resemble closely those developed in the pioneering papers of Singh and Kollman [14], and of Field, Bash and Karplus [15]. They work by dividing the system to be simulated into a number of regions, each of which is treated at a different level of theory. An example of the most common type of partitioning is shown in figure 2. In it, the atoms in the central region, for which the most detail is required, are treated with a QC method, whereas the atoms that surround it are handled with a simpler MM approximation. Finally, outside of this, there is a third, boundary region in which the atoms are not represented explicitly. This region is sometimes omitted but, when present, can be treated with a number of methods, including continuum approaches designed to mimic solvent.
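The region-based energy bookkeeping of a QC/MM calculation can be sketched schematically. The code below is not a real QC or MM calculation; the placeholder energy functions simply illustrate how an additive QC/MM scheme combines a QC energy for the inner region, an MM energy for the outer region, and a coupling term for their interaction.

```python
# Schematic (non-physical) sketch of additive QC/MM energy bookkeeping.
# Each region is evaluated at its own level of theory, and a coupling
# term handles the interactions between regions.

def qcmm_energy(qc_atoms, mm_atoms, e_qc, e_mm, e_coupling):
    """Total energy = QC energy of the inner region + MM energy of the
    outer region + the QC/MM interaction (covalent link terms,
    electrostatics, van der Waals) between them."""
    return e_qc(qc_atoms) + e_mm(mm_atoms) + e_coupling(qc_atoms, mm_atoms)

# Placeholder energy functions for illustration only.
e_qc = lambda atoms: -10.0 * len(atoms)          # reactive (QC) region
e_mm = lambda atoms: -0.1 * len(atoms)           # surrounding (MM) region
e_coupling = lambda qc, mm: -0.01 * len(qc) * len(mm)

total = qcmm_energy(["Ser203-OH", "substrate"], ["solvent"] * 1000,
                    e_qc, e_mm, e_coupling)
print(total)
```

The boundary region of figure 2, when present, would contribute a further term, for example from a continuum solvent model acting on both inner regions.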
There are some crucial aspects to consider when formulating and employing QC/MM, and also other multiscale, methods. One is to ensure that the interactions between the different regions are accurately represented. In the QC/MM case, these will be a mixture of covalent and non-bonding interactions. Another is to verify that a multiscale partitioning is appropriate for the problem being studied. Thus, the most common QC/MM algorithms work well for problems in which the atoms requiring high accuracy (the QC region) are relatively few and well localized, as is often so for enzyme reactions. Examples of problems that are less well adapted to standard QC/MM techniques include solution reactions in which diffusable solvent participates in the reaction, thereby making it difficult to identify atoms that should be assigned to the different regions, and long-range electron transfer or similar QM phenomena that need QC regions that are too large. Finally, we note that it remains essential to test different partitioning schemes to see how the computational results are affected, even for those cases for which QC/MM simulation is deemed appropriate.

Chemical weapons
Chemical weapons (CWs) are those that rely on the toxic properties of chemicals to inflict harm. CWs have a long history but their first use on an industrial scale was during the First World War (WWI) with the deployment of a variety of lethal and non-lethal compounds, most notably chlorine, phosgene and mustard gas. The chemical structures of these are shown in figure 3. Overall, CWs caused approximately 90,000 deaths during the war, the majority of them (∼60%) Russians on the Eastern Front, and there were over ten times that number of non-fatal casualties, many of whom were permanently disabled or debilitated.
After WWI, well-documented cases of CW attack are sporadic until the use of poison gas by Italy in the Second Italo-Ethiopian War in 1935, and by Japan in the Second Sino-Japanese War between 1937 and 1945. In the Second World War (WWII) itself, CWs were not employed on the battlefield, but they were used extensively by the Nazis in the gas chambers of concentration camps. For example, it is estimated that over one million people were exterminated by the pesticide Zyklon B, whose active ingredient was hydrogen cyanide (figure 3).
The Cold War period after WWII saw intense activity in the development, production and stockpiling of CWs, especially by the Soviet Union and the USA. Much of this concerned lethal CWs, particularly the nerve agents, the first of which, sarin, soman and tabun, were actually discovered in the 1930s and 1940s in Germany and weaponized, but not used, by the Nazis. The chemical structures of these molecules are shown in figure 4. There was also significant research into 'non-lethal' CWs resulting, for example, in the development of the riot-control agent, CS gas, and the rainbow herbicides, including Agent Orange, that were employed by the USA during the Vietnam War (figure 3).
Although a direct armed confrontation between the Cold War superpowers was avoided, lethal CW attacks have occurred since WWII. Prominent examples are those by Iraq against Iran and its own Kurdish population in the 1980s (mustard gas and various nerve agents), that by the Japanese terrorist group Aum Shinrikyo on the Tokyo subway in 1995 (sarin), and those in Syria in 2013 (chlorine and sarin).
Due to the horror of the effects of CWs [16], there have been periodic attempts at their prohibition. The first such ban, agreed upon at the Hague Convention in 1899, actually predated the systematic use of CWs, whereas the experience of WWI led to the Geneva Protocol that was signed in 1925. Since then, the most significant accord has been the Chemical Weapons Convention (CWC) that forbids the production and use of CWs. It was signed in 1993 and entered into force in 1997 after it had been ratified by 65 state parties. The states that subscribe to the convention belong to the Organisation for the Prohibition of Chemical Weapons (OPCW) that ensures adherence to the CWC [17]. Currently, 190 countries are members, including most recently Syria. Israel and Myanmar have signed, but not ratified, the CWC, whereas Angola, Egypt, North Korea, and South Sudan have neither signed nor ratified.
The 2013 Nobel Peace Prize was awarded to the OPCW in recognition of its 'extensive efforts to eliminate chemical weapons'. While these have been extremely effective, and much of the worldʼs declared CW stockpile has been destroyed, significant quantities of these weapons remain, most notably with the leading powers of Russia and the USA. In addition, the relative ease of CW production, coupled with the periodic rise of rogue regimes or terrorist groups, suggests that humanity may never be totally free from CW attack. Clearly, therefore, continuing political and scientific efforts to render CWs ineffective are needed.

Nerve agent modeling
The nerve agents are among the most deadly CWs. They are organophosphorus (OP) compounds, which also form the basis of many herbicides, insecticides and pesticides. The two principal classes of nerve agent are the G-series (G for Germany), which includes sarin, soman and tabun, and the V-series (V, most probably, for venomous) of which the most well-known is VX (figure 4).
Nerve agents and other OP compounds exert their toxic effect by inhibiting the enzyme acetylcholinesterase (AChE) that occurs in the cholinergic nervous system and in neuromuscular junctions. A schematic of AChE and other enzymes that interact with OP compounds is shown in figure 5. AChE serves to hydrolyze the signaling compound, acetylcholine, that is released from a nerve during nerve signal transmission. Inhibition of AChE leads to an increase in acetylcholine concentration in synaptic clefts and neuromuscular junctions, resulting ultimately in the disruption of neurotransmission (figure 6). Respiratory failure and death may follow, although sub-lethal doses can also have very serious, long-term health effects.
Strategies for combatting nerve agents rely, in part, on an understanding of how nerve agents act at the molecular level. In isolation, multiscale and, in particular, QC/MM modeling cannot give a complete picture of what is going on, but it can provide important mechanistic information that is difficult or impossible to obtain from other experimental and theoretical approaches. It is this that we concentrate on in the following sections.

AChE reactions
Experiment, supported by QC/MM simulations [18][19][20], indicates that human AChE breaks down acetylcholine in a two-step reaction, a schematic of which is shown in figure 7. In the first acylation step, Ser203 of the enzyme attacks the acetyl group of acetylcholine, to give an acylated enzyme with the liberation of choline. Then, in the second deacylation step, the acyl-enzyme is hydrolyzed to regenerate the original enzyme and acetic acid. The overall reaction is extremely efficient, with deacylation being rate-limiting.
The reaction of nerve agents with AChE resembles that with its natural substrate, acetylcholine, but with the important difference that the second step that regenerates free enzyme is very much slower. This leads to the accumulation of covalently-modified enzyme that is inactive. Nerve agent attack is thought to proceed through one of three possible mechanisms, shown in figure 8. These are dissociative, in which the bond with the leaving group breaks prior to nucleophilic attack (D_N + A_N), or associative, in which a pentavalent structure is formed, either as a stable intermediate (A_N + D_N) or as an unstable transition state (A_N D_N) [21,22].
QC/MM simulation studies of the phosphonylation reactions between AChE and several G-series nerve agents have indicated that the mechanisms are all of the association/dissociation (A_N + D_N) type with a stable pentavalent intermediate. In calculations using sarin, the formation of the intermediate was the rate-limiting step [23], whereas studies of soman [24] and tabun [25] indicated that ejection of the leaving group, either F⁻ or CN⁻, was the slower step.
The regeneration of free enzyme from the covalently-inhibited form is a possible strategy for countering the noxious effects of nerve agents. For nerve agents themselves, the direct hydrolysis of the phosphonylated enzyme, equivalent to the deacylation step in figure 7, is very slow (and sometimes unobservable), but it can occur for other less toxic OP compounds. In a QC/MM study of one such compound, related to the potent insecticide paraoxon (figure 4), Liu and co-workers simulated the spontaneous reactivation of dimethylphosphoryl-inhibited AChE by a water molecule [26]. They observed a mechanism of A_N + D_N type with a rate-limiting dissociation step.
The oximes are a class of reactivators that are more efficient than water and are used to combat nerve agent poisoning. They are molecules that contain one or more C=N−OH groups that are powerful nucleophiles under physiological conditions. Computational studies of tabun-inhibited AChE using QC/MM [27] or pure QC [28] methods indicate that the mechanism is again of A_N + D_N type, but with an efficiency and a rate-limiting step that depend upon the oxime that is being employed.
Unfortunately, there is no universal oxime reactivator that works against all nerve agents, which has led to the investigation of methods for characterizing novel oximes and the properties that would make them good reactivators. In two recent studies of this type, Ramalho and co-workers [29] examined the interaction, or docking, of various oximes with tabun-inhibited AChE using MM methods, followed up by QC/MM simulations of their reaction mechanisms, whereas Esposito and co-workers [30] took data about existing oximes and constructed statistical models, called quantitative structure-activity relationships, for predicting the properties of optimal oxime reactivators.
A problem with reactivation for many nerve agents is that it must compete with another, faster process called 'aging'. This is a secondary reaction, catalyzed by the enzyme, that dealkylates the phosphonylated serine, making it much more resistant to reactivation and rendering oxime treatment ineffective. Aging has been well-studied for soman-inhibited AChE as it is particularly rapid for this nerve agent. The reaction is a complicated one that results in several different dealkylation products. Nevertheless, QC/MM [31] and pure QC [32] simulations indicate that the most likely mechanism, shown in figure 9, is of a push-pull type in which there is an internal migration within the alkyl chain of a methyl group to form a carbenium intermediate. The latter is then rapidly hydrated to give the dominant 2,3-dimethyl-2-butanol product.

Figure 6. A schematic of the process of cholinergic neurotransmission at a chemical synapse. The sequence of events is: (i) the action potential allows Ca²⁺ ions to enter the pre-synaptic neuron; (ii) this triggers synaptic vesicles containing acetylcholine (ACh) to breach the membrane; (iii) these release ACh into the synaptic cleft; (iv) ACh binds to ACh receptors that cause ion channels in the post-synaptic neuron to open; (v) ACh is deactivated by AChE, which hydrolyzes it to acetate and choline; and (vi) choline is recycled by the pre-synaptic neuron after uptake by specific transporter proteins. A similar process occurs at neuromuscular junctions at which neurotransmission activates muscular contraction.

Bioscavenger reactions
The use of reactivators and other drugs that act directly on AChE can be quite effective at protecting the peripheral nervous system against nerve agent inhibition, but is often much less so for the AChE that occurs in the central nervous system. An alternative detoxification strategy is to intercept, or scavenge, nerve agents in the bloodstream before they reach their point of attack [33,34]. Enzymes are the preferred scavengers. The body already has several enzymes that naturally react with OP compounds and so provide some defense, but protection is enhanced by administering extra scavengers before or after nerve agent exposure.
Currently the most advanced bioscavenger is human butyrylcholinesterase (BuChE), which is localized principally in the liver but is also found in low concentrations in the blood. This enzyme catalyzes a very similar reaction to AChE but it is less specific and so can hydrolyze a wider range of esters, including some of medical importance, such as cocaine and heroin [35,36]. BuChE is termed a stoichiometric bioscavenger because, like AChE, it is covalently inactivated by nerve agents. Because the predominant form of BuChE is a tetramer, consisting of four identical protein chains, each with a single active site, one BuChE molecule can inactivate at most four molecules of nerve agent. Thus large doses of BuChE are required to be effective [37].
The limitations of natural, wild-type BuChE, and other enzymes, can be overcome by synthesizing improved artificial versions with modified amino acid sequences. One way to do this is via the experimental procedure of directed evolution, which mimics the process of natural selection. This consists of successive rounds of steps in which the target protein is randomly mutated and the resulting mutants are selected for specific properties, such as enhanced catalytic activity. The most successful mutants are then fed back to provide the starting points for the next round. Directed evolution can be an effective technique for improving catalysis, but it is often limited to obtaining modest improvements [38].
An alternative approach is to rationally design mutant enzymes that are better catalysts, although this is extremely challenging and cannot yet be done systematically in a reliable fashion. The problems arise due to the complexities of protein structure, and the fact that catalysis is achieved through a precise, sub-Ångstrom positioning of active site residues that significantly increase the rate of the enzymatic reaction compared to the same reaction in solution. Successful rational design requires the combination of a range of techniques, among which modeling and simulation are essential [39][40][41][42].
BuChE has been the object of a number of protein design studies. One of the more interesting mutants that was found, G117H, acted as a catalytic bioscavenger. This is an enzyme that can catalyze OP hydrolysis and yet regenerate itself after reaction, albeit in this case with an activity that was too low to be of practical use. QC/MM simulations of the reaction of the wild-type and mutant enzymes with sarin showed that the mutation distorted the active site in such a way that nucleophilic attack by a water molecule on the phosphonylated serine of the covalent intermediate was facilitated [43].
Another way of circumventing the covalent inhibition of BuChE is to deliver it together with a suitable reactivator. This is known as pseudo-catalytic bioscavenging. In practice, though, it is observed that BuChE is not very efficiently reactivated by existing oximes, and so the use of AChE itself is more effective. This strategy is a promising one, especially when oximes are employed in conjunction with mutant AChEs that resist aging [44].

Figure 7. The mechanism of acetylcholine hydrolysis catalyzed by acetylcholinesterase. There are two steps to the reaction, both of which occur via a mechanism of type (A_N + D_N) [22] with a metastable oxyanion intermediate whose formation the enzyme favors.
Enzymes other than AChE and BuChE have also been evaluated as bioscavengers. Human proteins are preferable because they reduce the risk of an immune response when introduced into the body. Of these, human serum paraoxonase (PON1), which acts as a catalytic scavenger, is currently the most promising. This is a calcium-dependent enzyme with a six-bladed β-propeller fold structure that is very similar to the enzyme diisopropylfluorophosphatase (DFPase) that will be discussed below (see figure 5). It is thought that the physiological role of PON1 is to help protect the organism against cellular damage from various toxic agents [45], in part because it is a promiscuous enzyme that shows activity against a range of different substrates. It does hydrolyze nerve agents, including soman, tabun and VX, but not sufficiently rapidly to be an effective countermeasure. For this reason, efforts have gone into the design of PON1 mutants that display increased OP hydrolase activity. These studies have employed a number of approaches including directed evolution [46], and molecular docking and QC/MM calculations of the reaction of PON1 with VX [47].
Although they may suffer from immunogenicity problems, non-human enzymes can have advantages when used as bioscavengers. For example, they can be more effective hydrolyzers or more robust to degradation. The DFPase from squid is one enzyme that has been investigated. It has a structure very similar to PON1 (see figure 5) and, also like it, has been the subject of attempts to engineer mutants with improved properties [48], and of detailed QC/MM simulations to determine the mechanisms by which it hydrolyzes both DFP and sarin [49].
A second non-human enzyme that has been examined is bacterial phosphotriesterase (PTE), also known as organophosphorus hydrolase, that is shown in figure 5. It, too, promiscuously hydrolyzes a wide range of compounds. QC/MM studies of the PTE reaction determined that the mechanism changed from an associative A_N D_N type for paraoxon hydrolysis to A_N + D_N for O,O-diethyl p-chlorophenyl phosphate, and that this flexibility could be ascribed to the plasticity of the two zinc ions in the active site [50,51]. These studies, together with that of DFPase [49], are good examples of the value of simulations in being able to elucidate the different mechanisms by which the same enzyme can hydrolyze different substrates. PTE has also been the target of rational design and directed evolution studies aimed at increasing its catalytic activity against nerve agents, with significant improvements being observed [52].

Concluding remarks
Linus Pauling was awarded the Nobel Prize in Chemistry in 1954, and the Nobel Peace Prize in 1962. In his acceptance speech in 1963, he argued that science and peace are related. As we have tried to illustrate above, the use of multiscale modeling techniques for the design of effective enzyme catalysts that can be used to protect populations from CW attack, and of drugs that protect or rapidly restore AChE function, are both scientific endeavors that help contribute to Paulingʼs vision.