Diamond Light Source: contributions to SARS-CoV-2 biology and therapeutics


The impact of COVID-19 on public health and the global economy has led to an unprecedented research response, with a major emphasis on the development of safe vaccines and drugs. However, effective, safe treatments typically take over a decade to develop and there are still no clinically approved therapies to treat highly pathogenic coronaviruses. Repurposing of known drugs can speed up development, and this strategy, the use of biologicals (notably monoclonal antibody therapy) and vaccine development programmes remain the principal routes to dealing with the immediate impact of COVID-19. Nevertheless, the development of broadly effective, highly potent antivirals should be a major longer-term goal. Structural biology has been applied with enormous effect, with key proteins structurally characterised only weeks after the SARS-CoV-2 sequence was released. Open access to advanced infrastructure for structural biology techniques at synchrotrons and high-end cryo-EM and NMR centres has brought these technologies centre-stage in drug discovery. We summarise the role of Diamond Light Source in responses to the pandemic and note the impact of the immediate release of results in fuelling an open-science approach to early-stage drug discovery.



Introduction
The rapid spread of a novel coronavirus led to the second pandemic of the 21st century being declared by the World Health Organisation on the 11th March 2020 [1]. This virus, SARS-CoV-2 [2], has been directly responsible for over 1.2 million deaths as of 1st November 2020 [3]. Furthermore, the world economy has been severely affected [4,5]. Although two previous coronavirus outbreaks causing severe disease have occurred since 2002, namely severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS), no antivirals or vaccines have been approved to combat these and probable future zoonotic coronavirus outbreaks. As observed in April by Stephen Burley, this represents market failure [6]. Nevertheless, the COVID-19 pandemic has generated an unprecedented response from the scientific community: currently more than 95,000 publications on SARS-CoV-2 research (https://www.who.int/emergencies/diseases/novel-coronavirus-2019/globalresearch-on-novel-coronavirus-2019-ncov). Together with the knowledge accumulated since the first SARS outbreak in 2002, this is transforming our understanding of coronaviruses, from genome organisation [7,8] to cell entry, replication, release and pathogenesis [9].
The speed of response from the structural biology community was remarkable. The deposition of the SARS-CoV-2 genome sequence [10][11][12][13] on the 11th January 2020 made it routinely in the range of 3-4 Å, and continued developments mean that genuine atomic resolution is now achievable [25,26]. Here, we highlight the central role of technical advances and large-scale infrastructure in enabling the remarkable response to the pandemic, focussing primarily on the role played by Diamond Light Source.
2. SARS-CoV-2 main protease (Mpro): a prime target for drug discovery
The main viral protease (Mpro) of SARS-CoV-2 plays an essential role in viral replication and has been extensively studied to identify inhibitors of SARS-CoV-1, MERS and now SARS-CoV-2 replication [27][28][29][30]. Since it is present in all coronavirus genera it has the potential to act as a target for the development of broad-spectrum antivirals [31], so an extremely potent drug (sub-nM) may well also protect against the next coronavirus incursion. Being a medium-sized (33.8 kDa), stable enzyme that expresses well and crystallises readily, Mpro is well suited as a target for structure-based drug discovery [34,35]. Mpro is encoded by nonstructural protein 5 (nsp5) and is responsible for the proteolytic cleavage at 11 sites of the polyproteins pp1a and pp1ab (Fig. 1) [36].
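As an illustration of what cleavage at defined polyprotein sites means in practice, the sketch below scans a sequence for a simplified recognition motif and splits it at each match. It assumes the commonly quoted consensus Leu-Gln followed by Ser, Ala or Gly, with cleavage after the Gln; this motif is an assumption not stated in the text above, real sites tolerate other residues at the position before Gln, and the toy sequence and function names are hypothetical.

```python
import re
from typing import List

# Simplified recognition motif (an assumption): Leu-Gln followed by Ser, Ala or Gly.
# The lookahead keeps the following residue out of the match itself.
MOTIF = re.compile(r"LQ(?=[SAG])")

def candidate_cleavage_positions(sequence: str) -> List[int]:
    """Return the 1-based position of the Gln residue for each motif match."""
    return [m.start() + 2 for m in MOTIF.finditer(sequence.upper())]

def cleave(sequence: str) -> List[str]:
    """Split the sequence immediately after each candidate Gln."""
    cuts = [0] + candidate_cleavage_positions(sequence) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(cuts, cuts[1:])]

# Hypothetical toy sequence, not taken from the viral polyprotein.
toy = "MKTAVLQSGFRKAATLQAEELLQPWTVLQGNN"
positions = candidate_cleavage_positions(toy)
fragments = cleave(toy)
print(positions)
print(fragments)
```

A motif scan of this kind only nominates candidate sites; the LQP stretch in the toy sequence is skipped because proline does not fit the assumed consensus.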
Capitalising on their extensive knowledge of the structural biology of SARS nsps and in particular Mpro, Zihe Rao and Haitao Yang mobilised their groups in Shanghai to target the structure of Mpro and other key SARS-CoV-2 replicase proteins. Remarkably they were able to determine the structure of Mpro in complex with a covalent inhibitor N3 [37] by mid-particular, whether an X-ray fragment based drug discovery campaign could be initiated using the XChem platform. In the first instance the aim was to complement the efforts in Shanghai at fast-tracking repurposing of drugs while bringing us on board to lay the groundwork for developing, ab initio, antivirals targeting SARS-CoV-2 Mpro. Unfortunately, the lockdown in China prevented the rapid exchange of materials, and we were forced to re-clone and overexpress using their protocols [34]. By mid-February we had obtained robust, ligand-free crystals of Mpro that diffracted to atomic resolution (1.25 Å) and were suitable for soaking with low molecular weight fragments in a high-throughput XChem fragment screening campaign [39]. Mpro is arrayed within the crystal lattice with the active site accessible to compounds (Fig. 2a). A large-scale screen of electrophile (a collaboration with Nir London, Weizmann Institute, Israel) and non-covalent fragments through a combined mass-spectrometry and X-ray approach resulted in 71 hits that comprehensively sampled the active site of Mpro, and 3 hits at the dimer interface [14]. This wealth of information enabled us to rapidly elaborate these initial weakly binding hits into more potent inhibitors through merging covalent and non-covalent fragments (Fig. 2b). To maximise the impact of the results, we released the data immediately, through a dedicated portal (https://www.diamond.ac.uk/covid-19.html) and deposition of the data with the PDB. [41].
This work represents a novel response to the drug discovery problem; however, the next stages will be crucial. There is an urgent need to increase potency by two orders of magnitude, and meanwhile routes for development need to be established. This approach will not provide a 'quick fix' but might provide antivirals effective against this and future coronavirus zoonoses.
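To put a two orders of magnitude gain in potency into thermodynamic terms, the following minimal sketch converts a fold-improvement in dissociation constant into a change in binding free energy using ΔΔG = -RT ln(fold). It assumes simple single-site equilibrium binding, that measured potency scales with the dissociation constant, and a temperature of 298.15 K; the function name is illustrative only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_for_fold_improvement(fold: float, temperature_k: float = 298.15) -> float:
    """Change in binding free energy (kcal/mol) for a fold-improvement in Kd.

    A negative value means tighter binding. Assumes potency scales with Kd.
    """
    return -R_KCAL * temperature_k * math.log(fold)

ddg = ddg_for_fold_improvement(100.0)
print(f"{ddg:.2f} kcal/mol")  # prints "-2.73 kcal/mol"
```

Under these assumptions a hundredfold improvement corresponds to roughly 2.7 kcal/mol of additional binding free energy, on the order of a few well-placed additional interactions.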
Repurposing approved drugs certainly can provide a fast track to a therapy. Screening of approved drugs and those with pre-clinical safety data has been facilitated by the ReFrame library [42,43] of 12,000 compounds approved for clinical investigation in humans, including all currently approved drugs. We have collaborated with Exscientia (https://www.exscientia.ai/) to test this library against several targets. A number of approved drugs have been shown to inhibit SARS-CoV-2 Mpro at low micromolar concentrations and show antiviral activity in cellular assays [37,44,45]. To date the most effective repurposed drug against COVID-19 is the anti-inflammatory dexamethasone [46].
The ribonucleotide remdesivir has been proposed as a potential antiviral treatment for act as templates for viral messenger RNAs to express structural proteins, which are the building blocks for new virus particles. These steps remain poorly understood but occur in the membrane-associated replication-transcription complex (RTC), formed by nsps 1-16.
Nsp12 is the RNA-dependent RNA polymerase component and requires nsp7 and nsp8 for processivity and modulation of activity. Nsp12 adopts the canonical right-handed RNA polymerase fold, linked to an N-terminal domain implicated in nucleotidyltransferase activity [48]. One nsp7 and two nsp8 molecules form a complex with each nsp12. A number of publications have revealed the overall architecture of the SARS-CoV-2 nsp12/7/8 complex, and the structural changes that occur on RNA binding [15][49][50][51][52][53].
Early in February 2020 we ordered the synthetic genes for nsp12, nsp7 and nsp8, and by the 9th of March were able to visualise protein bands on a gel. Nsp7 and nsp8 were expressed in bacteria and nsp12 in insect cells, with the complex formed by mixing nsp7, nsp8 and nsp12 in a molar ratio of 2:2:1. Initially, cryo-EM analyses were hampered by the tendency of the complex to fall apart during grid preparation. We overcame this problem by addition of an overhanging RNA duplex of sufficient length (34/40mer), recapitulating the work by the group of Patrick Cramer [15] (Fig. 3). Work continues to understand if and how the polymerase forms higher-order complexes with other nsps that are thought to be involved in processing the viral RNA in both replication and cap synthesis for mRNA synthesis. These include nsp10, which is a co-factor for nsp14 (N7-methyltransferase/3′-5′ exonuclease activity) and for nsp16 (2′-O-methyltransferase), as well as nsp13 (helicase) and nsp15 (endoribonuclease/5′ triphosphatase). Alongside the structural work we have developed an in vitro RNA synthesis assay which we are using to screen for polymerase inhibitors [54].
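The arithmetic behind mixing components at a 2:2:1 molar ratio can be sketched as follows. The molecular weights are approximate values assumed here for untagged proteins (they are not given in the text), and the function name is hypothetical; actual constructs with tags will differ.

```python
# Approximate molecular weights in kDa (assumed values for untagged proteins).
MW_KDA = {"nsp7": 9.2, "nsp8": 21.9, "nsp12": 106.7}

# Molar mixing ratio nsp7:nsp8:nsp12 as quoted in the text.
RATIO = {"nsp7": 2, "nsp8": 2, "nsp12": 1}

def masses_for_mix(nsp12_nmol: float) -> dict:
    """Micrograms of each protein to mix for a given amount of nsp12 in nmol.

    1 nmol of a 1 kDa protein weighs 1 microgram, so nmol * kDa gives micrograms.
    """
    return {
        name: nsp12_nmol * RATIO[name] / RATIO["nsp12"] * MW_KDA[name]
        for name in MW_KDA
    }

mix = masses_for_mix(1.0)
for name, micrograms in mix.items():
    print(f"{name}: {micrograms:.1f} ug")
```

Because nsp12 is so much larger than its cofactors, it dominates the mass of the mixture even though it is the minority component in molar terms.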
J o u r n a l P r e -p r o o f

Spike
The coronavirus spike is trimeric with each subunit comprising some 1200 residues. It attaches to the host receptor and is cleaved into two chains (S1 and S2) to form a metastable structure which undergoes massive rearrangements to fuse the host and virus membranes, releasing the viral genome into the host cell. It is challenging to express a soluble form of the protein that remains trimeric and does not convert into the post-fusion state. Prior studies of homologues allowed these problems to be rapidly overcome and the McLellan laboratory reported a structure for the prefusion form in March [19]. Our work on this system began in February when it became clear that there was immediate value in understanding how human antibodies interact with the spike, to inform the selection of potential therapeutic antibodies. In addition, we realised that high-quality spike antigen would be an invaluable reagent for the high-throughput ELISA platform we were developing to test immune responses [55]. The centre of gravity for this work was the Division of [56,58]. Work has meanwhile been carried out by others across the world, and a reasonably consistent picture is emerging from studies of monoclonal antibodies from convalescent and recovered patients [59][60][61]. The majority of antibodies that recognise spike do not neutralise. Most neutralising antibodies bind to the small receptor binding domain, which lies at the top of the spike, usually folded down, but capable of reaching up to engage the cell receptor, ACE2. These neutralising antibodies appear to work by preventing attachment to the ACE2 receptor. However, the results of our collaborative research on antibodies CR3022 and EY6A demonstrated another route to neutralisation, through destabilisation of the prefusion spike [56,58] (Fig. 4). Such observations are of interest since numerous clinical trials are planned or underway to investigate treatment with monoclonal antibodies (usually in a cocktail).
An understanding of how antibodies bind, whether they compete, and how likely it is that the epitope might mutate to escape neutralisation, is useful in choosing an effective combination. The neutralising non-ACE2 blocking epitope is concealed when the receptor binding domain is packed down, being involved in protein-protein interactions within the spike, perhaps reducing the likelihood of escape mutations being viable [58]. There is also considerable interest in the potential of nanobodies, camelid antibody fragments about 1/12th the size of an IgG molecule. These may have direct therapeutic application, and their small size might allow them to be administered intranasally [62]. Work centred at the Rosalind Franklin Institute engineered nM binders through affinity maturation [56] and ongoing work aims to produce more potent molecules.

The COVID-19 pandemic exemplifies science as a model for global cooperation. Structural biology infrastructure can strengthen communities [63] and international organisations (such as Instruct-ERIC and EMBL in Europe) play a role, as does EU-funded trans-national access (e.g. iNEXT-Discovery). To provide a single exemplar, Instruct was able to respond rapidly to the COVID-19 pandemic (see https://instruct-eric.eu/covid19) and, at the request of UK government funders and Wellcome, Instruct repurposed its ARIA access management system to provide a web platform to offer SARS-CoV-2 reagents, sourced from UK research institutions, to research groups mainly in the UK and Europe. This accelerated research by providing quality-controlled reagents and reducing duplication of effort (https://covid19proteinportal.org/).
The major structural biology infrastructures have been largely able to remain functional during lockdowns, facilitated by automation and remote access. Several special, rapid access calls were launched. At Diamond all beamlines and microscopes were available for COVID-19 research, and this remains a priority. Sixty applications have been awarded time, and applications have frequently been granted multiple experimental sessions, sometimes spanning multiple techniques. Projects have used a range of instruments, including VMXi, I03, I04, I04-1, I24, B21 and B24, but principally either cryo-EM (with 174 days of microscope time used) or crystallography. Notably, ~13,000 XChem data sets have been collected, covering, in addition to the campaign described above, eight further fragment campaigns. For the future there is tremendous scope for correlative imaging and cryo-tomography. A glimpse of the prospects comes from Peijun Zhang's group, who have imaged the SARS-CoV-2 infection process at the cellular, sub-cellular and molecular level using a combination of soft X-ray tomography, cryoFIB/SEM, cryo-electron tomography and sub-tomogram averaging using correlative workflows. These methods are a major focus for Diamond and we aim to accelerate development by co-location [64].
In summary, the rapidity with which relevant structures were determined demonstrates that structural biology is fit for purpose as a central component of the response to emerging/re-emerging viruses. Much will be done to hone the methods, but the larger challenge is to join up all the dots beyond structural biology, avoid market failure and produce effective drugs far more rapidly than has been possible in the past. Some of these challenges are discussed in [6]. We argue that there is an urgent need to establish partnerships to identify and remove bottlenecks and de-risk the process, from surveillance through to virology, biochemistry and structural studies back to chemistry, virology and regulation. This is a moral imperative, and in these difficult times it was good to hear the UK Prime Minister recognising the divisions caused by COVID-19 in an address to the United Nations on the 26th September: 'here in the UK … we are determined to do everything in our power to work with our friends across the UN, to heal those divisions and to heal the world'.