Discovering the Solid-State Secrets of Lorlatinib by NMR Crystallography: To Hydrogen Bond or Not to Hydrogen Bond

Lorlatinib is an active pharmaceutical ingredient (API) used in the treatment of lung cancer. Here, an NMR crystallography analysis is presented whereby the single-crystal X-ray diffraction structure (CSD: 2205098) determination is complemented by multinuclear (1 H, 13 C, 14/15 N, 19 F) magic-angle spinning (MAS) solid-state NMR and gauge-including projector augmented wave (GIPAW) calculation of NMR chemical shifts. Lorlatinib crystallises in the P21 space group, with two distinct molecules in the asymmetric unit cell, Z′ = 2. Three of the four NH2 hydrogen atoms form intermolecular hydrogen bonds, N30-H…N15 between the two distinct molecules and N30-H…O2 between two equivalent molecules. This is reflected in one of the NH2 1 H chemical shifts being significantly lower, 4.0 ppm compared to 7.0 ppm. Two-dimensional 1 H-13 C, 14 N-1 H and 1 H (double-quantum, DQ)-1 H (single-quantum, SQ) MAS NMR spectra are presented. The 1 H resonances are assigned and specific H-H proximities corresponding to the observed DQ peaks are identified. The resolution enhancement at a 1 H Larmor frequency of 1 GHz as compared to 500 or 600 MHz is demonstrated.


Introduction
Lorlatinib (Lorbrena), 1, is an oral active pharmaceutical ingredient (API) developed by Pfizer. It is used as an inhibitor of anaplastic lymphoma kinase (ALK) and c-ros oncogene 1 (ROS1) kinase in the treatment of patients with ALK-positive or ROS1-positive non-small cell lung cancer (NSCLC). 1,2 Since receiving its first global approval in Japan in 2018, 1 it has become widely prescribed around the world. The molecule, a macrocyclic compound, was developed to treat patients who have developed resistance to existing therapies for ALK-positive tumours. 2 A structure-based drug design approach was implemented in order to prepare the desired macrocyclic structure, functioning as a highly potent ALK inhibitor with good absorption, distribution, metabolism and excretion. 3 The findings of both the phase one and phase two trials indicated that the use of Lorlatinib was an effective therapeutic strategy for ALK-positive NSCLC for patients who previously showed resistance to other forms of treatment. 4,5 Lorlatinib has been recognised as a 2021 ACS 'Hero of Chemistry'. 6,7 While a crystal structure of the HCl salt of Lorlatinib has previously been reported (CSD ZOJLIV), 3 it is the free form that is present in Lorbrena tablets. 8 This paper presents a single-crystal X-ray diffraction structure (CSD: 2205098) determination for this free form of Lorlatinib. This is complemented by a nuclear magnetic resonance (NMR) crystallography analysis that combines experimental magic-angle spinning (MAS) with density-functional theory (DFT)-based calculation of NMR parameters. We note that Ou et al. 9 have reported a crystal structure of Lorlatinib (CSD: 1987322), as one of 20 clinical drugs, in a paper describing a method for cultivating single crystals from melt microdroplets. However, this crystal structure is for racemic Lorlatinib, not the R enantiomer as in the pharmaceutical formulation.
MAS NMR is being increasingly applied to the characterisation of APIs in the solid state. [10][11][12] These applications benefit from recent technical advances towards ever faster MAS frequencies. [13][14][15] In particular, faster MAS benefits the application of 1 H-detected MAS NMR, notably two-dimensional homonuclear and heteronuclear experiments, to small and moderately sized molecules such as pharmaceuticals. A key experiment is the 1 H-1 H double-quantum (DQ) MAS NMR experiment, 16,17 in which specific 1 H-1 H proximities are observed via the homonuclear recoupling of 1 H-1 H dipolar couplings, for example using the Back-to-Back (BaBa) 18,19 pulse sequence. Powerful insight, notably into key hydrogen-bonding interactions, is obtained by combining the homonuclear 1 H-1 H DQ MAS NMR experiment with fast MAS heteronuclear experiments, specifically 1 H-13 C CP HETCOR [20][21][22][23][24] and 14 N-1 H heteronuclear multiple-quantum correlation (HMQC). [25][26][27][28]

In this context, it is informative to give an overview of applications of MAS NMR to pharmaceuticals and related self-assembled moderately sized organic molecules. 1 H-13 C CP HETCOR experiments have been applied to complex CO2-based organic framework materials, 29 nanocrystals 30 and salt forms, 31 as well as to provide constraints to enhance crystal structure prediction. 32 The 1 H-1 H DQ MAS NMR experiment has found application to a cysteine-based formulation, 33 the diethylcarbamazine citrate salt, which is used in the treatment of filariasis caused by worm infection, 34 and the multiple-sclerosis treatment teriflunomide, 35 as well as for identifying differences between crystalline and amorphous indomethacin. 36 Other applications include 15 N-labelled rosette nanotubes, 37 co-crystals formed by ball milling 38 and quartet and ribbon-like self-assemblies. 39 Aromatic π-π interactions can also be probed by the 1 H-1 H DQ MAS NMR experiment, for example in diphenylalanine nanotubes. 40 In related pulse sequence development, it has been shown that excipient signals can be edited out of 1 H MAS NMR spectra of drug formulations using selective saturation pulses, 41 while a selective DQ recoupling pulse sequence has also been applied to desmotropic forms of albendazole. 42 The 14 N-1 H HMQC experiment has been applied, for example, to amorphous solid dispersions of an API, 43 paclitaxel-loaded polymer micelles 44 and for probing the tautomeric form of azo dyes. 45 For fluorinated drug molecules, 1 H-19 F heteronuclear correlation experiments can be performed, for example for the antiemetic aprepitant and its formulation Emend 46 as well as a range of blockbuster drugs. 47

As exemplified in this paper, insight into solid-state forms of an API is obtained by applying an NMR crystallography approach. [48][49][50][51][52][53][54][55] Specifically, experimental solid-state NMR data are combined with the calculation of NMR parameters, notably, as in this study, using the DFT-based gauge-including projector augmented wave (GIPAW) approach 56,57 that is applicable to periodically repeating crystalline solids. As is the case for the solid-state structure of Lorlatinib studied here, the number of molecules in the asymmetric unit cell (Z') can be greater than one. This is readily identified by an NMR crystallography approach, whereby the number of peaks per atom site can be counted, notably in a 13 C cross-polarisation (CP) MAS spectrum. 50,51 A recent example for a pharmaceutical salt shows how specific packing effects can be understood in this way. 58 In further examples of applying NMR crystallography to pharmaceuticals, solid-state NMR can probe the proton transfer that distinguishes co-crystal from salt formation [59][60][61][62] and test the validity of structures determined by powder X-ray diffraction. 63 Moreover, a GIPAW-trained machine learning approach for the prediction of chemical shifts, ShiftML, has recently been presented. [64][65][66] This enabled the characterisation of an amorphous drug, 67 taking advantage of the marked sensitivity of NMR chemical shifts to local structure. 68 In this paper, a multinuclear 1 H, 13 [...]

Sample Preparation
Lorlatinib was supplied by Pfizer. The material was prepared according to the procedure listed in patent US 10420749. 69

Single-Crystal X-ray Diffraction
The single-crystal X-ray structure of Lorlatinib was determined at 100 K in the monoclinic system, space group P21. Data collection was performed on an Agilent SuperNova diffractometer using a Cu Kα X-ray source. A total of 76328 reflections were collected using the omega collection method. The structure was solved using the SHELXTL program 70 and refined by full-matrix least squares on F 2 . All non-hydrogen atoms were refined anisotropically. All carbon-bound hydrogens were placed in geometrically calculated positions and refined using a riding model. Hydrogen atoms on heteroatoms were refined with isotropic displacement parameters. The final R-index is 4.22%. The absolute stereochemistry was determined with a Flack parameter of 0.05(4). Structural diagrams in this paper were created using Mercury 4.2.0. 71

Referencing. The 13 C and 1 H chemical shifts were referenced with respect to tetramethylsilane (TMS) using L-alanine at natural abundance as the secondary reference. For L-alanine, the 1 H methyl (CH3) resonance is set to 1.1 ppm and the 13 C carboxylate resonance to 177.8 ppm. This corresponds to adamantane at 1.85 ppm for 1 H 84 and 38.5 ppm for 13 C. 85 The 14 N shifts were referenced with respect to saturated NH4Cl aqueous solution using β-aspartyl-L-alanine at natural abundance, whereby the NH resonance is at −284 ppm at a 1 H Larmor frequency of 600 MHz, corresponding to liquid CH3NO2 at 0 ppm. 27,31,86 For equivalence to the chemical shift scale frequently used in protein 15 N NMR, where the alternative IUPAC reference (see Appendix 1 of ref 87 ) is liquid ammonia at −50 °C, it is necessary to add 379.5 ppm to the given values. 88 The 15 N chemical shifts are also referenced to liquid CH3NO2 at 0 ppm. For the 19 F spectrum, chemical shifts were referenced with respect to 50/50 v/v trifluoroacetic acid/water (relative to CCl3F 69 ). The accuracy of the experimental shifts is within ±0.2 ppm for 1 H and 19 F, ±0.1 ppm for 13 C and 15 N, and ±5 ppm for 14 N.
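The conversion between the two nitrogen reference scales mentioned above is a constant offset, which can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the example shift of −300.0 ppm is an arbitrary value chosen for demonstration, not a measured Lorlatinib shift.

```python
# Offset quoted in the text: add 379.5 ppm to go from the liquid CH3NO2
# scale (0 ppm) to the liquid-ammonia scale used in protein 15N NMR.
NITROMETHANE_TO_AMMONIA_PPM = 379.5


def nitromethane_to_ammonia(shift_ppm: float) -> float:
    """Convert a shift referenced to liquid CH3NO2 to the liquid-NH3 scale."""
    return shift_ppm + NITROMETHANE_TO_AMMONIA_PPM


def ammonia_to_nitromethane(shift_ppm: float) -> float:
    """Inverse conversion, from the liquid-NH3 scale back to liquid CH3NO2."""
    return shift_ppm - NITROMETHANE_TO_AMMONIA_PPM


if __name__ == "__main__":
    example_shift = -300.0  # hypothetical value on the CH3NO2 scale
    print(nitromethane_to_ammonia(example_shift))
```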

DFT Calculations
Density functional theory (DFT) calculations were conducted using CASTEP 89 version 19.1. For the full crystal, a geometry optimisation with fixed unit cell parameters was followed by magnetic shielding calculations to determine the NMR parameters. Distances stated in this paper are for the geometry-optimised crystal structure. The Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional, 90 a plane-wave basis set with ultrasoft pseudopotentials and a plane-wave cut-off energy of 800 eV were used. A minimum Monkhorst-Pack grid spacing of 2π × 0.1 Å−1 was used. The GIPAW 56,57 method was used to calculate the NMR parameters: calculated isotropic chemical shifts were determined from the calculated chemical shieldings according to δiso = σref − σiso. It is noted that it is common practice to calculate a specific reference shielding for each system (see, e.g., Table S8 of ref. 65 ), though average values over a range of compounds are also available. 91 For 1 H and 13 C, different reference shieldings were used for high- and low-ppm chemical shifts: 92 for 1 H, 30.5 ppm for > 6.5 ppm and 30.0 ppm for < 6.5 ppm; for 13 [...]
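The conversion from a calculated shielding to a chemical shift, δiso = σref − σiso (the form also given in the footnote to Table 3), can be sketched as below for 1 H using the two reference shieldings quoted above. The function names are hypothetical. How the authors chose between the two references near the 6.5 ppm boundary is not stated, so the rule used here (try the high-ppm reference first, fall back to the low-ppm one) is purely an assumption for illustration.

```python
# 1H reference shieldings quoted in the text.
H1_THRESHOLD_PPM = 6.5   # boundary between "high" and "low" ppm shifts
H1_SIGMA_REF_HIGH = 30.5  # used for shifts > 6.5 ppm
H1_SIGMA_REF_LOW = 30.0   # used for shifts < 6.5 ppm


def shift_from_shielding(sigma_iso: float, sigma_ref: float) -> float:
    """delta_iso = sigma_ref - sigma_iso (all values in ppm)."""
    return sigma_ref - sigma_iso


def h1_shift_from_shielding(sigma_iso: float) -> float:
    """Piecewise 1H conversion.

    ASSUMPTION: the high-ppm reference is tried first; if the resulting
    shift is not above the threshold, the low-ppm reference is used.
    The source does not specify this selection rule.
    """
    trial = shift_from_shielding(sigma_iso, H1_SIGMA_REF_HIGH)
    if trial > H1_THRESHOLD_PPM:
        return trial
    return shift_from_shielding(sigma_iso, H1_SIGMA_REF_LOW)


if __name__ == "__main__":
    for sigma in (20.0, 27.0):  # hypothetical calculated shieldings
        print(sigma, h1_shift_from_shielding(sigma))
```

Note that for shieldings between 23.5 and 24.0 ppm both references give a self-consistent result, so any automated rule of this kind needs checking by hand near the boundary.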

Single-Crystal X-ray Diffraction of Lorlatinib
The crystal structure of Lorlatinib has been solved at 100 K (see Table 1). (Note that a crystal structure of the HCl salt of Lorlatinib has been reported by Johnson et al., CSD ZOJLIV.) Figure 1 shows the packing of Lorlatinib along the a axis. The two distinct molecules in the asymmetric unit cell, i.e., Z' = 2 (see also Figure S1), are coloured in blue and red in Figure 1.

MAS NMR Spectroscopy with GIPAW Calculation
In the following sections, we demonstrate how an NMR crystallography analysis that combines solid-state MAS NMR experiments on powdered Lorlatinib with DFT calculations for the crystal structure of Lorlatinib enables, first, assignment of the NMR chemical shifts and, subsequently, insight into intermolecular interactions that affect the solubility and bioavailability of a pharmaceutical. For moderately sized organic molecules such as Lorlatinib, key to this strategy is recording two-dimensional correlation MAS NMR spectra. 93 We show first (in section 3.2.1) how peaks at the 13 [...]

Taken altogether, this paper presents a complete assignment of the 1 H, 13 [...]

A concluding discussion (section 4) considers the structural insights that the NMR crystallography analysis offers for Lorlatinib, in particular concerning the intermolecular hydrogen bonding that holds the individual molecules together in the solid-state structure and is thus relevant to a key property of an orally delivered pharmaceutical, namely solubility and hence bioavailability.

Heteronuclear 1 H-13 C Two-Dimensional MAS NMR Spectra of Lorlatinib: Assigning the 13 [...]
[...] MAS NMR spectra of Lorlatinib, recorded using the two alternative strategies for achieving high resolution in the 1 H dimension. The spectra in the bottom row, Figure 3c and 3d, were recorded at a MAS frequency of 60 kHz with no 1 H homonuclear decoupling applied. Separate spectra were recorded for the aromatic (left, c) and aliphatic (right, d) 13 C resonances. In the top row, two zoomed-in regions corresponding to the same aromatic (left, a) and aliphatic (right, b) 13 C resonances are shown, as extracted from a 1 H-13 C HETCOR MAS NMR spectrum recorded at a MAS frequency of 12.5 kHz, applying frequency-switched Lee-Goldburg (FSLG) 78 [...] The spectra presented in Figure 3a and 3b have been processed such that the highest and lowest 1 H chemical shifts in ppm are the same as those with MAS alone in Figure 3c and 3d. A further difference is that the spectra are recorded at a 1 H Larmor frequency of 500 MHz (top row, Figure 3a and 3b) or 1 GHz (bottom row, Figure 3c and 3d). High magnetic field brings the benefit of both higher sensitivity and enhanced resolution, the latter provided that the linewidths are independent of magnetic field. In both cases, CP was employed to transfer magnetisation from 1 H to 13 C via 1 H-13 C heteronuclear dipolar couplings. Note that at 60 kHz MAS, a low 13 C nutation frequency of 10 kHz was applied during CP such that the presented spectra had to be separately recorded for the high-ppm (aromatic) and low-ppm (aliphatic) regions, as presented in Figure 3c and 3d, respectively. [...] 13 C spectra corresponding to the two distinct molecules in the asymmetric unit cell, Z' = 2 (labelled as I and II). For the C25 methyl resonances, there is a large difference between the 1 H chemical shifts in the two distinct molecules, namely 3.0 and 0.5 ppm. This is discussed further in section 4 below.
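The statement that higher field enhances resolution when linewidths are field-independent can be made concrete with a short calculation: a linewidth that is constant in Hz occupies fewer ppm as the Larmor frequency rises. The sketch below assumes a hypothetical 1 H linewidth of 200 Hz, chosen only for illustration; the function name is likewise hypothetical.

```python
def linewidth_ppm(linewidth_hz: float, larmor_mhz: float) -> float:
    """Express a linewidth given in Hz on the ppm scale.

    1 ppm corresponds to larmor_mhz Hz, so the conversion is a division.
    """
    return linewidth_hz / larmor_mhz


if __name__ == "__main__":
    hypothetical_linewidth_hz = 200.0  # illustrative value, not from the paper
    for larmor_mhz in (500.0, 600.0, 1000.0):
        width = linewidth_ppm(hypothetical_linewidth_hz, larmor_mhz)
        print(f"{larmor_mhz:.0f} MHz: {width:.2f} ppm")
```

Under this assumption the same line is half as wide in ppm at 1 GHz as at 500 MHz, which is why closely spaced 1 H resonances separate better at the higher field.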
Red crosses in Figure 3 correspond to the GIPAW calculated 1 H and 13 C chemical shifts for the DFT (CASTEP) geometry-optimised crystal structure. The assigned 13 C and 1 H resonances as labelled in Figure 3 are listed in Table 2 [...] 13 C dipolar couplings that also determine the loss of signal due to T1ρ relaxation during the 1 H spin-lock pulse: hence different build-up behaviour is observed for quaternary and non-quaternary 13 C resonances, and also between CH, CH2 and CH3 resonances. Therefore, CP MAS spectra are not quantitative. Nevertheless, similar, but not identical, peak intensities are observed for similar chemical environments, notably for 13 C resonances corresponding to the same chemical shift in the two distinct molecules in the asymmetric unit cell. Figure S5 presents integrated intensities for the resolved centre-band resonances.

Figure 4 (caption, partially recovered): [...] CP (500 µs)-HETCOR NMR spectra presented in Figure 3c and 3d. (c) The stick spectrum represents the GIPAW calculated 13 C chemical shifts for the two distinct molecules (blue I and red II) in the asymmetric unit cell (see also Table 2).

In Figure 4, the CP MAS spectrum in Figure 4a is compared to Figure 4b, which presents skyline projections from the 1 H (1 GHz)-13 C MAS (60 kHz) CP-HETCOR NMR spectra presented in Figure 3c and 3d, which were recorded with a shorter CP contact time of 500 µs. Additional 13 C resonances are observed in Figure 4a for the non-protonated quaternary carbon atoms. In addition, Figure 4c presents stick spectra that correspond to the GIPAW calculated chemical shifts. Note that the blue (I) and red (II) colouring of the stick spectra corresponds to the two distinct Lorlatinib molecules in the asymmetric unit cell. The non-protonated 13 C resonances are assigned in the same order as the GIPAW calculated values, though, as noted in Table 2, there is ambiguity concerning the assignment of the C4 and C12 and the C5 and C20 pairs of resonances.

Heteronuclear 1 H-15 N and 14 N-1 H MAS NMR Spectra of Lorlatinib: 14/15 N Chemical Shifts and 14 N Quadrupolar Parameters
A 1 H-15 N CP MAS NMR spectrum of Lorlatinib is presented in Figure 5. Even though, compared to 13 C, 15 N has a lower natural abundance (0.4% compared to [...] [26][27][28] 98 It is informative to view the three 1 H-based two-dimensional spectra in a column view with a common (horizontal) 1 H axis. Such a view is presented for Lorlatinib in Figure 6, with a 1 H-1 H DQ-SQ MAS NMR spectrum at the top (Figure 6a), a 14 N-1 H HMQC MAS NMR spectrum in the middle (Figure 6b) and a 1 H-13 C CP-HETCOR spectrum at the bottom (Figure 6c). The latter corresponds to the same spectrum as was presented in Figures 3a and 3b, but the view is rotated through 90 degrees so that the three spectra have the same horizontal 1 H axis (note that this then means that the 13 [...]

[...] clockwise through 90° such that the 13 C axis is from high to low ppm downwards). The base contour levels are at (a) 9%, (b) 51% and (c) 18% of the maximum peak height, respectively.
In this section, we focus on the 14 [...] Figure 6b. This is different to the case of dipolar couplings and chemical shift anisotropy, which are of the order of kHz. As such, second-order perturbation theory must be applied in order to explain observed 14 N solid-state NMR spectra. There are two key manifestations of this: first, fourth-rank second-order quadrupolar broadening of the 14 N lineshapes that is not removed by MAS (contrast this with second-rank broadening due to dipolar couplings and chemical shift anisotropy and also first-order quadrupolar broadening, which are all removed by MAS); second, an isotropic second-order quadrupolar shift 27 that is additional to the isotropic chemical shift; the latter is assumed to be the same for 15 N and 14 N. As shown in Table 3, the isotropic second-order quadrupolar shifts are large, namely 454 and 470 ppm, hence explaining the substantial changes in the 14 N shifts as compared to the 15 N isotropic chemical shifts.

Table 3 footnotes (partially recovered):
[...] Figure 5.
b δiso = σref − σiso, where σref = −160 ppm.
c Centre of gravity of the 14 N peaks extracted from the 14 N-1 H HMQC spectrum seen in Figure 6b. Here, the error is estimated to be within ±5 ppm.
[...] Table 2).
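The size of the isotropic second-order quadrupolar shift discussed above can be estimated with a short calculation. The sketch below uses the expression commonly quoted for a spin I = 1 nucleus in the 14 N-1 H HMQC literature, (3/40)(PQ/ν0)² × 10⁶ ppm, with PQ = CQ √(1 + η²/3). Whether the authors used exactly this form is an assumption. The 600 MHz 1 H Larmor frequency is inferred from the referencing statement, and the 14 N frequency is obtained from the approximate 14 N/1 H frequency ratio of 7.226%. All function names are hypothetical.

```python
import math

# Approximate ratio of the 14N to the 1H resonance frequency (about 7.226 %).
XI_14N = 0.07226317


def n14_larmor_mhz(h1_larmor_mhz: float) -> float:
    """14N Larmor frequency (MHz) for a given 1H Larmor frequency (MHz)."""
    return h1_larmor_mhz * XI_14N


def quadrupolar_product_mhz(cq_mhz: float, eta: float) -> float:
    """P_Q = C_Q * sqrt(1 + eta^2 / 3)."""
    return cq_mhz * math.sqrt(1.0 + eta ** 2 / 3.0)


def second_order_iso_shift_ppm(pq_mhz: float, larmor_mhz: float) -> float:
    """Isotropic second-order quadrupolar shift for spin I = 1, in ppm.

    ASSUMPTION: the (3/40) * (P_Q / nu_0)^2 form for I = 1 is used here.
    """
    return (3.0 / 40.0) * (pq_mhz / larmor_mhz) ** 2 * 1e6


if __name__ == "__main__":
    nu0 = n14_larmor_mhz(600.0)  # assumed 1H Larmor frequency of 600 MHz
    pq = 3.4                     # experimental P_Q in MHz, as quoted for Table 3
    print(f"14N Larmor frequency: {nu0:.2f} MHz")
    print(f"Second-order shift:   {second_order_iso_shift_ppm(pq, nu0):.0f} ppm")
```

With PQ = 3.4 MHz this gives a shift of roughly 460 ppm, in line with the 454 and 470 ppm values quoted from Table 3. The inverse-square dependence on ν0 also shows why the shift shrinks at higher field.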

H-H proximities in Lorlatinib: 1 H (DQ) -1 H (SQ) 2D MAS NMR
A 1 H-1 H DQ-SQ MAS NMR experiment, as shown in Figure 6a, recorded with one rotor period of BaBa recoupling, 18,19 provides information on proximities between protons that are close together in space. 16,93 This spectrum is repeated in Figure 8, [...] and corresponding DQ frequency, sorted by the numerical proton identifier, can be found in the supplementary information (Table S2). As indicated by the vertical dashed lines in Figure 6a and 8, the spreading out into a second dimension in the 1 [...] Figure 3 with the GIPAW calculated chemical shifts. As noted in Figure 3, [...]
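In a DQ-SQ spectrum, the DQ frequency of a peak arising from two nearby protons is the sum of their two SQ chemical shifts, which is how DQ frequencies of the kind tabulated in Table S2 are obtained. A minimal sketch follows. The proton labels are hypothetical, and the shift values are taken from numbers quoted elsewhere in this paper purely to illustrate the arithmetic. The code does not claim that any listed pair is actually close in space, which is what determines whether a DQ peak is observed.

```python
from itertools import combinations_with_replacement


def dq_shift(sq_a_ppm: float, sq_b_ppm: float) -> float:
    """DQ frequency (ppm) of a peak between protons with the given SQ shifts."""
    return sq_a_ppm + sq_b_ppm


def dq_positions(sq_shifts: dict) -> list:
    """List (label_a, label_b, dq_ppm) for every pair, including like pairs.

    This enumerates where peaks WOULD appear; it says nothing about
    whether the two protons are close enough for a peak to be observed.
    """
    labels = sorted(sq_shifts)
    return [
        (a, b, dq_shift(sq_shifts[a], sq_shifts[b]))
        for a, b in combinations_with_replacement(labels, 2)
    ]


if __name__ == "__main__":
    # Hypothetical labels; shifts are illustrative values in ppm.
    shifts = {"NH_a": 7.0, "NH_b": 4.0, "CH3_I": 3.0, "CH3_II": 0.5}
    for a, b, dq in dq_positions(shifts):
        print(f"{a:7s} {b:7s} DQ = {dq:.1f} ppm")
```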

CH-π Interactions in Lorlatinib
A key focus of the application of NMR crystallography to the solid-state structure of Lorlatinib in this paper is on the intermolecular hydrogen bonds formed by the NH2 groups. Table 5 lists the N…N or N…O and H…N or H…O hydrogen-bonding distances, together with the hydrogen-bonding angles, for the hydrogen bonds formed by three of the four NH2 hydrogen atoms (see Figure 2). Recall that, while there is only one NH2 group in each Lorlatinib molecule, there are two distinct molecules in the asymmetric unit cell. [...] Table 3, the experimental quadrupolar product, PQ, is the same, 3.4 MHz, with GIPAW calculation giving 3.3 and 3.6 MHz. [...] Figure 2) [...]

An NMR crystallography analysis can also provide insight into CH-π interactions due to ring current effects. As observed in the 1 H-13 C CP-HETCOR spectra presented in Figure 3 and discussed above in section 3.2.1, the 1 H chemical shifts for the methyl hydrogens attached to C25 in the two distinct molecules in the asymmetric unit cell are significantly different, at 3.0 and 0.5 ppm. The much lower 1 H chemical shift of 0.5 ppm is explained by this methyl group pointing into the pyridine ring as shown in Figure 9.
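Hydrogen-bond parameters of the kind listed in Table 5 (donor…acceptor distance, H…acceptor distance and the angle at the hydrogen atom) follow directly from atomic coordinates. The sketch below shows the geometry only. The coordinates are hypothetical Cartesian positions in Å chosen for an easy check, not values from the Lorlatinib structure, and fractional crystallographic coordinates would first need converting to Cartesian using the unit cell.

```python
import math


def distance(p, q) -> float:
    """Euclidean distance between two 3D points (same units as the input)."""
    return math.dist(p, q)


def angle_at_h_deg(donor, hydrogen, acceptor) -> float:
    """D-H...A angle in degrees, measured at the hydrogen atom."""
    v1 = [d - h for d, h in zip(donor, hydrogen)]
    v2 = [a - h for a, h in zip(acceptor, hydrogen)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against rounding just outside the domain.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


if __name__ == "__main__":
    # Hypothetical, perfectly linear N-H...O arrangement (coordinates in Å).
    n_donor = (0.0, 0.0, 0.0)
    h_atom = (1.0, 0.0, 0.0)
    o_acceptor = (3.0, 0.0, 0.0)
    print("N...O distance:", distance(n_donor, o_acceptor))
    print("H...O distance:", distance(h_atom, o_acceptor))
    print("N-H...O angle: ", angle_at_h_deg(n_donor, h_atom, o_acceptor))
```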