Four-dimensional NOE-NOE spectroscopy of SARS-CoV-2 Main Protease to facilitate resonance assignment and structural analysis
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892, USA
Correspondence: Ad Bax (firstname.lastname@example.org)
Resonance assignment and structural studies of larger proteins by nuclear magnetic resonance (NMR) can be challenging when exchange broadening, multiple stable conformations, and 1H back-exchange of the fully deuterated chain pose problems. These difficulties arise for the SARS-CoV-2 Main Protease, a homodimer of 2 × 306 residues. We demonstrate that the combination of four-dimensional (4D) TROSY-NOESY-TROSY spectroscopy and 4D NOESY-NOESY-TROSY spectroscopy provides an effective tool for delineating the 1H–1H dipolar relaxation network. In combination with detailed structural information obtained from prior X-ray crystallography work, such data are particularly useful for extending and validating resonance assignments as well as for probing structural features.
The extension of conventional two-dimensional 1H–1H NMR spectroscopy of natural proteins (Wüthrich, 1986) to three-dimensional (3D) homonuclear NMR experiments offered the ability to simplify spectral analysis by removing resonance overlap (Vuister et al., 1988; Oschkinat et al., 1988) and by providing access to a direct, more detailed analysis of 1H–1H dipolar cross-relaxation networks. In particular, the homonuclear 3D NOE-NOE experiment (Boelens et al., 1989; Breg et al., 1990) not only decreased resonance overlap, it also directly elucidated spin-diffusion pathways. This information complemented and validated the elegant relaxation matrix analysis of spin diffusion (Boelens et al., 1988).
Such homonuclear 1H 3D experiments and analysis strategies were soon followed by a myriad of heteronuclear 3D experiments that required isotopic enrichment and therefore cloning and bacterial overexpression (Marion et al., 1989b; Zuiderweg and Fesik, 1989; Ikura et al., 1990; Marion et al., 1989a; Wagner, 1993). Most of these heteronuclear experiments simply served to disperse the regular 1H–1H 2D spectrum into a third dimension, thereby removing spectral overlap but providing little or no new information on the all-important 1H–1H spin-diffusion pathways. The 3D NOESY-HMQC experiment (Marion et al., 1989b; Zuiderweg and Fesik, 1989) subsequently was extended to four dimensions (4D), thereby dispersing the conventional 2D 1H–1H NOESY experiment into two additional dimensions that correspond to the chemical shifts of the nuclei to which each of the protons is covalently bound (Kay et al., 1990; Clore et al., 1991; Zuiderweg et al., 1991).
These multi-dimensional experiments provided a tremendous degree of spectral simplification, in particular after appropriate analysis software became available. However, it also quickly became clear that extension to large, slowly tumbling proteins was hampered by low signal to noise, caused by the relative inefficiency of the magnetization transfer steps when the dimensionality of a spectrum is increased. This decrease in sensitivity was remedied by generating the protein in a highly perdeuterated state while keeping the solvent-exchangeable backbone amide protons protonated (Torchia et al., 1988; Lemaster and Richards, 1988). Combining the perdeuteration approach with both the triple-resonance assignment strategy (Grzesiek et al., 1993) and the subsequently introduced powerful TROSY line-narrowing method (Pervushin et al., 1997) made it possible to assign and analyze the structure of quite large proteins, as exemplified by the 723-residue protein malate synthase G (Tugarinov et al., 2002, 2005a). The sensitivity gained by perdeuteration, enabling the recording of 4D 15N-separated NOE spectra, also was key in solving the structure of an HIV-1 accessory protein that had been too challenging for analysis by more conventional methods (Grzesiek et al., 1995). Zhu and co-workers introduced a TROSY-NOESY-TROSY version of Grzesiek's 4D NOESY experiment which, illustrated for a partially deuterated 27 kDa protein, yielded further improved sensitivity and intrinsic 1H and 15N line widths (Xia et al., 2000). Their implementation relied on eight-step phase cycling, which limited digitization of the time-domain data and prevented full exploitation of the improved TROSY relaxation in the indirect dimensions. Diercks et al. (2010) introduced an elegant method to suppress diagonal signals from a 4D TROSY-NOESY-TROSY spectrum, a particularly useful feature when spectral resolution is limited.
However, we opted not to use this implementation as, at the long mixing times used, the diagonal resonances serve as convenient reference anchors during analysis and are sufficiently attenuated such as not to obscure nearby peaks in our 4D spectra that are of very high digital resolution.
In the present report, we merge the above-mentioned prior advances, 3D NOE-NOE and 3D 15N-separated NOESY, into a 4D experiment. In combination with extensive perdeuteration and gradient-enhanced encoding to enable a four-step phase cycle as well as non-uniform sampling (NUS) (Rovnyak et al., 2004), the experiments take better advantage of the improved resolution afforded by 4D NMR. We demonstrate the utility of the experiments by applying them to the study of the main protease (Mpro) of SARS-CoV-2, the virus responsible for coronavirus disease 2019 (COVID-19). Mpro, also known as 3CLpro or Nsp5, is a homodimeric cysteine protease of 2 × 306 residues that does not have closely related mammalian homologues and is therefore an intensely pursued target for drug development, with a promising inhibitor now entered in a phase I clinical trial (Boras et al., 2021). Its NMR analysis is challenging, not only because of its large size (67.6 kDa), but also because of the presence of a minor conformer associated with the cis-isomer of one of its 13 Pro residues (P184), the difficulty in back-exchanging all backbone amide protons when the protein is expressed in 2H2O, and the presence of intermediate timescale motions that lead to exchange broadening in the vicinity of the protein's active site. Here we focus on an enzyme variant where the catalytic Cys145 residue has been mutated to Ala (M), a construct that is stable for multiple weeks at the high concentrations required for NMR spectroscopy. The assignment process and a full structural analysis of the protein will be presented elsewhere. The focus of the present work is on technical innovations, including recording two types of 4D NOE-based NMR spectra that proved invaluable both for the validation of resonance assignments and for the subsequent structural analysis.
2.1 Protein production
The gene encoding a C145A variant of Mpro (M) with an N-terminal affinity (and solubility) tag was synthesized by GenScript (USA) and then cloned into a pET24a+ plasmid between BamHI and XhoI restriction sites. The fusion protein, consisting of 6His tag – GB1 – SG-rich linker – TEV cleavage site – M, was purified according to methods collectively developed by the COVID-19 NMR consortium (Altincekic, 2021). In brief, following cell culture and harvesting (see Supplement), the cell lysate was passed down a 6His-affinity column (IMAC) and eluted in a small volume; the solubility tag was cleaved off to generate a native N-terminus; the reaction mix was then passed through an IMAC column to remove uncleaved protein before size separation on a Sephadex G75 column. The resulting, extensively perdeuterated, 2H,15N-M homodimer was used for experiments and, throughout the recombinant protein expression, extensive care was taken to achieve a high (∼ 98 %) level of deuteration of the non-exchangeable hydrogens. For more details, see the Supplement.
2.2 Recording of NMR data
Spectra were acquired on a sample containing 1.8 mM (0.9 mM dimer) 2H, 15N-M in 10 mM sodium phosphate, pH 7.0, 0.5 mM TCEP, 3 % 2H2O and 0.3 mM sodium trimethylsilylpropanesulfonate (DSS; as an internal chemical shift reference), in a 300 µL Shigemi microcell. All experiments were recorded at 25 °C on an 800 MHz Bruker Avance III spectrometer, equipped with a 5 mm TCI probe containing a triple-axis gradient accessory, and running TopSpin software version 3.1.
Considering that the measurements are carried out for perdeuterated protein, the spectral windows in the indirect 1H dimensions were limited to the region downfield from the water resonance, and EBURP2 pulses (Geen and Freeman, 1991) were used where needed to prevent excitation of protons outside of this region. For the 4D TROSY-NOESY-TROSY experiment (Fig. 1a), only a small fraction (0.54 %) of the full data matrix, consisting of 1536* (1H, t4, 119.8 ms) × 90* (15N, t3, 35.1 ms) × 91* (1H, t2, 20.0 ms) × 90* (15N, t1, 35.1 ms) complex points, was recorded (Table S1 in the Supplement) using nonuniform sampling for the indirect dimensions. Selection of the sampled data points was unweighted, randomly distributed, and without time ordering. The sampling schedule (included in the uploaded data sets; see Data availability) was automatically generated by Topspin 3.1 when the experiment started, with the random seed left to its default value (54 321) and the T2 for each indirect dimension set to 1 s to avoid significant weighting of the sampling schedule. Specifically, a total of 31896 FIDs along the directly detected t4 dimension were recorded. Using four scans per FID for phase cycling and an interscan delay of 2.07 s, the total measurement time was approximately 88 h, but 3-fold shorter would have sufficed (see Concluding remarks).
Nonstandard processing was needed for the TROSY-NOESY-TROSY experiment because the spectrum was recorded with sensitivity-enhanced gradient selection in the 15N t1 evolution period that preceded the NOE mixing (Xia et al., 2000). Specifically, the 4D NUS data set was first sorted and expanded according to the sampling schedule using the nusExpand.tcl script within the NMRPipe software package (Delaglio et al., 1995). The expanded data were then converted to the NMRPipe format, with the quadrature mode for t3 set to Echo-AntiEcho, while the quadrature mode for t1 was temporarily set to Complex. After the conversion, the 4D matrix needs to be transposed to enable use of the NMRPipe macro bruk_ranceA.M to correctly reshuffle the data, turning the phase-modulated t1 dimension into conventional amplitude-modulated data prior to processing as regular, complex data. This transposition is accomplished by reading in the NMRPipe-formatted matrix with the z axis along the t2 dimension, application of the macro, and restoring the data to its original axis order prior to regular processing, with the full script included with the raw time-domain data sets (see Data availability). For the processing, the direct dimension was apodized with a squared, shifted sine-bell window, spanning from 72° to 176.4°, and an additional 15 Hz exponential line broadening was used to better match the apodization window to the natural decay of the signal, thereby improving the signal-to-noise ratio (S/N). This was followed by zero filling and Fourier transformation. Subsequently, the indirect data points that were not experimentally sampled were reconstructed using the SMILE (Ying et al., 2017) program, and the reconstructed data were further processed in NMRPipe. To enhance the spectral resolution, by default the acquisition times in all indirect dimensions were extended by 50 % during the SMILE reconstruction, leading to an effective sampling sparsity of 0.16 %.
The data matrix for the final reconstructed 4D spectrum consists of 614 (1H, F4, 6.3 Hz per point) × 512 (15N, F3, 5.0 Hz per point) × 512 (1H, F2, 8.9 Hz per point) × 512 (15N, F1, 5.0 Hz per point) real points (see Table S1).
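The direct-dimension apodization described above can be illustrated with a short NumPy sketch. It is not the NMRPipe script deposited with the data; the function names are hypothetical, and the exact point spacing and the exponential form exp(−π·LB·t) are assumptions that follow common practice. The 1536 complex points and 119.8 ms acquisition time are the t4 values quoted for the TROSY-NOESY-TROSY data set.

```python
import numpy as np

def sine_bell(n, start_deg, end_deg, power=2):
    """Shifted sine-bell window running from start_deg to end_deg over n points."""
    phase = np.deg2rad(np.linspace(start_deg, end_deg, n))
    return np.sin(phase) ** power

def exponential(n, acq_time_s, lb_hz):
    """Exponential line broadening, exp(-pi * lb * t), over n points."""
    t = np.arange(n) * (acq_time_s / n)
    return np.exp(-np.pi * lb_hz * t)

n_points = 1536      # complex points in t4 (from the text)
acq_time = 0.1198    # t4 acquisition time in seconds (from the text)

# Squared sine bell from 72 to 176.4 degrees combined with 15 Hz line broadening
window = sine_bell(n_points, 72.0, 176.4) * exponential(n_points, acq_time, 15.0)
```

Multiplying a time-domain FID by `window` before zero filling and Fourier transformation would apply both functions in a single step.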
The full time-domain data matrix of the 4D NOESY-NOESY-TROSY experiment (Fig. 1b) consists of 1536* (1H, t4, 95.8 ms) × 90* (15N, t3, 35.1 ms) × 60* (1H, t2, 12.0 ms) × 60* (1H, t1, 12.0 ms) complex points (Table S1). An unweighted, random NUS sampling scheme with a sparsity of 1.69 % (corresponding to 43 856 t4 FIDs) was used to record a small subset of the data points. Using an interscan delay of 1.77 s and four scans per FID, the total experimental time was approximately 110 h, but 3-fold shorter would have sufficed (see Concluding remarks). The data were processed and reconstructed in the same manner as described above with 50 % extension of all indirect dimensions during the SMILE reconstruction, resulting in an effective sparsity of 0.50 % and a final spectral matrix size of 492 (1H, F4, 7.8 Hz per point) × 512 (15N, F3, 4.7 Hz per point) × 512 (1H, F2, 13.9 Hz per point) × 512 (1H, F1, 13.9 Hz per point) real points (Table S1). Note that since in the NOESY-NOESY-TROSY experiment the data were recorded using the Echo-AntiEcho mode (Kay et al., 1992) only in the t3 dimension, immediately preceding acquisition, the bruk_ranceA.M macro was not needed after the conversion of the expanded NUS data. The residual in-phase axial peaks along the F2 dimension were treated as real peaks and optimally reconstructed by SMILE to suppress the sampling artifacts of the axial signals from spreading to the regions with NOE peaks. The processing macros used for both 4D spectra are included with the raw time-domain data (see Data availability).
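The sparsity values quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes that each complex point in the three indirect dimensions requires 2³ = 8 FIDs (one real/imaginary or echo/anti-echo pair per dimension) and that the 50 % extension simply scales each indirect dimension by 1.5; the function name is hypothetical.

```python
def nus_sparsity(n_fids, indirect_points, extension=1.0):
    """Fraction of the indirect complex grid covered by the recorded FIDs.

    Assumes 2 FIDs per indirect dimension for every sampled complex point.
    """
    fids_per_point = 2 ** len(indirect_points)
    sampled_points = n_fids / fids_per_point
    grid_size = 1.0
    for n in indirect_points:
        grid_size *= n * extension
    return sampled_points / grid_size

# (number of recorded FIDs, complex points in t3, t2, t1), taken from the text
experiments = {
    "TROSY-NOESY-TROSY": (31896, (90, 91, 90)),
    "NOESY-NOESY-TROSY": (43856, (90, 60, 60)),
}

for name, (fids, dims) in experiments.items():
    raw = 100 * nus_sparsity(fids, dims)
    effective = 100 * nus_sparsity(fids, dims, extension=1.5)
    print(f"{name}: {raw:.2f} % sampled, {effective:.2f} % after 50 % extension")
```

Under these assumptions the calculation reproduces the 0.54 % and 1.69 % sampling densities as well as the effective values of 0.16 % and 0.50 %.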
2.3 Spectrum analysis
Spectra were processed using NMRPipe software (Delaglio et al., 1995); peak picking and spectrum analysis was performed using SPARKY software (Goddard and Kneller, 2008; Lee et al., 2015) as well as NMRDraw (Delaglio et al., 1995). Programs for visualization and analysis were written using freely available python libraries (Hunter, 2007; Harris et al., 2020) as well as NMR-specific python libraries (Helmus and Jaroniec, 2013).
Two types of complementary 4D NOE experiments were recorded: (1) 4D TROSY-NOESY-TROSY and (2) 4D NOESY-NOESY-TROSY (Fig. 1). While the former is very similar to the HMQC-NOESY-TROSY experiment used recently for a single α-helical domain with a long rotational correlation time (Barnes et al., 2019), the 4D NOESY-NOESY-TROSY experiment extends earlier work by Kaptein and co-workers (Boelens et al., 1989; Breg et al., 1990).
3.1 Recording and analysis of the 4D TROSY-NOESY-TROSY spectrum
The rotational correlation time of the C145A variant of Mpro (M) at 25 °C is ca. 27 ns, and consequently, transverse relaxation is rapid for both 15N and 1HN nuclei. For this reason, it proved beneficial to substitute a TROSY element for the HMQC or HSQC segment that was previously used for such measurements (Kay et al., 1990; Barnes et al., 2019). Even though the TROSY element only utilizes half of the amide 1HN magnetization present at the start of the pulse sequence, combining its 15N evolution with sensitivity-enhanced gradient selection during the subsequent t2 evolution period (Fig. 1a) limits the resulting loss, which becomes somewhat smaller still when taking the gain from the 15N Boltzmann magnetization into account (Pervushin et al., 1998). A 2D TROSY spectrum (Fig. S1) of this sample allowed identification of 261 backbone amide peaks out of 293 non-proline residues, suggesting the feasibility of implementing the TROSY version of the 4D NOESY experiment. Conformational exchange on a timescale that results in extensive line broadening and incomplete back-exchange of amides when the protein was purified in 1H2O are the primary causes of the absence of the ca. 30 amide signals.
The high quality and signal-to-noise ratio of the TROSY-HSQC spectrum (Fig. 2b) also supported the feasibility of implementing the TROSY version of the 4D NOESY experiment. Combined with the enhanced relaxation properties during t1 and t2 evolution of the TROSY-selected coherence, we found experimentally that spectral quality attainable for M with the 4D TROSY-NOESY-TROSY was better than with the HMQC-NOESY-TROSY version of the experiment, consistent with the previous report that the TROSY implementation improved both the sensitivity and resolution over the 4D HSQC-NOESY-HSQC (Xia et al., 2000). Figure 2 shows expanded regions of six (F1, F2) cross sections through the 4D spectrum, each orthogonal to the (F3, F4) frequencies of the six amide correlations that are highlighted in a section of the regular 2D 1H–15N TROSY-HSQC spectrum of Fig. 2b. A total of 231 peaks, out of the 261 peaks in the 2D TROSY spectrum, can be detected as (semi-)resolvable diagonal peaks in the projected 15N–1H (F3, F4) plane (data not shown). These numbers do not include the doubling of resonances associated with isomerization of P184.
The cross sections exemplify the power of 4D analysis for three types of secondary structure: α-helix (Fig. 2c), β-sheet (Fig. 2d), and a loop region (Fig. 2e). Due to the long NOE mixing time used in this experiment (200 ms), substantial spin diffusion occurs, which results in numerous NOE correlations for each amide. For example, α-helical residues L232 and M235 not only show NOE interactions with one another, but also share NOE cross peaks to V233 and A234, with M235 even showing a weak cross peak to N231. Such correlations are particularly useful for validating the assignments obtained from the limited number of triple-resonance backbone assignment experiments that are applicable to larger proteins such as Mpro.
The amides of L67 and Q69 in strand β4 only share a single NOE, to sequential residue V68, but they show valuable long-range NOEs to amide protons in strands β1 (C22) and β5 (L75). G195 and D197, located in the long loop that connects strand β13 to helix α6, have an NOE to one another as well as sequential NOEs but show no long-range interactions, consistent with the X-ray structure (Douangamath et al., 2020). However, NOEs from L67 or Q69 to T21 or Q19 are not observed, despite close proximity, due to the minimal back-exchange of amide protons in the β1 strand.
It is interesting to compare the diagonal peak intensities in these various cross sections of the TROSY-NOESY-TROSY spectrum. Diagonal intensity is a function of the amount of amide 1H z magnetization present at the start of the pulse sequence, i.e., it depends on the non-selective longitudinal relaxation time of the amide proton, but also on the attenuation of this magnetization during the NOE mixing time, in other words, on the selective longitudinal relaxation time which is dominated by J(0) spectral density terms. The latter dominate the differences in diagonal intensity seen in the various cross sections. For example, the helical amides of L232 and M235 rapidly lose their magnetization to their proximate sequential amide neighbors, separated by ca. 2.7 Å, that each are in close contact with other neighboring protons. By contrast, none of the L67, Q69, G195 and D197 amides are closer than 3.7 Å from any neighboring protonated amide in the 1.25 Å X-ray structure of Mpro (Douangamath et al., 2020), causing their diagonal intensities to remain high.
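The dependence of the diagonal intensity on the distance to the nearest protonated neighbor can be illustrated with a deliberately simplified model. The sketch below treats an isolated 1H–1H pair in the slow-tumbling limit, keeps only the J(0) term of the cross-relaxation rate, and ignores all other relaxation pathways; these are assumptions made for illustration, and the function names are hypothetical. The 27 ns correlation time, the 200 ms mixing time, and the 2.7 Å and 3.7 Å distances are taken from the text.

```python
import math

MU0_OVER_4PI = 1.0e-7        # T m / A
HBAR = 1.054571817e-34       # J s
GAMMA_H = 2.6752218744e8     # rad s^-1 T^-1

def cross_relaxation_rate(r_angstrom, tau_c_s):
    """Magnitude of the 1H-1H cross-relaxation rate (s^-1), J(0) term only."""
    r_m = r_angstrom * 1e-10
    prefactor = 0.1 * (MU0_OVER_4PI * HBAR * GAMMA_H ** 2) ** 2
    return prefactor * tau_c_s / r_m ** 6

def diagonal_fraction(r_angstrom, tau_c_s, t_mix_s):
    """Fraction of magnetization left on the starting spin of an isolated pair."""
    rate = cross_relaxation_rate(r_angstrom, tau_c_s)
    return 0.5 * (1.0 + math.exp(-2.0 * rate * t_mix_s))

tau_c = 27e-9    # rotational correlation time (from the text)
t_mix = 0.200    # NOE mixing time of the TROSY-NOESY-TROSY experiment

helix_like = diagonal_fraction(2.7, tau_c, t_mix)   # close sequential neighbor
isolated = diagonal_fraction(3.7, tau_c, t_mix)     # nearest neighbor farther away
print(f"2.7 A neighbor: {helix_like:.2f}, 3.7 A neighbor: {isolated:.2f}")
```

Even in this two-spin picture the amide with the closer neighbor retains noticeably less diagonal magnetization; in the protein, each helical amide has several such neighbors, which lowers its diagonal intensity further.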
3.2 Recording and analysis of the 4D NOESY-NOESY-TROSY spectrum
As highlighted by the work of Kaptein and co-workers, 3D NOE-NOE experiments provided an effective method for studying the 1H–1H cross-relaxation network in proteins in more detail. Here, we extend this powerful experiment to four dimensions, making it more straightforward to analyze such a spectrum while limiting the relaxation pathways by perdeuteration of the protein.
The pulse scheme of this 4D NOESY-NOESY-TROSY is shown in Fig. 1b. It represents a straightforward extension of the original NOE-NOE 3D experiment (Boelens et al., 1989) but with the detection period substituted by the gradient-enhanced 2D 1H–15N TROSY scheme (Pervushin et al., 1998). The latter enhances the attainable spectral resolution in the t3 and t4 dimensions, while dispersing the detected 1HN resonances in the 15N dimension. A number of minor technical considerations are also relevant in this respect. (1) First, in order to maximize the number of (t1, t2, t3) data points sampled, the phase cycling of the 4D experiment was reduced to four steps, and the observed spectral window was restricted to the region downfield of the H2O resonance. To prevent bleeding in of several weaker imperfectly deuterated aliphatic or exchangeable resonances present in the upfield spectral region, selective-EBURP2 and reverse-EBURP2 pulses (Geen and Freeman, 1991) were used to also restrict the regions where 1H resonances were excited to those resonating downfield from the water resonance. As a result, no NOE peaks from a few amide protons resonating near water or upfield from water were observed. (2) Recording of a 4D NMR spectrum at adequate resolution requires the use of non-uniform sampling. High quality NUS reconstruction of a 4D NMR spectrum can be accomplished by the SMILE program (Ying et al., 2017) but this as well as most other NUS reconstruction software performs better if the various time domains are acquired in a manner that results in either a 0° or a 180° linear phase correction across the spectrum. For this purpose, and to ensure that the non-suppressed axial peaks can be optimally reconstructed, which requires 0° linear phase correction, it was preferable to insert a non-selective 90°x–207°y–90°x composite 1H inversion pulse (highlighted as the green open bar in Fig. 1b), followed by a second such pulse that reverses any phase imperfections introduced by the first composite pulse (Hwang et al., 1997). Specifically, the φ1 phase cycling serves to eliminate axial peaks in the t1 dimension caused by pulse imperfection as well as T1 relaxation and amide exchange with solvent during Tm1, while also suppressing axial peaks in the t2 dimension resulting from T1 relaxation and water exchange during Tm2. To minimize the number of phase cycling steps, φ2 was not phase cycled. However, this resulted in small residual axial peaks along the F2 dimension caused by pulse imperfections. To ensure that these residual axial peaks were absorptive in the final spectrum, thereby simplifying SMILE NUS reconstruction, an echo is generated by the application of two composite 1H 180° pulses in order to suppress initial chemical shift evolution at t2 = 0, thereby eliminating the need for a linear phase correction. Considering that the real and imaginary components of the residual axial signals have the same amplitude, they result in a 45° phase error for the axial peaks in the F2 dimension. Shifting the φ2 phase by −45° ensures that the NOE and axial peaks both can be phased absorptive using the same phase correction, thus facilitating NUS processing.
Compared to the 4D TROSY-NOESY-TROSY pulse scheme, the 4D NOESY-NOESY-TROSY experiment avoids the lossy magnetization transfer step from 1H to 15N and back (leading to a slightly larger number of 241 diagonal peaks on the 15N–1H (F3, F4) projected plane, compared to 231 for TROSY-NOESY-TROSY). Instead, its magnetization is simply transferred, in part, to its nearest neighbors by cross-relaxation during the first NOE mixing period of duration Tm1=50 ms. There is virtually no loss in total spin polarization summed over the initial “starting spin”, whose t1 evolution is monitored, and those of its immediate neighbors that are within cross-relaxation contact. As a result, the intrinsic sensitivity of such NOESY-NOESY-TROSY measurements is quite high, allowing the choice of a long duration of 300 ms for the second NOE mixing time, Tm2. During this second, much longer mixing time, the z magnetization distributes over considerable distances due to indirect transfers (Fig. 3). Even in this extensively perdeuterated protein, NOEs to nearly a dozen neighboring protons are observed on the diagonals of the (F1, F2) cross sections, taken at the same (15N, 1H) frequencies used for illustrating the utility of the 4D TROSY-NOESY-TROSY spectrum of Fig. 2. However, as pointed out by Boelens et al. (1989) and Breg et al. (1990), the NOE-NOE combination offers a wealth of new information on the cross-relaxation pathways that led to the long-distance NOEs, substantially aiding both the assignment and analysis of distance information. Below, we briefly highlight a few examples.
As expected, α-helical residue L232 shows intense cross peaks to both of its sequential neighbors, N231 and V233, as well as a weaker cross peak to F230. Despite the relatively short mixing time of only 50 ms that separates t1 and t2 evolution, the latter must result mostly from indirect transfer through N231, because N231 and F230 share an intense cross peak. So in effect, each cross section through the 4D spectrum shown in Fig. 3 corresponds to a 2D NOESY spectrum of a small, localized, region within the protein structure – making its analysis far simpler. For residues with few neighbors, direct NOE contacts between neighbors separated by as much as 4.5 Å give rise to quite intense cross peaks after 50 ms NOE mixing, as exemplified by the contacts between G195 and its A194 and T196 neighbors (Fig. 3g). A weaker cross peak between G195 and D197, at an interproton distance of 6.4 Å, appears not to be mediated by spin diffusion because the G195 and D197 panels (Fig. 3g) show no common strong NOE to any visible resonance. However, the possibility that the hydroxyl proton of T196 serves as a relay partner cannot be excluded.
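The distinction between direct and relayed transfer discussed above can be made concrete with a small relaxation-matrix calculation in the spirit of Boelens et al. (1988). The sketch below is a toy model rather than an analysis of the actual spectra: the three proton coordinates are hypothetical (sequential distances of about 2.7 Å and an i to i+2 distance of about 4.2 Å), only the J(0) cross-relaxation term is kept, and leakage to other spins is ignored. The mixing times of 50 and 300 ms and the 27 ns correlation time are taken from the text.

```python
import numpy as np

MU0_OVER_4PI = 1.0e-7        # T m / A
HBAR = 1.054571817e-34       # J s
GAMMA_H = 2.6752218744e8     # rad s^-1 T^-1

def relaxation_matrix(coords_angstrom, tau_c_s):
    """Symmetric relaxation matrix for flip-flop (J(0)-only) cross-relaxation."""
    coords = np.asarray(coords_angstrom, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist_m = np.linalg.norm(diff, axis=-1) * 1e-10
    prefactor = 0.1 * (MU0_OVER_4PI * HBAR * GAMMA_H ** 2) ** 2
    n = len(coords)
    rates = np.zeros((n, n))
    off_diag = ~np.eye(n, dtype=bool)
    rates[off_diag] = prefactor * tau_c_s / dist_m[off_diag] ** 6
    return np.diag(rates.sum(axis=1)) - rates

def mixing_matrix(relax, t_mix_s):
    """exp(-R * t_mix), evaluated by eigendecomposition of the symmetric R."""
    evals, evecs = np.linalg.eigh(relax)
    return (evecs * np.exp(-evals * t_mix_s)) @ evecs.T

# Hypothetical amide proton positions (in angstrom) for residues i, i+1, i+2
coords = np.array([[0.0, 0.0, 0.0], [2.7, 0.0, 0.0], [3.3, 2.6, 0.0]])
tau_c, tm1, tm2 = 27e-9, 0.050, 0.300

relax = relaxation_matrix(coords, tau_c)
a1 = mixing_matrix(relax, tm1)
a2 = mixing_matrix(relax, tm2)

# Transfer from spin 0 to spin 2 during Tm1, with and without the relay spin 1
with_relay = a1[2, 0]
direct_only = mixing_matrix(relaxation_matrix(coords[[0, 2]], tau_c), tm1)[1, 0]

# NOE-NOE amplitude for the pathway 0 (t1) -> 1 (t2) -> 2 (detected)
noe_noe_path = a2[2, 1] * a1[1, 0]
print(f"direct only: {direct_only:.4f}, with relay: {with_relay:.4f}")
```

In this model the i to i+2 cross peak after 50 ms is substantially larger with the relay spin present than for the isolated pair, and the rows of the mixing matrix sum to one, reflecting conservation of the total polarization when leakage is neglected.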
The NOESY-NOESY-TROSY spectrum also shows multiple NOEs to sidechain amide protons that are not visible in the TROSY-NOESY-TROSY spectrum because the TROSY element does not select magnetization transfer for NH2 groups. For example, D197 shows long-range NOEs to the N133 carboxamide protons, whereas Q69 shows NOEs to both its own carboxamide protons and to those of Q74. The non-equivalent NH2 pairs are readily recognized by cross peak to diagonal peak intensity ratios that are close to one, owing to their short interproton distance.
The spectra shown in this study were recorded during the summer of 2020, when access to campus facilities was strongly restricted due to COVID-19 pandemic mitigation efforts. These restrictions allowed for much lengthier acquisition of spectra than commonly used, for a total of 8 d for the two 4D spectra. As a benefit of NUS reconstruction, it is possible to generate spectra of the same resolution recorded in any fraction of that time. Alternatively, we can discard the data recorded at the longest values of t1, t2, and t3. Indeed, when the same time-domain data sets are shortened using a previously described protocol that considers the total normalized length of the 3D (t1, t2, t3) time-domain vector (Ying et al., 2019), so that only one-third of the acquired time-domain data is retained, processing yields spectra that are very similar to the ones shown in Figs. 2 and 3, albeit at slightly lower resolution and signal to noise. Nevertheless, the quality of the resulting spectra remains excellent, with near-identical information content (Figs. S2 and S3 in the Supplement).
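One way to picture the truncation by normalized time-domain vector length is sketched below. This is an assumption-laden illustration rather than the protocol of Ying et al. (2019): the schedule is a freshly generated random one (not the deposited sampling list), the function name is hypothetical, and the selection rule simply keeps the sampled points whose normalized (t1, t2, t3) vectors are shortest.

```python
import numpy as np

def truncate_schedule(schedule, max_index, keep_fraction):
    """Keep the sampled points with the shortest normalized time vectors."""
    schedule = np.asarray(schedule)
    norm = np.linalg.norm(schedule / np.asarray(max_index, dtype=float), axis=1)
    n_keep = int(round(keep_fraction * len(schedule)))
    order = np.argsort(norm, kind="stable")
    return schedule[np.sort(order[:n_keep])]

grid = (90, 91, 90)              # complex points in the three indirect dimensions
rng = np.random.default_rng(0)   # arbitrary seed, unrelated to the TopSpin default

# Hypothetical unweighted random schedule with 3987 distinct grid points
flat = rng.choice(int(np.prod(grid)), size=3987, replace=False)
schedule = np.column_stack(np.unravel_index(flat, grid))

max_index = [n - 1 for n in grid]
subset = truncate_schedule(schedule, max_index, keep_fraction=1 / 3)
print(len(schedule), "->", len(subset), "sampled points")
```

Because the retained points lie closest to the time origin, they carry the strongest signal, which is consistent with the modest loss in signal to noise reported for the shortened data sets.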
Use of the lengthy data acquisition times needed to collect the 4D spectra requires a highly stable sample, which in our case benefited from the C145A active site mutation, protecting the sample from auto-proteolysis. As with all NMR experiments, the signal-to-noise ratio is approximately proportional to sample concentration. Therefore, working at high concentrations benefits the signal-to-noise ratio of these experiments that involve multiple magnetization transfer steps, an issue that is particularly important for NOE experiments where magnetization from a single nucleus is distributed over many neighbors.
We note that the TROSY-NOESY-TROSY experiment used a long NOE mixing time of 200 ms, such as to increase the number of observed connectivities by adding indirect NOE effects, including spin diffusion through hydroxyl protons (Koharudin et al., 2003), thereby aiding the assignment process. The use of a 50 ms NOE mixing period in the subsequent 4D NOESY-NOESY-TROSY experiment then provided a semi-quantitative measure of distance between these protons and their neighbors. Indeed, as pointed out by Kaptein and co-workers, recording of NOE-NOE spectra provides important experimental data on the pathway of magnetization transfer during NOE mixing. Such information could be used to convert these data into more quantitative distance information than the typical qualitative analysis of NOE intensities, potentially leading to the generation of higher resolution structures (Vogeli et al., 2009, 2012). Quantitative NOE interpretation traditionally relied on the recording of a series of NOE buildup data, which can become as time-consuming as the recording of 4D NMR spectra if resonance overlap is a limiting factor, as typically is the case for NOE spectra. This problem is further exacerbated by the spectral crowding of large proteins, particularly in the 1H dimension, and while 3D spectra may give higher signal-to-noise ratios than 4D spectra, downstream analysis frequently requires extensive disambiguation of overlapped peaks. Our study of M shows that a large number of semi-quantitative NOE distances become accessible by recording of 4D NMR spectra on a perdeuterated larger protein with little or no ambiguity about the nuclei involved.
While the high signal to noise and spectral simplicity of working with perdeuterated proteins have long been recognized (Torchia et al., 1988; Lemaster and Richards, 1988; Grzesiek et al., 1993; Tugarinov et al., 2004), the number of structural restraints accessible used to be small. Our present study demonstrates that a much larger number of NOE interactions becomes available by the recording of 4D NOE spectra. Moreover, it highlights the exquisite detail and value of NOE-NOE interaction analysis explored by the Kaptein group and it demonstrates that this approach is highly suitable for the larger biomolecules and biomolecular complexes being explored today, in particular when using extensive perdeuteration. Therefore, we believe that the recording of high quality 4D NMR spectra of the type presented in this study is entirely practical and invaluable for the structural and functional analysis of large proteins and their complexes, with possible extension to the study of nucleic acids. We note, however, that in the absence of extensive deuteration the dilution of nuclear magnetization over sidechain resonances will strongly lower the sensitivity of the experiment, which is further exacerbated by decreased effectiveness of TROSY-based line narrowing in such samples. On the other hand, adaptations of the NOESY-NOESY-TROSY experiment to methyl-protonated but otherwise perdeuterated proteins (Tugarinov et al., 2005b) are expected to be readily feasible.
The raw Bruker NMR data sets including the acquisition parameters and NUS sampling lists, pulse programs, include file, and NMRPipe processing scripts are available for download from Zenodo: https://zenodo.org/record/4625615 (Robertson et al., 2021).
The supplement related to this article is available online at: https://doi.org/10.5194/mr-2-129-2021-supplement.
AJR expressed and purified protein samples, collected and analyzed the data, and edited the manuscript; JY optimized pulse sequence parameterization and processing and edited the manuscript; AB supervised the project and wrote the manuscript.
The authors declare that they have no conflict of interest.
This article is part of the special issue “Robert Kaptein Festschrift”. It is not associated with a conference.
We thank John M. Louis, Joseph Courtney, Yang Shen, James L. Baber and Dennis A. Torchia for helpful discussions. This work was supported by the Intramural Research Program of the NIDDK and by the Intramural Antiviral Target Program of the Office of the Director, NIH.
This study is dedicated to Robert Kaptein on the occasion of his 80th birthday.
This research has been supported by the National Institute of Diabetes and Digestive and Kidney Diseases (grant nos. DK075023 and DK029046).
This paper was edited by Isabella Felli and reviewed by four anonymous referees.
Altincekic, N.: Large-scale recombinant production of the SARS-CoV-2 proteome for high-throughput and structural biology applications, Frontiers in Molecular Biosciences, accepted, available at: https://covid19-nmr.de/publications/large-scale-recombinant-production-of-the-sars-cov-2-proteome-for-high-throughput-and-structural-biology-applications/, last access: 12 April 2021.
Barnes, C. A., Shen, Y., Ying, J. F., Takagi, Y., Torchia, D. A., Sellers, J. R., and Bax, A.: Remarkable Rigidity of the Single alpha-Helical Domain of Myosin-VI As Revealed by NMR Spectroscopy, J. Am. Chem. Soc., 141, 9004–9017, https://doi.org/10.1021/jacs.9b03116, 2019.
Boelens, R., Koning, T. M. G., and Kaptein, R.: Determination of Biomolecular Structures From Proton-Proton NOEs Using a Relaxation Matrix Approach, J. Mol. Struct., 173, 299–311, 1988.
Boelens, R., Vuister, G. W., Koning, T. M. G., and Kaptein, R.: Observation of spin diffusion in biomolecules by 3-dimensional NOE-NOE spectroscopy, J. Am. Chem. Soc., 111, 8525–8526, https://doi.org/10.1021/ja00204a039, 1989.
Boras, B., Jones, R. M., Anson, B. J., Arenson, D., Aschenbrenner, L., Bakowski, M. A., Beutler, N., Binder, J., Chen, E., Eng, H., Hammond, H., Hammond, J., Haupt, R. E., Hoffman, R., Kadar, E. P., Kania, R., Kimoto, E., Kirkpatrick, M. G., Lanyon, L., Lendy, E. K., Lillis, J. R., Logue, J., Luthra, S. A., Ma, C., Mason, S. W., McGrath, M. E., Noell, S., Obach, R. S., O'Brien, M. N., O'Connor, R., Ogilvie, K., Owen, D., Pettersson, M., Reese, M. R., Rogers, T. F., Rossulek, M. I., Sathish, J. G., Shirai, N., Steppan, C., Ticehurst, M., Updyke, L. W., Weston, S., Zhu, Y., Wang, J., Chatterjee, A. K., Mesecar, A. D., Frieman, M. B., Anderson, A. S., and Allerton, C.: Discovery of a Novel Inhibitor of Coronavirus 3CL Protease for the Potential Treatment of COVID-19, bioRxiv, 2020.09.12.293498, https://doi.org/10.1101/2020.09.12.293498, 2021.
Breg, J. N., Boelens, R., Vuister, G. W., and Kaptein, R.: 3D NOE-NOE spectroscopy of proteins – Observation of sequential 3D NOE cross peaks in Arc repressor, J. Magn. Reson., 87, 646–651, https://doi.org/10.1016/0022-2364(90)90324-3, 1990.
Clore, G. M., Kay, L. E., Bax, A., and Gronenborn, A. M.: Four-dimensional 13C/13C-edited nuclear Overhauser enhancement spectroscopy of a protein in solution: Application to interleukin 1β, Biochemistry, 30, 12–18, 1991.
Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A.: NMRPipe: a multidimensional spectral processing system based on Unix pipes, J. Biomol. NMR, 6, 277–293, 1995.
Diercks, T., Truffault, V., Coles, M., and Millett, O.: Diagonal-Free 3D/4D HN,HN-TROSY-NOESY-TROSY, J. Am. Chem. Soc., 132, 2138–2139, https://doi.org/10.1021/ja910523q, 2010.
Douangamath, A., Fearon, D., Gehrtz, P., Krojer, T., Lukacik, P., Owen, C. D., Resnick, E., Strain-Damerell, C., Aimon, A., Abranyi-Balogh, P., Brandao-Neto, J., Carbery, A., Davison, G., Dias, A., Downes, T. D., Dunnett, L., Fairhead, M., Firth, J. D., Jones, S. P., Keeley, A., Keseru, G. M., Klein, H. F., Martin, M. P., Noble, M. E. M., O'Brien, P., Powell, A., Reddi, R. N., Skyner, R., Snee, M., Waring, M. J., Wild, C., London, N., von Delft, F., and Walsh, M. A.: Crystallographic and electrophilic fragment screening of the SARS-CoV-2 main protease, Nat. Commun., 11, 5047, https://doi.org/10.1038/s41467-020-18709-w, 2020.
Geen, H. and Freeman, R.: Band-selective radiofrequency pulses, J. Magn. Reson., 93, 93–141, 1991.
Goddard, T. D. and Kneller, D. G.: Sparky 3, University of California, San Francisco, USA, 2008.
Grzesiek, S. and Bax, A.: The Importance of Not Saturating H2O in Protein NMR. Application to Sensitivity Enhancement and NOE Measurement, J. Am. Chem. Soc., 115, 12593–12594, 1993.
Grzesiek, S., Anglister, J., Ren, H., and Bax, A.: 13C line narrowing by 2H decoupling in 2H/13C/15N-enriched Proteins. Application to triple resonance 4D connectivity of sequential amides, J. Am. Chem. Soc., 115, 4369–4370, 1993.
Grzesiek, S., Wingfield, P., Stahl, S., Kaufman, J. D., and Bax, A.: Four-dimensional 15N-separated NOESY of slowly tumbling perdeuterated 15N-enriched proteins. Applications to HIV-1 Nef, J. Am. Chem. Soc., 117, 9594–9595, 1995.
Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., del Rio, J. F., Wiebe, M., Peterson, P., Gerard-Marchant, P., Sheppard, K., Reddy, T., Weckesser, W., Abbasi, H., Gohlke, C., and Oliphant, T. E.: Array programming with NumPy, Nature, 585, 357–362, https://doi.org/10.1038/s41586-020-2649-2, 2020.
Helmus, J. J. and Jaroniec, C. P.: Nmrglue: an open source Python package for the analysis of multidimensional NMR data, J. Biomol. NMR, 55, 355–367, https://doi.org/10.1007/s10858-013-9718-x, 2013.
Hunter, J. D.: Matplotlib: A 2D graphics environment, Comput. Sci. Eng., 9, 90–95, https://doi.org/10.1109/mcse.2007.55, 2007.
Hwang, T. L., van Zijl, P. C. M., and Garwood, M.: Broadband adiabatic refocusing without phase distortion, J. Magn. Reson., 124, 250–254, 1997.
Ikura, M., Kay, L. E., and Bax, A.: A novel approach for sequential assignment of 1H, 13C, and 15N spectra of larger proteins: heteronuclear triple-resonance three-dimensional NMR spectroscopy. application to calmodulin, Biochemistry, 29, 4659–4667, 1990.
Kay, L. E., Clore, G. M., Bax, A., and Gronenborn, A. M.: Four-dimensional heteronuclear triple-resonance NMR spectroscopy of interleukin-1β in solution, Science, 249, 411–414, 1990.
Kay, L. E., Keifer, P., and Saarinen, T.: Pure Absorption Gradient Enhanced Heteronuclear Single Quantum Correlation Spectroscopy with Improved Sensitivity, J. Am. Chem. Soc., 114, 10663–10665, 1992.
Koharudin, L. M. I., Bonvin, A., Kaptein, R., and Boelens, R.: Use of very long-distance NOEs in a fully deuterated protein: an approach for rapid protein fold determination, J. Magn. Reson., 163, 228–235, https://doi.org/10.1016/s1090-7807(03)00149-6, 2003.
Lee, W., Tonelli, M., and Markley, J. L.: NMRFAM-SPARKY: enhanced software for biomolecular NMR spectroscopy, Bioinformatics, 31, 1325–1327, https://doi.org/10.1093/bioinformatics/btu830, 2015.
Lemaster, D. M. and Richards, F. M.: NMR sequential assignment of Escherichia-coli thioredoxin utilizing random fractional deuteration, Biochemistry, 27, 142–150, https://doi.org/10.1021/bi00401a022, 1988.
Marion, D., Driscoll, P. C., Kay, L. E., Wingfield, P. T., Bax, A., Gronenborn, A. M., and Clore, G. M.: Overcoming the overlap problem in the assignment of 1H NMR spectra of larger proteins by use of three-dimensional heteronuclear 1H-15N Hartmann-Hahn-multiple quantum coherence and nuclear Overhauser-multiple quantum coherence spectroscopy: application to interleukin 1β, Biochemistry, 28, 6150–6156, 1989a.
Marion, D., Kay, L. E., Sparks, S. W., Torchia, D. A., and Bax, A.: Three-dimensional heteronuclear NMR of 15N-labeled proteins, J. Am. Chem. Soc., 111, 1515–1517, 1989b.
Oschkinat, H., Griesinger, C., Kraulis, P. J., Sørensen, O. W., Ernst, R. R., Gronenborn, A. M., and Clore, G. M.: 3-Dimensional NMR spectroscopy of a protein in solution, Nature, 332, 374–376, 1988.
Pervushin, K., Riek, R., Wider, G., and Wüthrich, K.: Attenuated T2 relaxation by mutual cancellation of dipole-dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution, Proc. Natl. Acad. Sci. USA, 94, 12366–12371, 1997.
Pervushin, K. V., Wider, G., and Wüthrich, K.: Single transition-to-single transition polarization transfer (ST2-PT) in [15N,1H]-TROSY, J. Biomol. NMR, 12, 345–348, 1998.
Robertson, A., Ying, J., and Bax, A.: Time domain data, acquisition parameters, and processing scripts of 4D NMR data, Zenodo, available at: https://zenodo.org/record/4625615, last access: 9 April 2021.
Rovnyak, D., Frueh, D. P., Sastry, M., Sun, Z. Y. J., Stern, A. S., Hoch, J. C., and Wagner, G.: Accelerated acquisition of high resolution triple-resonance spectra using non-uniform sampling and maximum entropy reconstruction, J. Magn. Reson., 170, 15–21, https://doi.org/10.1016/j.jmr.2004.05.016, 2004.
Torchia, D. A., Sparks, S. W., and Bax, A.: Delineation of Alpha-Helical Domains in Deuteriated Staphylococcal Nuclease by 2D NOE NMR Spectroscopy, J. Am. Chem. Soc., 110, 2320–2321, 1988.
Tugarinov, V., Muhandiram, R., Ayed, A., and Kay, L. E.: Four-dimensional NMR spectroscopy of a 723-residue protein: Chemical shift assignments and secondary structure of malate synthase G, J. Am. Chem. Soc., 124, 10025–10035, 2002.
Tugarinov, V., Hwang, P. M., and Kay, L. E.: Nuclear magnetic resonance spectroscopy of high-molecular-weight proteins, Annu. Rev. Biochem., 73, 107–146, 2004.
Tugarinov, V., Choy, W. Y., Orekhov, V. Y., and Kay, L. E.: Solution NMR-derived global fold of a monomeric 82-kDa enzyme, Proc. Natl. Acad. Sci. USA, 102, 622–627, 2005a.
Tugarinov, V., Kay, L. E., Ibraghimov, I., and Orekhov, V. Y.: High-resolution four-dimensional 1H-13C NOE spectroscopy using methyl-TROSY, sparse data acquisition, and multidimensional decomposition, J. Am. Chem. Soc., 127, 2767–2775, https://doi.org/10.1021/ja044032o, 2005b.
Vögeli, B., Segawa, T. F., Leitz, D., Sobol, A., Choutko, A., Trzesniak, D., van Gunsteren, W., and Riek, R.: Exact Distances and Internal Dynamics of Perdeuterated Ubiquitin from NOE Buildups, J. Am. Chem. Soc., 131, 17215–17225, https://doi.org/10.1021/ja905366h, 2009.
Vögeli, B., Kazemi, S., Güntert, P., and Riek, R.: Spatial elucidation of motion in proteins by ensemble-based structure calculation using exact NOEs, Nat. Struct. Mol. Biol., 19, 1053–1058, https://doi.org/10.1038/nsmb.2355, 2012.
Vuister, G. W., Boelens, R., and Kaptein, R.: Non-selective 3-dimensional NMR spectroscopy – The 3D NOE-HOHAHA experiment, J. Magn. Reson., 80, 176–185, 1988.
Wagner, G.: Prospects for NMR of large proteins, J. Biomol. NMR, 3, 375–385, 1993.
Wüthrich, K.: NMR of Proteins and Nucleic Acids, John Wiley & Sons, New York, USA, 1986.
Xia, Y. L., Sze, K. H., and Zhu, G.: Transverse relaxation optimized 3D and 4D 15N/15N separated NOESY experiments of 15N labeled proteins, J. Biomol. NMR, 18, 261–268, https://doi.org/10.1023/a:1026590201757, 2000.
Ying, J., Delaglio, F., Torchia, D. A., and Bax, A.: Sparse multidimensional iterative lineshape-enhanced (SMILE) reconstruction of both non-uniformly sampled and conventional NMR data, J. Biomol. NMR, 68, 101–118, 2017.
Ying, J. F., Barnes, C. A., Louis, J. M., and Bax, A.: Importance of time-ordered non-uniform sampling of multidimensional NMR spectra of Aβ(1-42) peptide under aggregating conditions, J. Biomol. NMR, 73, 429–441, https://doi.org/10.1007/s10858-019-00235-7, 2019.
Zuiderweg, E. R. P. and Fesik, S. W.: Heteronuclear Three-Dimensional NMR Spectroscopy of the Inflammatory Protein C5a, Biochemistry, 28, 2387–2391, 1989.
Zuiderweg, E. R. P., Petros, A. M., Fesik, S. W., and Olejniczak, E. T.: 4-Dimensional [13C,1H,13C,1H] HMQC-NOE-HMQC NMR Spectroscopy – Resolving Tertiary NOE Distance Constraints in the Spectra of Larger Proteins, J. Am. Chem. Soc., 113, 370–372, https://doi.org/10.1021/ja00001a060, 1991.