Articles | Volume 2, issue 1
Magn. Reson., 2, 129–138, 2021

Special issue: Robert Kaptein Festschrift

Research article 13 Apr 2021

Four-dimensional NOE-NOE spectroscopy of SARS-CoV-2 Main Protease to facilitate resonance assignment and structural analysis

Angus J. Robertson, Jinfa Ying, and Ad Bax
  • Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892, USA

Correspondence: Ad Bax (


Resonance assignment and structural studies of larger proteins by nuclear magnetic resonance (NMR) can be challenging when exchange broadening, multiple stable conformations, and 1H back-exchange of the fully deuterated chain pose problems. These difficulties arise for the SARS-CoV-2 Main Protease, a homodimer of 2 × 306 residues. We demonstrate that the combination of four-dimensional (4D) TROSY-NOESY-TROSY spectroscopy and 4D NOESY-NOESY-TROSY spectroscopy provides an effective tool for delineating the 1H–1H dipolar relaxation network. In combination with detailed structural information obtained from prior X-ray crystallography work, such data are particularly useful for extending and validating resonance assignments as well as for probing structural features.

1 Introduction

The extension of conventional two-dimensional 1H–1H NMR spectroscopy of natural proteins (Wüthrich, 1986) to three-dimensional (3D) homonuclear NMR experiments offered the ability to simplify spectral analysis by removing resonance overlap (Vuister et al., 1988; Oschkinat et al., 1988) and by providing access to a direct, more detailed analysis of 1H–1H dipolar cross-relaxation networks. In particular, the homonuclear 3D NOE-NOE experiment (Boelens et al., 1989; Breg et al., 1990) not only decreased resonance overlap, it also directly elucidated spin-diffusion pathways. This information complemented and validated the elegant relaxation matrix analysis of spin diffusion (Boelens et al., 1988).

Such homonuclear 1H 3D experiments and analysis strategies were soon followed by a myriad of heteronuclear 3D experiments that required isotopic enrichment and therefore cloning and bacterial overexpression (Marion et al., 1989b; Zuiderweg and Fesik, 1989; Ikura et al., 1990; Marion et al., 1989a; Wagner, 1993). Most of these heteronuclear experiments simply served to disperse the regular 1H–1H 2D spectrum into a third dimension, thereby removing spectral overlap but providing little or no new information on the all-important 1H–1H spin-diffusion pathways. The 3D NOESY-HMQC experiment (Marion et al., 1989b; Zuiderweg and Fesik, 1989) subsequently was extended to four dimensions (4D), thereby dispersing the conventional 2D 1H–1H NOESY experiment into two additional dimensions that correspond to the chemical shifts of the nuclei to which each of the protons is covalently bound (Kay et al., 1990; Clore et al., 1991; Zuiderweg et al., 1991).

These multi-dimensional experiments provided a tremendous degree of spectral simplification, in particular after appropriate analysis software became available. However, it also quickly became clear that extension to large, slowly tumbling proteins was hampered by low signal to noise, caused by the relative inefficiency of the magnetization transfer steps when the dimensionality of a spectrum is increased. This decrease in sensitivity was remedied by generating the protein in a highly perdeuterated state while keeping the solvent-exchangeable backbone amide protons protonated (Torchia et al., 1988; Lemaster and Richards, 1988). Combining the perdeuteration approach with both the triple-resonance assignment strategy (Grzesiek et al., 1993) and the subsequently introduced powerful TROSY line-narrowing method (Pervushin et al., 1997) made it possible to assign and analyze the structure of quite large proteins, as exemplified by the 723-residue protein malate synthase G (Tugarinov et al., 2002, 2005a). The sensitivity gained by perdeuteration, enabling the recording of 4D 15N-separated NOE spectra, also was key in solving the structure of an HIV-1 accessory protein that had been too challenging for analysis by more conventional methods (Grzesiek et al., 1995). Zhu and co-workers introduced a TROSY-NOESY-TROSY version of Grzesiek's 4D NOESY experiment which, illustrated for a partially deuterated 27 kDa protein, yielded further improved sensitivity and intrinsic 1H and 15N line widths (Xia et al., 2000). Their implementation relied on eight-step phase cycling, which limited digitization of the time-domain data and made it difficult to exploit the improved TROSY relaxation in the indirect dimensions. Diercks et al. (2010) introduced an elegant method to suppress diagonal signals from a 4D TROSY-NOESY-TROSY spectrum, a particularly useful feature when spectral resolution is limited. However, we opted not to use this implementation because, at the long mixing times used, the diagonal resonances serve as convenient reference anchors during analysis and are sufficiently attenuated that they do not obscure nearby peaks in our 4D spectra, which have very high digital resolution.

In the present report, we merge the above-mentioned prior advances, 3D NOE-NOE and 3D 15N-separated NOESY, into a 4D experiment. In combination with extensive perdeuteration and gradient-enhanced encoding to enable a four-step phase cycle as well as non-uniform sampling (NUS) (Rovnyak et al., 2004), the experiments take better advantage of the improved resolution afforded by 4D NMR. We demonstrate the utility of the experiments by applying them to the study of the main protease (Mpro) of SARS-CoV-2, the virus responsible for coronavirus disease 2019 (COVID-19). Mpro, also known as 3CLpro or Nsp5, is a homodimeric cysteine protease of 2 × 306 residues that does not have closely related mammalian homologues and is therefore an intensely pursued target for drug development, with a promising inhibitor now entered in a phase I clinical trial (Boras et al., 2021). Its NMR analysis is challenging, not only because of its large size (67.6 kDa), but also because of the presence of a minor conformer associated with the cis-isomer of one of its 13 Pro residues (P184), the difficulty in back-exchanging all backbone amide protons when the protein is expressed in 2H2O, and the presence of intermediate timescale motions that lead to exchange broadening in the vicinity of the protein's active site. Here we focus on an enzyme variant where the catalytic Cys145 residue has been mutated to Ala (MC145Apro), a construct that is stable for multiple weeks at the high concentrations required for NMR spectroscopy. The assignment process and a full structural analysis of the protein will be presented elsewhere. The focus of the present work is on technical innovations, including recording two types of 4D NOE-based NMR spectra that proved invaluable both for the validation of resonance assignments and for the subsequent structural analysis.

2 Methods and experiments

2.1 Protein production

The gene encoding a C145A variant of Mpro (MC145Apro) with an N-terminal affinity (and solubility) tag was synthesized by GenScript (USA) and then cloned into a pET24a+ plasmid between BamHI and XhoI restriction sites. The construct encoded a fusion protein consisting, in order, of a 6His tag, GB1, an SG-rich linker, a TEV cleavage site, and MC145Apro, and the protein was purified according to methods collectively developed by the COVID-19 NMR consortium (Altincekic, 2021). In brief, following cell culture and harvesting (see Supplement), the cell lysate was passed down a 6His-affinity column (IMAC) and eluted in a small volume; the solubility tag was cleaved off to generate a native N-terminus; the reaction mix was then passed through an IMAC column to remove uncleaved protein before size separation on a Sephadex G75 column. The resulting, extensively perdeuterated, 2H,15N-MC145Apro homodimer was used for experiments and, throughout the recombinant protein expression, extensive care was taken to achieve a high ( 98 %) level of deuteration of the non-exchangeable hydrogens. For more details, see the Supplement.

2.2 Recording of NMR data

Spectra were acquired on a sample containing 1.8 mM (0.9 mM dimer) 2H,15N-MC145Apro in 10 mM sodium phosphate, pH 7.0, 0.5 mM TCEP, 3 % v/v 2H2O and 0.3 mM sodium trimethylsilylpropanesulfonate (DSS; as an internal chemical shift reference), in a 300 µL Shigemi microcell. All experiments were recorded at 25 °C on an 800 MHz Bruker Avance III spectrometer, equipped with a 5 mm TCI probe containing a triple-axis gradient accessory, and running TopSpin software version 3.1.

Figure 1. Pulse schemes for four-dimensional (a) TROSY-NOESY-TROSY and (b) NOESY-NOESY-TROSY experiments. The filled and open rectangular bars on the 1H and 15N channels represent 90° and 180° pulses, respectively. The open bars in green represent composite 180° pulses consisting of 90°x207°y90°x for 1H and 90°x216°y90°x for 15N. The filled shaped 1H pulses correspond to selective EBURP2 (labeled e) and time-reversed EBURP2 (labeled e*) pulses (Geen and Freeman, 1991), while the open 1H-shaped pulses represent 90° water-flipback pulses (center lobe of a sinc profile, 1.1 ms duration at 800 MHz) (Grzesiek and Bax, 1993). The wide filled rectangular boxes denote 90° water-flipback pulses (also 1.1 ms duration at 800 MHz). Unless indicated otherwise, all pulses were applied along x. The following delays were used for the initial INEPT and TROSY transfers: δ = 2.1 ms, δ1 = 2.1 ms, δ2 = 2.5 ms. The 1H chemical shift evolution during the delay τ = 0.181 ms was compensated by offsetting the last pair of 1H and 15N 180° pulses by ε = τ/2 to avoid the linear phase error in the t4 dimension. (a) For the TROSY-NOESY-TROSY: NOE mixing time Tm = 200 ms; phase cycling: φ1= y, φ2=x, x, x, x, φ3=y, y, φ4=y, φ5=y, φ6=y, φ7=y, φrec=y, y, y, y; gradients were sine-bell or rectangular shaped (as depicted in the figure) with durations G1,2,3,4,5,6,7,8,9,10,11,12,13,14,15 = 0.977, 1.2, 0.4, 0.4, 0.986, 0.977, 0.081, 1.7, 0.977, 1.2, 0.4, 0.4, 0.986, 0.977, and 0.081 ms, z strengths of 21.7, 28.7, 25.9, 32.9, 2.1, 25.9, 29.4, 30.8, 21.7, 28.7, 31.5, 38.5, 2.1, 25.9, and 35.0 G cm−1, and additional x and y strengths of 22.5, 27.5, and 25.0 G cm−1 for G11, G12, and G15, respectively. The duration of decoding pulses G7 (G15) was empirically optimized for maximum signal and can differ from the theoretical value derived from the gyromagnetic ratios of 15N and 1H and the encoding pulses G3+G4 (G11+G12) by several microseconds due to rise and fall times of short gradient pulses. Quadrature detection in t3 (t1) was achieved using the Echo-AntiEcho scheme (Kay et al., 1992) by inverting the encoding gradients G11 and G12 (G3 and G4) together with φ6 and φ7 (φ4 and φ5) to obtain the second FID for every t3 (t1) increment. The t2 dimension was acquired using States-TPPI by incrementing φ2 by 90°. (b) For the NOESY-NOESY-TROSY, the selective EBURP2 and time-reversed EBURP2 pulses have a duration of 1.0 ms at 800 MHz, centered at 8.3 ppm, thereby exciting the amide protons downfield from the water resonance without perturbing any upfield exchangeable protons, or residual aliphatic protons resulting from imperfect perdeuteration. NOE mixing times, Tm1 = 50 ms; Tm2 = 300 ms. Phase cycling: φ1=x, x, x, x, φ2=x-π/4, φ3=y, y, φ4=y, φ5=y, φrec=y, y, y, y. Gradients were sine-bell or rectangular shaped with durations G1,2,3,4,5,6,7,8,9 = 1.7, 1.2, 0.977, 1.2, 0.4, 0.4, 0.986, 0.977, and 0.081 ms, z strengths of 20.3, 30.8, 21.7, 28.7, 31.5, 38.5, 2.1, 25.9, and 35.0 G cm−1, and additional x and y strengths of 22.5, 27.5 and 25.0 G cm−1 for G5, G6, and G9, respectively. The duration of G9 was empirically optimized for maximum signal. Quadrature detection in t3 was achieved using the Echo-AntiEcho scheme by inverting the encoding gradients G5 and G6 together with φ4 and φ5 to obtain the second FID for every t3 increment, while States-TPPI was used to obtain quadrature in the t1 dimension by incrementing the φ1 pulse phase by 90° and for t2 by incrementing φ2 by 90°. Pulse sequence code and parameter files can be downloaded from


Considering that the measurements are carried out for perdeuterated protein, the spectral windows in the indirect 1H dimensions were limited to the region downfield from the water resonance, and EBURP2 pulses (Geen and Freeman, 1991) were used where needed to prevent excitation of protons outside of this region. For the 4D TROSY-NOESY-TROSY experiment (Fig. 1a), only a small fraction (0.54 %) of the full data matrix, consisting of 1536* (1H, t4, 119.8 ms) × 90* (15N, t3, 35.1 ms) × 91* (1H, t2, 20.0 ms) × 90* (15N, t1, 35.1 ms) complex points, was recorded (Table S1 in the Supplement) using nonuniform sampling for the indirect dimensions. Selection of the sampled data points was unweighted, randomly distributed, and without time ordering. The sampling schedule (included in the uploaded data sets; see Data availability) was automatically generated by Topspin 3.1 when the experiment started, with the random seed left to its default value (54 321) and the T2 for each indirect dimension set to 1 s to avoid significant weighting of the sampling schedule. Specifically, a total of 31896 FIDs along the directly detected t4 dimension were recorded. Using four scans per FID for phase cycling and an interscan delay of 2.07 s, the total measurement time was approximately 88 h, but 3-fold shorter would have sufficed (see Concluding remarks).
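The unweighted random selection described above can be sketched in a few lines of Python. This is an illustration only and not the TopSpin schedule generator: the function name `random_nus_schedule` is hypothetical, Python's own random number generator is used (so the resulting points will differ from the schedule in the deposited data even with the same seed value), and it is assumed that each sampled complex grid point requires 2^3 = 8 FIDs, one per real/imaginary or echo/anti-echo combination of the three indirect dimensions.

```python
import random

def random_nus_schedule(shape, n_points, seed=54321):
    """Draw n_points distinct indirect-grid points, uniformly and unweighted.

    Hypothetical helper; not the algorithm used by the spectrometer software.
    """
    n1, n2, n3 = shape
    rng = random.Random(seed)
    # Sampling without replacement guarantees that no grid point repeats.
    flat = rng.sample(range(n1 * n2 * n3), n_points)
    # Convert each flat index into (t1, t2, t3) increment indices.
    return [(f // (n2 * n3), (f // n3) % n2, f % n3) for f in flat]

# Indirect grid of the 4D TROSY-NOESY-TROSY: t1 (15N), t2 (1H), t3 (15N).
shape = (90, 91, 90)
n_fids = 31896                 # FIDs recorded along t4 (from the text)
n_points = n_fids // 2**3      # assumed: 8 FIDs per complex grid point
schedule = random_nus_schedule(shape, n_points)
sparsity_percent = 100.0 * n_points / (shape[0] * shape[1] * shape[2])
print(n_points, f"{sparsity_percent:.2f} %")
```

Under these assumptions, the 31896 FIDs correspond to 3987 sampled grid points, which reproduces the quoted 0.54 % sampling fraction.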

Nonstandard processing was needed for the TROSY-NOESY-TROSY experiment because the spectrum was recorded with sensitivity-enhanced gradient selection in the 15N t1 evolution period that preceded the NOE mixing (Xia et al., 2000). Specifically, the 4D NUS data set was first sorted and expanded according to the sampling schedule using the nusExpand.tcl script within the NMRPipe software package (Delaglio et al., 1995). The expanded data were then converted to the NMRPipe format, with the quadrature mode for t3 set to Echo-AntiEcho, while the quadrature mode for t1 was temporarily set to Complex. After the conversion, the 4D matrix had to be transposed to enable use of the NMRPipe macro bruk_ranceA.M to correctly reshuffle the data, turning the phase-modulated t1 dimension into conventional amplitude-modulated data prior to processing as regular, complex data. This transposition was accomplished by reading in the NMRPipe-formatted matrix with the z axis along the t2 dimension, applying the macro, and restoring the data to their original axis order prior to regular processing. The full script is included with the raw time-domain data sets (see Data availability). For the processing, the direct dimension was apodized with a squared, shifted sine-bell window, spanning from 72° to 176.4°, and an additional 15 Hz exponential line broadening was used to better match the apodization window to the natural decay of the signal, thereby improving the signal-to-noise ratio (S/N). This was followed by zero filling and Fourier transformation. Subsequently, the indirect data points that were not experimentally sampled were reconstructed using the SMILE (Ying et al., 2017) program, and the reconstructed data were further processed in NMRPipe. To enhance the spectral resolution, by default the acquisition times in all indirect dimensions were extended by 50 % during the SMILE reconstruction, leading to an effective sampling sparsity of 0.16 %. The data matrix for the final reconstructed 4D spectrum consists of 614 (1H, F4, 6.3 Hz per point) × 512 (15N, F3, 5.0 Hz per point) × 512 (1H, F2, 8.9 Hz per point) × 512 (15N, F1, 5.0 Hz per point) real points (see Table S1).
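The idea behind turning phase-modulated (Echo-AntiEcho) data into amplitude-modulated data can be illustrated with a minimal sketch. It assumes idealized echo and anti-echo signals proportional to exp(+iΩt) and exp(−iΩt); the actual sign, scaling and 90° phase conventions applied by the bruk_ranceA.M macro may differ, and the function name, frequency and dwell time below are hypothetical illustration values.

```python
import cmath
import math

def echo_antiecho_to_states(echo, antiecho):
    """Combine phase-modulated pairs into cosine- and sine-modulated parts.

    Assumes echo ~ exp(+i*omega*t) and antiecho ~ exp(-i*omega*t); real
    spectrometer data may need extra sign or 90-degree phase adjustments.
    """
    cos_part = [(e + a) / 2 for e, a in zip(echo, antiecho)]
    sin_part = [(e - a) / 2j for e, a in zip(echo, antiecho)]
    return cos_part, sin_part

# Synthetic single-frequency signal (illustrative values only).
omega = 2 * math.pi * 150.0          # rad/s
dwell = 1.0 / 2564.0                 # s
times = [k * dwell for k in range(8)]
echo = [cmath.exp(1j * omega * t) for t in times]
antiecho = [cmath.exp(-1j * omega * t) for t in times]

cos_part, sin_part = echo_antiecho_to_states(echo, antiecho)
```

The sum of the two signals carries the cosine modulation and their difference the sine modulation, which is the form expected by conventional complex (States-type) Fourier processing.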

The full time-domain data matrix of the 4D NOESY-NOESY-TROSY experiment (Fig. 1b) consists of 1536* (1H, t4, 95.8 ms) × 90* (15N, t3, 35.1 ms) × 60* (1H, t2, 12.0 ms) × 60* (1H, t1, 12.0 ms) complex points (Table S1). An unweighted, random NUS sampling scheme with a sparsity of 1.69 % (corresponding to 43 856 t4 FIDs) was used to record a small subset of the data points. Using an interscan delay of 1.77 s and four scans per FID, the total experimental time was approximately 110 h, but 3-fold shorter would have sufficed (see Concluding remarks). The data were processed and reconstructed in the same manner as described above with 50 % extension of all indirect dimensions during the SMILE reconstruction, resulting in an effective sparsity of 0.50 % and a final spectral matrix size of 492 (1H, F4, 7.8 Hz per point) × 512 (15N, F3, 4.7 Hz per point) × 512 (1H, F2, 13.9 Hz per point) × 512 (1H, F1, 13.9 Hz per point) real points (Table S1). Note that since in the NOESY-NOESY-TROSY experiment the data were recorded using the Echo-AntiEcho mode (Kay et al., 1992) only in the t3 dimension, immediately preceding acquisition, the bruk_ranceA.M macro was not needed after the conversion of the expanded NUS data. The residual in-phase axial peaks along the F2 dimension were treated as real peaks and optimally reconstructed by SMILE to suppress the sampling artifacts of the axial signals from spreading to the regions with NOE peaks. The processing macros used for both 4D spectra are included with the raw time-domain data (see Data availability).
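The sparsity figures quoted for the two experiments follow from simple arithmetic, sketched below. The helper `nus_summary` is hypothetical; it assumes 8 FIDs per sampled complex grid point and treats the 50 % extension as a factor of 1.5 per indirect dimension, ignoring any rounding of the extended grid to whole points.

```python
def nus_summary(shape, n_fids, extension=1.5):
    """Return (sampled points, recorded sparsity %, effective sparsity %).

    Hypothetical helper; the effective value assumes every indirect
    dimension is lengthened by the same factor during reconstruction.
    """
    grid = 1
    for n in shape:
        grid *= n
    points = n_fids // 2 ** len(shape)   # assumed: 2 FIDs per dimension
    recorded = 100.0 * points / grid
    effective = 100.0 * points / (grid * extension ** len(shape))
    return points, recorded, effective

# Indirect grids and FID counts as stated in the text.
trosy = nus_summary((90, 91, 90), 31896)   # 4D TROSY-NOESY-TROSY
noesy = nus_summary((90, 60, 60), 43856)   # 4D NOESY-NOESY-TROSY

for name, (pts, rec, eff) in (("TROSY-NOESY-TROSY", trosy),
                              ("NOESY-NOESY-TROSY", noesy)):
    print(f"{name}: {pts} points, {rec:.2f} % recorded, {eff:.2f} % effective")
```

With these assumptions the recorded fractions are 0.54 % and 1.69 %, and the effective fractions after extension are 0.16 % and 0.50 %, matching the values given above.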

2.3 Spectrum analysis

Spectra were processed using NMRPipe software (Delaglio et al., 1995); peak picking and spectrum analysis were performed using SPARKY software (Goddard and Kneller, 2008; Lee et al., 2015) as well as NMRDraw (Delaglio et al., 1995). Programs for visualization and analysis were written using freely available Python libraries (Hunter, 2007; Harris et al., 2020) as well as NMR-specific Python libraries (Helmus and Jaroniec, 2013).

3 Results and discussion

Two types of complementary 4D NOE experiments were recorded: (1) 4D TROSY-NOESY-TROSY and (2) 4D NOESY-NOESY-TROSY (Fig. 1). While the former is very similar to the HMQC-NOESY-TROSY experiment used recently for a single α-helical domain with a long rotational correlation time (Barnes et al., 2019), the 4D NOESY-NOESY-TROSY experiment extends earlier work by Kaptein and co-workers (Boelens et al., 1989; Breg et al., 1990).

3.1 Recording and analysis of the 4D TROSY-NOESY-TROSY spectrum

The rotational correlation time of the C145A variant of Mpro (MC145Apro) at 25 °C is ca. 27 ns, and consequently, transverse relaxation is rapid for both 15N and 1HN nuclei. For this reason, it proved beneficial to substitute a TROSY element for the HMQC or HSQC segment that was previously used for such measurements (Kay et al., 1990; Barnes et al., 2019). Even though the TROSY element only utilizes half of the amide 1HN magnetization present at the start of the pulse sequence, combining its 15N evolution with sensitivity-enhanced gradient selection during the subsequent t2 evolution period (Fig. 1a) limits the loss to a factor of √2, or even somewhat less when taking the gain from the 15N Boltzmann magnetization into account (Pervushin et al., 1998). A 2D TROSY spectrum (Fig. S1) of this sample allowed identification of 261 backbone amide peaks out of 293 non-proline residues, suggesting the feasibility of implementing the TROSY version of the 4D NOESY experiment. Conformational exchange on a timescale that results in extensive line broadening and incomplete back-exchange of amides when the protein was purified in 1H2O are the primary causes of the absence of the ca. 30 amide signals.

The high quality and S/N of the TROSY-HSQC spectrum (Fig. 2b) suggested the feasibility of implementing the TROSY version of the 4D NOESY experiment. Combined with the enhanced relaxation properties during t1 and t2 evolution of the TROSY-selected coherence, we found experimentally that spectral quality attainable for MC145Apro with the 4D TROSY-NOESY-TROSY was better than with the HMQC-NOESY-TROSY version of the experiment, consistent with the previous report that the TROSY implementation improved both the sensitivity and resolution over the 4D HSQC-NOESY-HSQC (Xia et al., 2000). Figure 2 shows expanded regions of six (F1, F2) cross sections through the 4D spectrum, each orthogonal to the (F3, F4) frequencies of the six amide correlations that are highlighted in a section of the regular 2D 1H–15N TROSY-HSQC spectrum of Fig. 2b. A total of 231 peaks, out of the 261 peaks in the 2D TROSY spectrum, can be detected as (semi-)resolvable diagonal peaks in the projected 15N–1H (F3,F4) plane (data not shown). These numbers do not include the doubling of resonances associated with isomerization of P184.

Figure 2. Illustration of amide–amide NOEs in perdeuterated, amide-protonated SARS-CoV-2 Main Protease, observed by 4D TROSY-NOESY-TROSY. (a) Ribbon diagram depicting the backbone homodimeric X-ray structure (PDB: 6LU7), with colors marking the regions that are highlighted in (F1, F2) cross sections taken at the (F3, F4) coordinates of (c) L232 and M235 (blue), (d) L67 and Q69 (green) and (e) G195 and D197 (red). These resonances are marked in (b), which represents the most crowded region of the 800 MHz 1H–15N TROSY-HSQC (the full spectrum is shown in Fig. S1).


The cross sections exemplify the power of 4D analysis for three types of secondary structure: α-helix (Fig. 2c), β-sheet (Fig. 2d), and a loop region (Fig. 2e). Due to the long NOE mixing time used in this experiment (200 ms), substantial spin diffusion occurs, which results in numerous NOE correlations for each amide. For example, α-helical residues L232 and M235 not only show NOE interactions with one another, but also share NOE cross peaks to V233 and A234, with M235 even showing a weak cross peak to N231. Such correlations are particularly useful for validating the assignments obtained from the limited number of triple-resonance backbone assignment experiments that are applicable to larger proteins such as Mpro.

The amides of L67 and Q69 in strand β4 only share a single NOE, to sequential residue V68, but they show valuable long-range NOEs to amide protons in strands β1 (C22) and β5 (L75). G195 and D197, located in the long loop that connects strand β13 to helix α6, have an NOE to one another as well as sequential NOEs but show no long-range interactions, consistent with the X-ray structure (Douangamath et al., 2020). However, NOEs from L67 or Q69 to T21 or Q19 are not observed, despite close proximity, due to the minimal back-exchange of amide protons in the β1 strand.

It is interesting to compare the diagonal peak intensities in these various cross sections of the TROSY-NOESY-TROSY spectrum. Diagonal intensity is a function of the amount of amide 1H z magnetization present at the start of the pulse sequence, i.e., it depends on the non-selective longitudinal relaxation time of the amide proton, but also on the attenuation of this magnetization during the NOE mixing time, in other words, on the selective longitudinal relaxation time which is dominated by J(0) spectral density terms. The latter dominate the differences in diagonal intensity seen in the various cross sections. For example, the helical amides of L232 and M235 rapidly lose their magnetization to their proximate sequential amide neighbors, separated by ca. 2.7 Å, that each are in close contact with other neighboring protons. By contrast, none of the L67, Q69, G195 and D197 amides are closer than 3.7 Å from any neighboring protonated amide in the 1.25 Å X-ray structure of Mpro (Douangamath et al., 2020), causing their diagonal intensities to remain high.
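The distance dependence of the diagonal attenuation can be made concrete with a rough estimate. The sketch below is a simplification and not the analysis used in this work: it assumes an isolated pair of protons, rigid isotropic tumbling with the 27 ns correlation time quoted above, only the J(0) term of the spectral density, and no leakage to other spins or to solvent. All function names are hypothetical.

```python
import math

MU0_OVER_4PI = 1.0e-7             # T m A^-1
HBAR = 1.054571817e-34            # J s
GAMMA_H = 2.6752218744e8          # rad s^-1 T^-1
# Squared 1H-1H dipolar constant, converted from m^6 s^-2 to Angstrom^6 s^-2.
K2 = (MU0_OVER_4PI * HBAR * GAMMA_H ** 2) ** 2 * 1.0e60

def sigma_slow(r_angstrom, tau_c):
    """Cross-relaxation rate (s^-1) in the slow-tumbling, J(0)-only limit."""
    return -0.1 * K2 * tau_c / r_angstrom ** 6

def two_spin_diagonal(r_angstrom, tau_c, t_mix):
    """Fraction of z magnetization left on the starting spin of an isolated
    pair, assuming the auto-relaxation rate equals -sigma (no leakage)."""
    s = abs(sigma_slow(r_angstrom, tau_c))
    return 0.5 * (1.0 + math.exp(-2.0 * s * t_mix))

TAU_C = 27e-9    # s, rotational correlation time from the text
T_MIX = 0.200    # s, NOE mixing time of the TROSY-NOESY-TROSY experiment

diag_close = two_spin_diagonal(2.7, TAU_C, T_MIX)   # helical sequential amides
diag_far = two_spin_diagonal(3.7, TAU_C, T_MIX)     # more isolated amides
print(f"2.7 A: {diag_close:.2f}   3.7 A: {diag_far:.2f}")
```

Because the rate scales as r^-6, going from 2.7 to 3.7 Å slows cross-relaxation by more than 6-fold in this model, consistent with the higher diagonal intensities seen for the more isolated amides. A real amide in a helix has several neighbors, so its diagonal decays faster than this two-spin estimate suggests.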

3.2 Recording and analysis of the 4D NOESY-NOESY-TROSY spectrum

As highlighted by the work of Kaptein and co-workers, 3D NOE-NOE experiments provided an effective method for studying the 1H–1H cross-relaxation network in proteins in more detail. Here, we extend this powerful experiment to four dimensions, making it more straightforward to analyze such a spectrum while limiting the relaxation pathways by perdeuteration of the protein.

The pulse scheme of this 4D NOESY-NOESY-TROSY is shown in Fig. 1b. It represents a straightforward extension of the original NOE-NOE 3D experiment (Boelens et al., 1989) but with the detection period substituted by the gradient-enhanced 2D 1H–15N TROSY scheme (Pervushin et al., 1998). The latter enhances the attainable spectral resolution in the t3 and t4 dimensions, while dispersing the detected 1HN resonances in the 15N dimension. A number of minor technical considerations are also relevant in this respect. (1) In order to maximize the number of (t1, t2, t3) data points sampled, the phase cycling of the 4D experiment was reduced to four steps, and the observed spectral window was restricted to the region downfield of the H2O resonance. To prevent bleeding in of several weaker imperfectly deuterated aliphatic or exchangeable resonances present in the upfield spectral region, selective-EBURP2 and reverse-EBURP2 pulses (Geen and Freeman, 1991) were used to also restrict the regions where 1H resonances were excited to those resonating downfield from the water resonance. As a result, no NOE peaks from a few amide protons resonating near water or upfield from water were observed. (2) Recording of a 4D NMR spectrum at adequate resolution requires the use of non-uniform sampling. High-quality NUS reconstruction of a 4D NMR spectrum can be accomplished by the SMILE program (Ying et al., 2017), but SMILE, like most other NUS reconstruction software, performs better if the various time domains are acquired in a manner that results in either a 0° or a 180° linear phase correction across the spectrum. For this purpose, and to ensure that the non-suppressed axial peaks can be optimally reconstructed, which requires 0° linear phase correction, it was preferable to insert a non-selective 90°x207°y90°x composite 1H inversion pulse (highlighted as the green open bar in Fig. 1b), followed by a second such pulse that reverses any phase imperfections introduced by the first composite pulse (Hwang et al., 1997). Specifically, the φ1 phase cycling serves to eliminate axial peaks in the t1 dimension caused by pulse imperfection as well as T1 relaxation and amide exchange with solvent during Tm1, while also suppressing axial peaks in the t2 dimension resulting from T1 relaxation and water exchange during Tm2. To minimize the number of phase cycling steps, φ2 was not phase cycled. However, this resulted in small residual axial peaks along the F2 dimension caused by pulse imperfections. To ensure that these residual axial peaks were absorptive in the final spectrum, thereby simplifying SMILE NUS reconstruction, an echo is generated by the application of two composite 1H 180° pulses in order to suppress initial chemical shift evolution at t2 = 0, thereby eliminating the need for a linear phase correction. Considering that the real and imaginary components of the residual axial signals have the same amplitude, they result in a 45° phase error for the axial peaks in the F2 dimension. Shifting the φ2 phase by 45° ensures that the NOE and axial peaks both can be phased absorptive using the same phase correction, thus facilitating NUS processing.
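The 45° argument can be checked numerically with a toy model. In this sketch, which is only an illustration, the first t2 point of an NOE signal is taken as purely real and that of a residual axial signal as having equal real and imaginary parts. The sign of the shift is an arbitrary convention here and need not match the pulse program.

```python
import cmath
import math

# First t2 point of an idealized NOE signal and of a residual axial signal.
noe = complex(1.0, 0.0)
axial = complex(1.0, 1.0)       # equal real and imaginary amplitudes

# Relative phase error of the axial signal with respect to the NOE signal.
axial_phase_deg = math.degrees(cmath.phase(axial))

# Shifting the phi2 pulse phase moves only the NOE signal (here by +45 deg),
# so that both signals end up with the same phase ...
noe_shifted = noe * cmath.exp(1j * math.pi / 4)

# ... and a single zero-order phase correction makes both absorptive.
correction = cmath.exp(-1j * math.pi / 4)
noe_phased = noe_shifted * correction
axial_phased = axial * correction
print(axial_phase_deg, noe_phased, axial_phased)
```

After the common correction both signals are purely real, which is the condition for phasing NOE and axial peaks absorptive with one set of phase parameters.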

Figure 3. (F1, F2) cross sections through the 4D NOESY-NOESY-TROSY spectrum of MC145Apro, taken orthogonal to the (F3, F4) TROSY-HSQC plane at the 15N, 1H frequencies of (a) L232, (b) M235, (d) L67, (e) Q69, (g) G195, and (h) D197. These cross sections show diagonal resonances for amide protons that are within long-range contact of the selected amide, either through direct or indirect NOE transfer during the two mixing periods that have a total duration of 50 + 300 ms. Off-diagonal resonances in these cross sections correspond to NOE magnetization transfer during the 50 ms mixing period that separates the t1 and t2 evolution periods. Colors match those of the corresponding residues in Fig. 2. Expanded views of the structural elements (PDB: 6LU7) that gave rise to the observed NOEs are shown in panels (c), (f), and (i).


Compared to the 4D TROSY-NOESY-TROSY pulse scheme, the 4D NOESY-NOESY-TROSY experiment avoids the lossy magnetization transfer step from 1H to 15N and back (leading to a slightly larger number of 241 diagonal peaks on the 15N–1H (F3, F4) projected plane, compared to 231 for TROSY-NOESY-TROSY). Instead, its magnetization is simply transferred, in part, to its nearest neighbors by cross-relaxation during the first NOE mixing period of duration Tm1=50 ms. There is virtually no loss in total spin polarization summed over the initial “starting spin”, whose t1 evolution is monitored, and those of its immediate neighbors that are within cross-relaxation contact. As a result, the intrinsic sensitivity of such NOESY-NOESY-TROSY measurements is quite high, allowing the choice of a long duration of 300 ms for the second NOE mixing time, Tm2. During this second, much longer mixing time, the z magnetization distributes over considerable distances due to indirect transfers (Fig. 3). Even in this extensively perdeuterated protein, NOEs to nearly a dozen neighboring protons are observed on the diagonals of the (F1, F2) cross sections, taken at the same (15N, 1H) frequencies used for illustrating the utility of the 4D TROSY-NOESY-TROSY spectrum of Fig. 2. However, as pointed out by Boelens et al. (1989) and Breg et al. (1990), the NOE-NOE combination offers a wealth of new information on the cross-relaxation pathways that led to the long-distance NOEs, substantially aiding both the assignment and analysis of distance information. Below, we briefly highlight a few examples.

As expected, α-helical residue L232 shows intense cross peaks to both of its sequential neighbors, N231 and V233, as well as a weaker cross peak to F230. Despite the relatively short mixing time of only 50 ms that separates t1 and t2 evolution, the latter must result mostly from indirect transfer through N231, because N231 and F230 share an intense cross peak. So in effect, each cross section through the 4D spectrum shown in Fig. 3 corresponds to a 2D NOESY spectrum of a small, localized, region within the protein structure – making its analysis far simpler. For residues with few neighbors, direct NOE contacts between neighbors separated by as much as 4.5 Å give rise to quite intense cross peaks after 50 ms NOE mixing, as exemplified by the contacts between G195 and its A194 and T196 neighbors (Fig. 3g). A weaker cross peak between G195 and D197, at an interproton distance of 6.4 Å, appears not to be mediated by spin diffusion because the G195 and D197 panels (Fig. 3g) show no common strong NOE to any visible resonance. However, the possibility that the hydroxyl proton of T196 serves as a relay partner cannot be excluded.
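The relayed transfer discussed for L232, N231 and F230 can be illustrated with a three-spin relaxation matrix. This is a sketch under stated assumptions, not a fit to the data: the sequential distances of 2.7 Å are taken from the helical value quoted earlier, the 4.2 Å distance between the outer spins is an assumed illustrative value, and the rates use the slow-tumbling J(0)-only limit with a 27 ns correlation time and no leakage. All function names are hypothetical.

```python
MU0_OVER_4PI = 1.0e-7
HBAR = 1.054571817e-34
GAMMA_H = 2.6752218744e8
K2 = (MU0_OVER_4PI * HBAR * GAMMA_H ** 2) ** 2 * 1.0e60   # Angstrom^6 s^-2

def sigma_slow(r_angstrom, tau_c):
    """Cross-relaxation rate (s^-1), slow-tumbling J(0)-only limit."""
    return -0.1 * K2 * tau_c / r_angstrom ** 6

def relaxation_matrix(dist, tau_c):
    """Rate matrix with off-diagonal sigma_ij and leakage-free diagonal."""
    n = len(dist)
    rates = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                rates[i][j] = sigma_slow(dist[i][j], tau_c)
        rates[i][i] = -sum(rates[i][j] for j in range(n) if j != i)
    return rates

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mixing_matrix(rates, t_mix, terms=40):
    """exp(-R * t_mix) by Taylor series (adequate for these small norms)."""
    n = len(rates)
    arg = [[-rates[i][j] * t_mix for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, arg)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

TAU_C, T_MIX = 27e-9, 0.050
# Spins 0, 1, 2 stand for three consecutive helical amides (assumed geometry).
chain = [[0.0, 2.7, 4.2], [2.7, 0.0, 2.7], [4.2, 2.7, 0.0]]
full = mixing_matrix(relaxation_matrix(chain, TAU_C), T_MIX)
pair = mixing_matrix(relaxation_matrix([[0.0, 4.2], [4.2, 0.0]], TAU_C), T_MIX)

relayed_plus_direct = full[0][2]   # outer-spin cross peak with the relay spin
direct_only = pair[0][1]           # same pair without the intervening spin
print(relayed_plus_direct, direct_only)
```

In this model the cross peak between the outer spins is clearly larger when the intervening spin is present, even at 50 ms, and each row of the mixing matrix sums to one, reflecting the conservation of total polarization in the absence of leakage.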

The NOESY-NOESY-TROSY spectrum also shows multiple NOEs to sidechain amide protons that are not visible in the TROSY-NOESY-TROSY spectrum because the TROSY element does not select magnetization transfer for NH2 groups. For example, D197 shows long-range NOEs to the N133 carboxamide protons, whereas Q69 shows NOEs to both its own carboxamide protons and to those of Q74. The non-equivalent NH2 pairs are readily recognized by cross peak to diagonal peak intensity ratios that are close to one, owing to their short interproton distance.

4 Concluding remarks

The spectra shown in this study were recorded during the summer of 2020, when access to campus facilities was strongly restricted due to COVID-19 pandemic mitigation efforts. These restrictions allowed for much lengthier acquisition of spectra than commonly used, for a total of 8 d for the two 4D spectra. As a benefit of NUS reconstruction, it is possible to generate spectra of the same resolution recorded in any fraction of that time. Alternatively, we can discard the data recorded at the longest values of t1, t2, and t3. Indeed, we reprocessed the same time-domain data sets after shortening the time domains with a previously described protocol that considers the total normalized length of the 3D (t1, t2, t3) time-domain vector (Ying et al., 2019). Using only one-third of the acquired time-domain data yields spectra that are very similar to the ones shown in Figs. 2 and 3, albeit at slightly lower resolution and signal to noise. Nevertheless, the quality of the resulting spectra remains excellent, with near-identical information content (Figs. S2 and S3 in the Supplement).
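One way to read the truncation criterion mentioned above is sketched below. The details of the published protocol are not given here, so this is an assumption-laden illustration: each sampled point is ranked by the Euclidean length of its (t1, t2, t3) vector after normalizing each axis to its maximum, and the shortest third is kept. The function names and the small demonstration grid are hypothetical.

```python
import math
import random

def normalized_length(point, shape):
    """Length of the (t1, t2, t3) index vector, each axis scaled to [0, 1]."""
    return math.sqrt(sum((i / (n - 1)) ** 2 for i, n in zip(point, shape)))

def keep_shortest(schedule, shape, divisor=3):
    """Keep the 1/divisor of sampled points with the shortest normalized
    time-domain vectors; returns (kept, discarded)."""
    ranked = sorted(schedule, key=lambda p: normalized_length(p, shape))
    n_keep = len(ranked) // divisor
    return ranked[:n_keep], ranked[n_keep:]

# Small demonstration schedule on a 20 x 20 x 20 grid (illustrative only).
demo_shape = (20, 20, 20)
rng = random.Random(0)
flat = rng.sample(range(20 * 20 * 20), 900)
demo_schedule = [(f // 400, (f // 20) % 20, f % 20) for f in flat]

kept, discarded = keep_shortest(demo_schedule, demo_shape)
print(len(kept), len(discarded))
```

Selecting by normalized vector length rather than truncating each axis independently removes the points where the signal has decayed most in all three indirect dimensions combined, which is why the loss in resolution and signal to noise stays modest.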

The lengthy data acquisition times needed to collect the 4D spectra require a highly stable sample, which in our case benefited from the C145A active site mutation, protecting the sample from auto-proteolysis. As with all NMR experiments, S/N is approximately proportional to sample concentration. Therefore, working at high concentrations benefits S/N of these experiments that involve multiple magnetization transfer steps, an issue that is particularly important for NOE experiments where magnetization from a single nucleus is distributed over many neighbors.
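The practical consequence of the concentration dependence can be illustrated with a short calculation. The sketch below assumes that S/N scales linearly with concentration and with the square root of the measurement time (standard signal averaging), and it ignores concentration-dependent effects such as viscosity or aggregation; the function name is hypothetical.

```python
def relative_measurement_time(conc_ratio):
    """Factor by which the measurement time must change to reach the
    same S/N when the concentration is multiplied by conc_ratio,
    assuming S/N ~ concentration * sqrt(time)."""
    if conc_ratio <= 0:
        raise ValueError("concentration ratio must be positive")
    return 1.0 / conc_ratio ** 2


# Halving the concentration requires four times the measurement time
factor = relative_measurement_time(0.5)
print(f"time factor at half concentration: {factor:.0f}x")
print(f"8 d of acquisition would become {8 * factor:.0f} d")
```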

We note that the TROSY-NOESY-TROSY experiment used a long NOE mixing time of 200 ms, so as to increase the number of observed connectivities by adding indirect NOE effects, including spin diffusion through hydroxyl protons (Koharudin et al., 2003), thereby aiding the assignment process. The use of a 50 ms NOE mixing period in the subsequent 4D NOESY-NOESY-TROSY experiment then provided a semi-quantitative measure of distance between these protons and their neighbors. Indeed, as pointed out by Kaptein and co-workers, recording of NOE-NOE spectra provides important experimental data on the pathway of magnetization transfer during NOE mixing. Such information could be used to convert these data into more quantitative distance information than the typical qualitative analysis of NOE intensities, potentially leading to the generation of higher resolution structures (Vogeli et al., 2009, 2012). Quantitative NOE interpretation traditionally relied on the recording of a series of NOE buildup data, which can become as time-consuming as the recording of 4D NMR spectra if resonance overlap is a limiting factor, as typically is the case for NOE spectra. This problem is further exacerbated by the spectral crowding of large proteins, particularly in the 1H dimension, and while 3D spectra may give higher signal-to-noise ratios than 4D spectra, downstream analysis frequently requires extensive disambiguation of overlapped peaks. Our study of MC145Apro shows that a large number of semi-quantitative NOE distances become accessible by recording of 4D NMR spectra on a perdeuterated larger protein with little or no ambiguity about the nuclei involved.
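The effect of the mixing time on indirect transfer can be illustrated with a small relaxation-matrix calculation of the kind introduced by Boelens et al. (1988). The sketch below is a toy model and not the analysis performed in this study: it assumes three rigid amide protons in a helix-like arrangement (2.8 Å sequential, 4.2 Å for i/i+2), isotropic tumbling, and hypothetical values for the correlation time, the 1H frequency, and an external leakage rate. The "relayed share" compares the full three-spin result with an isolated two-spin pair, which is only an approximate way to separate direct from indirect transfer.

```python
import math

K2_OVER_10 = 5.696e10  # (mu0/4pi)^2 hbar^2 gamma_H^4 / 10, in A^6 s^-2


def spectral_density(omega, tau_c):
    return tau_c / (1.0 + (omega * tau_c) ** 2)


def relaxation_matrix(dist, tau_c, omega, leak):
    """Relaxation matrix R (s^-1) for rigid isotropic tumbling.
    dist[i][j] is the interproton distance in angstrom."""
    n = len(dist)
    j0, j1, j2 = (spectral_density(k * omega, tau_c) for k in (0, 1, 2))
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        R[i][i] = leak  # hypothetical leakage to unobserved spins
        for j in range(n):
            if i != j:
                pref = K2_OVER_10 / dist[i][j] ** 6
                R[i][j] = pref * (6 * j2 - j0)            # cross-relaxation
                R[i][i] += pref * (j0 + 3 * j1 + 6 * j2)  # auto-relaxation
    return R


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm_neg(R, tau, squarings=10, order=12):
    """exp(-R * tau) by a Taylor series with scaling and squaring."""
    n = len(R)
    scale = tau / 2 ** squarings
    A = [[-R[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, order + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


TAU_C = 25e-9                  # s, hypothetical rotational correlation time
OMEGA = 2 * math.pi * 800e6    # rad/s, hypothetical 800 MHz 1H frequency
LEAK = 1.0                     # s^-1, hypothetical
chain = [[0, 2.8, 4.2], [2.8, 0, 2.8], [4.2, 2.8, 0]]  # spins A, B, C
pair = [[0, 4.2], [4.2, 0]]                            # A and C only


def relayed_share(tau_m):
    """Approximate fraction of the A-C cross peak not explained by an
    isolated A-C pair, at mixing time tau_m (s)."""
    full = expm_neg(relaxation_matrix(chain, TAU_C, OMEGA, LEAK), tau_m)[0][2]
    direct = expm_neg(relaxation_matrix(pair, TAU_C, OMEGA, LEAK), tau_m)[0][1]
    return 1.0 - direct / full


for tau_m in (0.050, 0.200):
    print(f"{tau_m * 1e3:.0f} ms: relayed share of A-C peak = "
          f"{relayed_share(tau_m):.2f}")
```

With these assumed parameters the relayed share of the i/i+2 cross peak is larger at 200 ms than at 50 ms, which illustrates why the longer mixing time adds connectivities while the shorter one is better suited to semi-quantitative distance estimates.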

While the high signal to noise and spectral simplicity of working with perdeuterated proteins has long been recognized (Torchia et al., 1988; Lemaster and Richards, 1988; Grzesiek et al., 1993; Tugarinov et al., 2004), the number of structural restraints accessible used to be small. Our present study demonstrates that a much larger number of NOE interactions becomes available by the recording of 4D NOE spectra. Moreover, it highlights the exquisite detail and value of NOE-NOE interaction analysis explored by the Kaptein group and it demonstrates that this approach is highly suitable for the larger biomolecules and biomolecular complexes being explored today, in particular when using extensive perdeuteration. Therefore, we believe that the recording of high quality 4D NMR spectra of the type presented in this study is entirely practical and invaluable for the structural and functional analysis of large proteins and their complexes, with possible extension to the study of nucleic acids. We note, however, that in the absence of extensive deuteration the dilution of nuclear magnetization over sidechain resonances will strongly lower the sensitivity of the experiment, which is further exacerbated by decreased effectiveness of TROSY-based line narrowing in such samples. On the other hand, adaptations of the NOESY-NOESY-TROSY experiment to methyl-protonated but otherwise perdeuterated proteins (Tugarinov et al., 2005b) are expected to be readily feasible.

Data availability

The raw Bruker NMR data sets including the acquisition parameters and NUS sampling lists, pulse programs, include file, and NMRPipe processing scripts are available for download from Zenodo (Robertson et al., 2021).


The supplement related to this article is available online at:

Author contributions

AJR expressed and purified protein samples, collected and analyzed the data, and edited the manuscript; JY optimized pulse sequence parameterization and processing and edited the manuscript; AB supervised the project and wrote the manuscript.

Competing interests

The authors declare that they have no conflict of interest.

Special issue statement

This article is part of the special issue “Robert Kaptein Festschrift”. It is not associated with a conference.


Acknowledgements

We thank John M. Louis, Joseph Courtney, Yang Shen, James L. Baber and Dennis A. Torchia for helpful discussions. This work was supported by the Intramural Research Program of the NIDDK and by the Intramural Antiviral Target Program of the Office of the Director, NIH.

This study is dedicated to Robert Kaptein on the occasion of his 80th birthday.

Financial support

This research has been supported by the National Institute of Diabetes and Digestive and Kidney Diseases (grant nos. DK075023 and DK029046).

Review statement

This paper was edited by Isabella Felli and reviewed by four anonymous referees.


References

Altincekic, N.: Large-scale recombinant production of the SARS-CoV-2 proteome for high-throughput and structural biology applications, Frontiers in Molecular Biosciences, accepted, available at: recombinant-production-of-the-sars-cov-2-proteome-for-high-throughput-and-structural-biology-applications/, last access: 12 April 2021.

Barnes, C. A., Shen, Y., Ying, J. F., Takagi, Y., Torchia, D. A., Sellers, J. R., and Bax, A.: Remarkable Rigidity of the Single alpha-Helical Domain of Myosin-VI As Revealed by NMR Spectroscopy, J. Am. Chem. Soc., 141, 9004–9017, 2019.

Boelens, R., Koning, T. M. G., and Kaptein, R.: Determination of Biomolecular Structures From Proton-Proton Noes Using a Relaxation Matrix Approach, J. Mol. Struct., 173, 299–311, 1988. 

Boelens, R., Vuister, G. W., Koning, T. M. G., and Kaptein, R.: Observation of spin diffusion in biomolecules by 3-dimensional NOE-NOE spectroscopy, J. Am. Chem. Soc., 111, 8525–8526, 1989.

Boras, B., Jones, R. M., Anson, B. J., Arenson, D., Aschenbrenner, L., Bakowski, M. A., Beutler, N., Binder, J., Chen, E., Eng, H., Hammond, H., Hammond, J., Haupt, R. E., Hoffman, R., Kadar, E. P., Kania, R., Kimoto, E., Kirkpatrick, M. G., Lanyon, L., Lendy, E. K., Lillis, J. R., Logue, J., Luthra, S. A., Ma, C., Mason, S. W., McGrath, M. E., Noell, S., Obach, R. S., O'Brien, M. N., O'Connor, R., Ogilvie, K., Owen, D., Pettersson, M., Reese, M. R., Rogers, T. F., Rossulek, M. I., Sathish, J. G., Shirai, N., Steppan, C., Ticehurst, M., Updyke, L. W., Weston, S., Zhu, Y., Wang, J., Chatterjee, A. K., Mesecar, A. D., Frieman, M. B., Anderson, A. S., and Allerton, C.: Discovery of a Novel Inhibitor of Coronavirus 3CL Protease for the Potential Treatment of COVID-19, bioRxiv, 2020.2009.2012.293498, 2021.

Breg, J. N., Boelens, R., Vuister, G. W., and Kaptein, R.: 3D NOE-NOE spectroscopy of proteins – Observation of sequential 3D NOE cross peaks in Arc repressor, J. Magn. Reson., 87, 646–651, 1990.

Clore, G. M., Kay, L. E., Bax, A., and Gronenborn, A. M.: Four-dimensional 13C/13C-edited nuclear Overhauser enhancement spectroscopy of a protein in solution: Application to interleukin 1β, Biochemistry, 30, 12–18, 1991.

Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A.: NMRpipe - a multidimensional spectral processing system based on Unix pipes, J. Biomol. NMR, 6, 277–293, 1995. 

Diercks, T., Truffault, V., Coles, M., and Millett, O.: Diagonal-Free 3D/4D HN,HN-TROSY-NOESY-TROSY, J. Am. Chem. Soc., 132, 2138–2139, 2010.

Douangamath, A., Fearon, D., Gehrtz, P., Krojer, T., Lukacik, P., Owen, C. D., Resnick, E., Strain-Damerell, C., Aimon, A., Abranyi-Balogh, P., Brandao-Neto, J., Carbery, A., Davison, G., Dias, A., Downes, T. D., Dunnett, L., Fairhead, M., Firth, J. D., Jones, S. P., Keeley, A., Keseru, G. M., Klein, H. F., Martin, M. P., Noble, M. E. M., O'Brien, P., Powell, A., Reddi, R. N., Skyner, R., Snee, M., Waring, M. J., Wild, C., London, N., von Delft, F., and Walsh, M. A.: Crystallographic and electrophilic fragment screening of the SARS-CoV-2 main protease, Nat. Commun., 11, 5047, 2020.

Geen, H. and Freeman, R.: Band-selective radiofrequency pulses, J. Magn. Reson., 93, 93–141, 1991. 

Goddard, T. D. and Kneller, D. G.: Sparky 3, University of California, San Francisco, USA, 2008.

Grzesiek, S. and Bax, A.: The Importance of Not Saturating H2O in Protein NMR. Application to Sensitivity Enhancement and NOE Measurement, J. Am. Chem. Soc., 115, 12593–12594, 1993. 

Grzesiek, S., Anglister, J., Ren, H., and Bax, A.: 13C line narrowing by 2H decoupling in 2H/13C/15N-enriched Proteins. Application to triple resonance 4D connectivity of sequential amides, J. Am. Chem. Soc., 115, 4369–4370, 1993. 

Grzesiek, S., Wingfield, P., Stahl, S., Kaufman, J. D., and Bax, A.: Four-dimensional 15N-separated NOESY of slowly tumbling perdeuterated 15N-enriched proteins. Applications to HIV-1 Nef, J. Am. Chem. Soc., 117, 9594–9595, 1995.

Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., del Rio, J. F., Wiebe, M., Peterson, P., Gerard-Marchant, P., Sheppard, K., Reddy, T., Weckesser, W., Abbasi, H., Gohlke, C., and Oliphant, T. E.: Array programming with NumPy, Nature, 585, 357–362, 2020.

Helmus, J. J. and Jaroniec, C. P.: Nmrglue: an open source Python package for the analysis of multidimensional NMR data, J. Biomol. NMR, 55, 355–367, 2013.

Hunter, J. D.: Matplotlib: A 2D graphics environment, Comput. Sci. Eng., 9, 90–95, 2007.

Hwang, T. L., van Zijl, P. C. M., and Garwood, M.: Broadband adiabatic refocusing without phase distortion, J. Magn. Reson., 124, 250–254, 1997. 

Ikura, M., Kay, L. E., and Bax, A.: A novel approach for sequential assignment of 1H, 13C, and 15N spectra of larger proteins: heteronuclear triple-resonance three-dimensional NMR spectroscopy. Application to calmodulin, Biochemistry, 29, 4659–4667, 1990.

Kay, L. E., Clore, G. M., Bax, A., and Gronenborn, A. M.: Four-dimensional heteronuclear triple-resonance NMR spectroscopy of interleukin-1B in solution, Science, 249, 411–414, 1990. 

Kay, L. E., Keifer, P., and Saarinen, T.: Pure Absorption Gradient Enhanced Heteronuclear Single Quantum Correlation Spectroscopy with Improved Sensitivity, J. Am. Chem. Soc., 114, 10663–10665, 1992. 

Koharudin, L. M. I., Bonvin, A., Kaptein, R., and Boelens, R.: Use of very long-distance NOEs in a fully deuterated protein: an approach for rapid protein fold determination, J. Magn. Reson., 163, 228–235, 2003.

Lee, W., Tonelli, M., and Markley, J. L.: NMRFAM-SPARKY: enhanced software for biomolecular NMR spectroscopy, Bioinformatics, 31, 1325–1327, 2015.

Lemaster, D. M. and Richards, F. M.: NMR sequential assignment of Escherichia-coli thioredoxin utilizing random fractional deuteration, Biochemistry, 27, 142–150, 1988.

Marion, D., Driscoll, P. C., Kay, L. E., Wingfield, P. T., Bax, A., Gronenborn, A. M., and Clore, G. M.: Overcoming the overlap problem in the assignment of 1H NMR spectra of larger proteins by use of three-dimensional heteronuclear 1H-15N Hartmann-Hahn-multiple quantum coherence and nuclear Overhauser-multiple quantum coherence spectroscopy: application to interleukin 1β, Biochemistry, 28, 6150–6156, 1989a. 

Marion, D., Kay, L. E., Sparks, S. W., Torchia, D. A., and Bax, A.: Three-dimensional heteronuclear NMR of 15N-labeled proteins, J. Am. Chem. Soc., 111, 1515–1517, 1989b. 

Oschkinat, H., Griesinger, C., Kraulis, P. J., Sorensen, O. W., Ernst, R. R., Gronenborn, A. M., and Clore, G. M.: 3-Dimensional NMR spectroscopy of a protein in solution, Nature, 332, 374–376, 1988. 

Pervushin, K., Riek, R., Wider, G., and Wuthrich, K.: Attenuated T2 relaxation by mutual cancellation of dipole-dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution, Proc. Natl. Acad. Sci. USA, 94, 12366–12371, 1997. 

Pervushin, K. V., Wider, G., and Wuthrich, K.: Single transition-to-single transition polarization transfer (ST2-PT) in [N15, H1]-TROSY, J. Biomol. NMR, 12, 345–348, 1998. 

Robertson, A., Ying, J., and Bax, A.: Time domain data, acquisition parameters, and processing scripts of 4D NMR data, Zenodo, available at:, last access: 9 April 2021. 

Rovnyak, D., Frueh, D. P., Sastry, M., Sun, Z. Y. J., Stern, A. S., Hoch, J. C., and Wagner, G.: Accelerated acquisition of high resolution triple-resonance spectra using non-uniform sampling and maximum entropy reconstruction, J. Magn. Reson., 170, 15–21, 2004.

Torchia, D. A., Sparks, S. W., and Bax, A.: Delineation of Alpha-Helical Domains in Deuteriated Staphylococcal Nuclease By 2d Noe Nmr-Spectroscopy, J. Am. Chem. Soc., 110, 2320–2321, 1988. 

Tugarinov, V., Muhandiram, R., Ayed, A., and Kay, L. E.: Four-dimensional NMR spectroscopy of a 723-residue protein: Chemical shift assignments and secondary structure of malate synthase G, J. Am. Chem. Soc., 124, 10025–10035, 2002. 

Tugarinov, V., Hwang, P. M., and Kay, L. E.: Nuclear magnetic resonance spectroscopy of high-molecular-weight proteins, Annu. Rev. Biochem., 73, 107–146, 2004. 

Tugarinov, V., Choy, W. Y., Orekhov, V. Y., and Kay, L. E.: Solution NMR-derived global fold of a monomeric 82-kDa enzyme, Proc. Natl. Acad. Sci. USA, 102, 622–627, 2005a. 

Tugarinov, V., Kay, L. E., Ibraghimov, I., and Orekhov, V. Y.: High-resolution four-dimensional H-1-C-13 NOE spectroscopy using methyl-TROSY, sparse data acquisition, and multidimensional decomposition, J. Am. Chem. Soc., 127, 2767–2775, 2005b.

Vogeli, B., Segawa, T. F., Leitz, D., Sobol, A., Choutko, A., Trzesniak, D., van Gunsteren, W., and Riek, R.: Exact Distances and Internal Dynamics of Perdeuterated Ubiquitin from NOE Buildups, J. Am. Chem. Soc., 131, 17215–17225, 2009.

Vogeli, B., Kazemi, S., Guntert, P., and Riek, R.: Spatial elucidation of motion in proteins by ensemble-based structure calculation using exact NOEs, Nat. Struct. Mol. Biol., 19, 1053–1058, 2012.

Vuister, G. W., Boelens, R., and Kaptein, R.: Non-selective 3-dimensional NMR spectroscopy – The 3D NOE-HOHAHA experiment, J. Magn. Reson., 80, 176–185, 1988. 

Wagner, G.: Prospects for NMR of large proteins, J. Biomol. NMR, 3, 375–385, 1993. 

Wüthrich, K.: NMR of Proteins and Nucleic Acids, John Wiley & Sons, New York, USA, 1986. 

Xia, Y. L., Sze, K. H., and Zhu, G.: Transverse relaxation optimized 3D and 4D N-15/N-15 separated NOESY experiments of N-15 labeled proteins, J. Biomol. NMR, 18, 261–268, 2000.

Ying, J., Delaglio, F., Torchia, D. A., and Bax, A.: Sparse multidimensional iterative lineshape-enhanced (SMILE) reconstruction of both non-uniformly sampled and conventional NMR data, J. Biomol. NMR, 68, 101–118, 2017. 

Ying, J. F., Barnes, C. A., Louis, J. M., and Bax, A.: Importance of time-ordered non-uniform sampling of multidimensional NMR spectra of A beta(1-42) peptide under aggregating conditions, J. Biomol. NMR, 73, 429–441, 2019.

Zuiderweg, E. R. P. and Fesik, S. W.: Heteronuclear Three-Dimensional NMR Spectroscopy of the Inflammatory Protein C5a, Biochemistry, 28, 2387–2391, 1989. 

Zuiderweg, E. R. P., Petros, A. M., Fesik, S. W., and Olejniczak, E. T.: 4-Dimensional [C-13, H-1, C-13, H-1] Hmqc-Noe-Hmqc Nmr-Spectroscopy – Resolving Tertiary Noe Distance Constraints in the Spectra of Larger Proteins, J. Am. Chem. Soc., 113, 370–372, 1991.

Short summary
NMR study of large proteins such as SARS-CoV-2 Main Protease can be challenging when exchange broadening, multiple stable conformations, and back-exchanging the fully deuterated chain pose problems. We demonstrate that 4D NMR, including an extension of 3D NOE-NOE spectroscopy, provides an effective tool for spectral analysis. In combination with X-ray coordinates, the 4D NMR data are particularly useful for extending and validating assignments and for probing structural features.