Four-dimensional NOE-NOE spectroscopy of SARS-CoV-2 Main Protease to facilitate resonance assignment and structural analysis

Abstract Resonance assignment and structural studies of larger proteins by nuclear magnetic resonance (NMR) can be challenging when exchange broadening, multiple stable conformations, and 1 H back-exchange of the fully deuterated chain pose problems. These difficulties arise for the SARS-CoV-2 Main Protease, a homodimer of 2  ×  306 residues. We demonstrate that the combination of four-dimensional (4D) TROSY-NOESY-TROSY spectroscopy and 4D NOESY-NOESY-TROSY spectroscopy provides an effective tool for delineating the 1 H– 1 H dipolar relaxation network. In combination with detailed structural information obtained from prior X-ray crystallography work, such data are particularly useful for extending and validating resonance assignments as well as for probing structural features.


Introduction
The extension of conventional two-dimensional 1 H-1 H NMR spectroscopy of natural proteins (Wüthrich, 1986) to three-dimensional (3D) homonuclear NMR experiments offered the ability to simplify spectral analysis by removing resonance overlap (Vuister et al., 1988; Oschkinat et al., 1988) and by providing access to a direct, more detailed analysis of 1 H-1 H dipolar cross-relaxation networks. In particular, the homonuclear 3D NOE-NOE experiment (Boelens et al., 1989; Breg et al., 1990) not only decreased resonance overlap but also directly elucidated spin-diffusion pathways. This information complemented and validated the elegant relaxation matrix analysis of spin diffusion.
Such homonuclear 1 H 3D experiments and analysis strategies were soon followed by a myriad of heteronuclear 3D experiments that required isotopic enrichment and therefore cloning and bacterial overexpression (Marion et al., 1989b; Zuiderweg and Fesik, 1989; Ikura et al., 1990; Marion et al., 1989a; Wagner, 1993). Most of these heteronuclear experiments simply served to disperse the regular 1 H-1 H 2D spectrum into a third dimension, thereby removing spectral overlap but providing little or no new information on the all-important 1 H-1 H spin-diffusion pathways. The 3D NOESY-HMQC experiment (Marion et al., 1989b; Zuiderweg and Fesik, 1989) was subsequently extended to four dimensions (4D), thereby dispersing the conventional 2D 1 H-1 H NOESY experiment into two additional dimensions that correspond to the chemical shifts of the nuclei to which each of the protons is covalently bound (Clore et al., 1991; Zuiderweg et al., 1991).
130 A. J. Robertson et al.: 4D NOE-NOE spectroscopy for protein assignment and structural analysis
These multi-dimensional experiments provided a tremendous degree of spectral simplification, in particular after appropriate analysis software became available. However, it also quickly became clear that extension to large, slowly tumbling proteins was hampered by low signal to noise, caused by the relative inefficiency of the magnetization transfer steps when the dimensionality of a spectrum is increased. This decrease in sensitivity was remedied by generating the protein in a highly perdeuterated state while keeping the solvent-exchangeable backbone amide protons protonated (Torchia et al., 1988; Lemaster and Richards, 1988). Combining the perdeuteration approach with both the triple-resonance assignment strategy and the subsequently introduced powerful TROSY line-narrowing method (Pervushin et al., 1997) made it possible to assign and analyze the structure of quite large proteins, as exemplified by the 723-residue protein malate synthase G (Tugarinov et al., 2002, 2005a). The sensitivity gained by perdeuteration, enabling the recording of 4D 15 N-separated NOE spectra, also was key in solving the structure of an HIV-1 accessory protein that had been too challenging for analysis by more conventional methods. Zhu and coworkers introduced a TROSY-NOESY-TROSY version of Grzesiek's 4D NOESY experiment which, illustrated for a partially deuterated 27 kDa protein, yielded further improved sensitivity and intrinsic 1 H and 15 N line widths (Xia et al., 2000). Their implementation relied on eight-step phase cycling, which limited digitization of the time-domain data and prevented full use of the improved TROSY relaxation in the indirect dimensions. Diercks et al. (2010) introduced an elegant method to suppress diagonal signals from a 4D TROSY-NOESY-TROSY spectrum, a particularly useful feature when spectral resolution is limited. 
However, we opted not to use this implementation as, at the long mixing times used, the diagonal resonances serve as convenient reference anchors during analysis and are sufficiently attenuated such as not to obscure nearby peaks in our 4D spectra that are of very high digital resolution.
In the present report, we merge the above-mentioned prior advances, 3D NOE-NOE and 3D 15 N-separated NOESY, into a 4D experiment. In combination with extensive perdeuteration and gradient-enhanced encoding to enable a four-step phase cycle as well as non-uniform sampling (NUS) (Rovnyak et al., 2004), the experiments take better advantage of the improved resolution afforded by 4D NMR. We demonstrate the utility of the experiments by applying them to the study of the main protease (M pro ) of SARS-CoV-2, the virus responsible for coronavirus disease 2019. M pro , also known as 3CL pro or Nsp5, is a homodimeric cysteine protease of 2 × 306 residues that does not have closely related mammalian homologues and is therefore the subject of intense drug development efforts, with a promising inhibitor now in a phase I clinical trial (Boras et al., 2021). Its NMR analysis is challenging, not only because of its large size (67.6 kDa) but also because of the presence of a minor conformer associated with the cis isomer of one of its 13 Pro residues (P184), the difficulty in back-exchanging all backbone amide protons when the protein is expressed in 2 H 2 O, and the presence of intermediate timescale motions that lead to exchange broadening in the vicinity of the protein's active site. Here we focus on an enzyme variant where the catalytic Cys145 residue has been mutated to Ala (M pro C145A ), a construct that is stable for multiple weeks at the high concentrations required for NMR spectroscopy. The assignment process and a full structural analysis of the protein will be presented elsewhere. The focus of the present work is on technical innovations, including recording two types of 4D NOE-based NMR spectra that proved invaluable both for the validation of resonance assignments and for the subsequent structural analysis.

Protein production
The gene encoding a C145A variant of M pro (M pro C145A ) with an N-terminal affinity (and solubility) tag was synthesized by GenScript (USA) and then cloned into a Pet24a+ plasmid between BamH1 and Xho1 restriction sites. The encoded fusion protein had the layout 6His tag, GB1, SG-rich linker, TEV cleavage site, M pro C145A and was purified according to methods collectively developed by the COVID-19 NMR consortium (Altincekic, 2021). In brief, following cell culture and harvesting (see Supplement), the cell lysate was passed down a 6His-affinity column (IMAC) and eluted in a small volume; the solubility tag was cleaved off to generate a native N-terminus; the reaction mix was then passed through an IMAC column to remove uncleaved protein before size separation on a Sephadex G75 column. The resulting, extensively perdeuterated, 2 H, 15 N-M pro C145A homodimer was used for experiments. Throughout the recombinant protein expression, extensive care was taken to achieve a high (∼ 98 %) level of deuteration of the non-exchangeable hydrogens. For more details, see the Supplement.

Recording of NMR data
Spectra were acquired on a sample containing 1.8 mM (0.9 mM dimer) 2 H, 15 N-M pro C145A in 10 mM sodium phosphate, pH 7.0, 0.5 mM TCEP, 3 % v/v 2 H 2 O and 0.3 mM sodium trimethylsilylpropanesulfonate (DSS; as an internal chemical shift reference), in a 300 µL Shigemi microcell. All experiments were recorded at 25 °C on an 800 MHz Bruker Avance III spectrometer, equipped with a 5 mm TCI probe containing a triple-axis gradient accessory, and running TopSpin software version 3.1.
Nonstandard processing was needed for the TROSY-NOESY-TROSY experiment because the spectrum was recorded with sensitivity-enhanced gradient selection in the 15 N t 1 evolution period that preceded the NOE mixing (Xia et al., 2000). Specifically, the 4D NUS data set was first sorted and expanded according to the sampling schedule using the nusExpand.tcl script within the NMRPipe software package (Delaglio et al., 1995). The expanded data were then converted to the NMRPipe format, with the quadrature mode for t 3 set to Echo-AntiEcho, while the quadrature mode for t 1 was temporarily set to Complex. After the conversion, the 4D matrix must be transposed to enable use of the NMRPipe macro bruk_ranceA.M, which reshuffles the data and turns the phase-modulated t 1 dimension into conventional amplitude-modulated data prior to processing as regular, complex data. This transposition is accomplished by reading in the NMRPipe-formatted matrix with the z axis along the t 2 dimension, applying the macro, and restoring the data to its original axis order prior to regular processing. The full script is included with the raw time-domain data sets (see Data availability). For the processing, the direct dimension was apodized with a squared, shifted sine-bell window, spanning from 72 to 176.4°, whereas an additional 15 Hz exponential line broadening was used to better match the apodization window to the natural decay of the signal, thereby improving the signal-to-noise ratio (S/N). This was followed by zero filling and Fourier transformation. Subsequently, the indirect data points that were not experimentally sampled were reconstructed using the SMILE program (Ying et al., 2017), and the reconstructed data were further processed in NMRPipe. 
To enhance the spectral resolution, by default the acquisition times in all indirect dimensions were extended by 50 % during the SMILE reconstruction, leading to an effective sampling sparsity of 0.16 %. The data matrix for the final reconstructed 4D spectrum consists of 614 ( 1 H,  Table S1).
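The direct-dimension apodization described above (a squared sine bell running from 72 to 176.4°, multiplied by a 15 Hz exponential) can be written out explicitly. The following Python sketch is only an illustration and is not the NMRPipe script distributed with the data: the function name is hypothetical, and the example call assumes 1536 complex points acquired over 95.8 ms, the t 4 values quoted in this section for the NOESY-NOESY-TROSY data set.

```python
import math


def sine_bell_exp_window(n_points, acq_time_s, start_deg=72.0, end_deg=176.4,
                         power=2, lb_hz=15.0):
    """Return apodization weights for one direct-dimension FID.

    The sine-bell argument runs linearly from start_deg to end_deg over the
    FID and is raised to `power`; the result is multiplied by exp(-pi*lb*t),
    which adds lb_hz of Lorentzian line broadening.
    """
    dwell = acq_time_s / n_points  # time between complex points (assumed uniform)
    weights = []
    for i in range(n_points):
        frac = i / (n_points - 1)
        angle = math.radians(start_deg + (end_deg - start_deg) * frac)
        sine_part = math.sin(angle) ** power
        exp_part = math.exp(-math.pi * lb_hz * i * dwell)
        weights.append(sine_part * exp_part)
    return weights


# Assumed example values: 1536 complex points, 95.8 ms acquisition time.
window = sine_bell_exp_window(1536, 0.0958)
print(f"first weight: {window[0]:.4f}, last weight: {window[-1]:.2e}")
```

With these settings the exponential term decays faster than the sine bell rises, so the combined window falls monotonically from its first point, which is consistent with the stated aim of matching the window to the natural decay of the signal.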
The full time-domain data matrix of the 4D NOESY-NOESY-TROSY experiment (Fig. 1b) consists of 1536* ( 1 H, t 4 , 95.8 ms) × 90* ( 15 N, t 3 , 35.1 ms) × 60* ( 1 H, t 2 , 12.0 ms) × 60* ( 1 H, t 1 , 12.0 ms) complex points (Table S1). An unweighted, random NUS sampling scheme with a sparsity of 1.69 % (corresponding to 43 856 t 4 FIDs) was used to record a small subset of the data points. Using an interscan delay of 1.77 s and four scans per FID, the total experimental time was approximately 110 h, but 3-fold shorter would have sufficed (see Concluding remarks). The data were processed and reconstructed in the same manner as described above with 50 % extension of all indirect dimensions during the SMILE reconstruction, resulting in an effective sparsity of 0.50 % and a final spectral matrix size of 492 ( 1 H, F 4 , 7.8 Hz per point) × 512 ( 15 N, F 3 , 4.7 Hz per point) × 512 ( 1 H, F 2 , 13.9 Hz per point) × 512 ( 1 H, F 1 , 13.9 Hz per point) real points (Table S1). Note that since in the NOESY-NOESY-TROSY experiment the data were recorded using the Echo-AntiEcho mode (Kay et al., 1992) only in the t 3 dimension, immediately preceding acquisition, the bruk_ranceA.M macro was not needed after the conversion of the expanded NUS data. The residual in-phase axial peaks along the F 2 dimension were treated as real peaks and optimally reconstructed by SMILE to suppress the sampling artifacts of the axial signals from spreading to the regions with NOE peaks. The processing macros used for both 4D spectra are included with the raw time-domain data (see Data availability).
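The sampling numbers quoted above can be cross-checked with a few lines of arithmetic. The sketch below rests on two assumptions that are an interpretation of the text rather than statements made in it: each sampled (t 1 , t 2 , t 3 ) point contributes 2^3 = 8 FIDs (one per quadrature combination of the three indirect dimensions), and the 50 % extension multiplies the length of each indirect dimension by 1.5. The function name is hypothetical.

```python
def nus_summary(n_t1, n_t2, n_t3, n_fids, extension=0.5, fids_per_point=8):
    """Summarize NUS sparsity before and after extending the indirect grid."""
    grid = n_t1 * n_t2 * n_t3                 # full indirect grid, complex points
    sampled = n_fids // fids_per_point        # sampled indirect points (assumed 8 FIDs each)
    scale = 1.0 + extension
    extended_grid = round(n_t1 * scale) * round(n_t2 * scale) * round(n_t3 * scale)
    return {
        "sampled_points": sampled,
        "sparsity_percent": 100.0 * sampled / grid,
        "effective_sparsity_percent": 100.0 * sampled / extended_grid,
    }


# Values from the text: 60 x 60 x 90 indirect complex points, 43 856 FIDs.
summary = nus_summary(60, 60, 90, 43856)
for key, value in summary.items():
    print(key, round(value, 3))
```

Under these assumptions the script gives sparsities that round to the 1.69 % and 0.50 % quoted in the text, which supports the reading that one NUS grid point corresponds to eight recorded FIDs.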

Spectrum analysis
Spectra were processed using NMRPipe software (Delaglio et al., 1995); peak picking and spectrum analysis were performed using SPARKY software (Goddard and Kneller, 2008; Lee et al., 2015) as well as NMRDraw (Delaglio et al., 1995). Programs for visualization and analysis were written using freely available Python libraries (Hunter, 2007; Harris et al., 2020) as well as NMR-specific Python libraries (Helmus and Jaroniec, 2013).

Recording and analysis of the 4D TROSY-NOESY-TROSY spectrum
The rotational correlation time of the C145A variant of M pro (M pro C145A ) at 25 °C is ca. 27 ns, and consequently, transverse relaxation is rapid for both 15 N and 1 H N nuclei. For this reason, it proved beneficial to substitute a TROSY element for the HMQC or HSQC segment that was previously used for such measurements (Barnes et al., 2019). Even though the TROSY element only utilizes half of the amide 1 H N magnetization present at the start of the pulse sequence, combining its 15 N evolution with sensitivity-enhanced gradient selection during the subsequent t 2 evolution period (Fig. 1a) limits the loss to a factor of √2 or even somewhat less when taking the gain from the 15 N Boltzmann magnetization into account (Pervushin et al., 1998). A 2D TROSY spectrum (Fig. S1) of this sample allowed identification of 261 backbone amide peaks out of 293 non-proline residues, suggesting the feasibility of implementing the TROSY version of the 4D NOESY experiment. The absence of the remaining ca. 30 amide signals is primarily caused by conformational exchange on a timescale that results in extensive line broadening and by incomplete back-exchange of amides when the protein was purified in 1 H 2 O.
The high quality and S/N of the TROSY-HSQC spectrum (Fig. 2b) suggested the feasibility of implementing the TROSY version of the 4D NOESY experiment. Combined with the enhanced relaxation properties during t 1 and t 2 evolution of the TROSY-selected coherence, we found experimentally that spectral quality attainable for M pro C145A with the 4D TROSY-NOESY-TROSY was better than with the HMQC-NOESY-TROSY version of the experiment, consistent with the previous report that the TROSY implementation improved both the sensitivity and resolution over the 4D HSQC-NOESY-HSQC (Xia et al., 2000). Figure 2 shows expanded regions of six (F 1 , F 2 ) cross sections through the 4D spectrum, each orthogonal to the (F 3 , F 4 ) frequencies of the six amide correlations that are highlighted in a section of the regular 2D 1 H-15 N TROSY-HSQC spectrum of Fig. 2b. A total of 231 peaks, out of the 261 peaks in the 2D TROSY spectrum, can be detected as (semi-)resolvable diagonal peaks in the projected 15 N-1 H (F 3 ,F 4 ) plane (data not shown). These numbers do not include the doubling of resonances associated with isomerization of P184.
The cross sections exemplify the power of 4D analysis for three types of secondary structure: α-helix (Fig. 2c), β-sheet (Fig. 2d), and a loop region (Fig. 2e). Due to the long NOE mixing time used in this experiment (200 ms), substantial spin diffusion occurs, which results in numerous NOE correlations for each amide. For example, α-helical residues L232 and M235 not only show NOE interactions with one another, but also share NOE cross peaks to V233 and A234, with M235 even showing a weak cross peak to N231. Such correlations are particularly useful for validating the assignments obtained from the limited number of triple-resonance backbone assignment experiments that are applicable to larger proteins such as M pro .
The amides of L67 and Q69 in strand β4 only share a single NOE, to sequential residue V68, but they show valuable long-range NOEs to amide protons in strands β1 (C22) and β5 (L75). G195 and D197, located in the long loop that connects strand β13 to helix α6, have an NOE to one another as well as sequential NOEs but show no long-range interactions, consistent with the X-ray structure (Douangamath et al., 2020). However, NOEs from L67 or Q69 to T21 or Q19 are not observed, despite close proximity, due to the minimal back-exchange of amide protons in the β1 strand.
It is interesting to compare the diagonal peak intensities in these various cross sections of the TROSY-NOESY-TROSY spectrum. Diagonal intensity is a function of the amount of amide 1 H z magnetization present at the start of the pulse sequence, i.e., it depends on the non-selective longitudinal relaxation time of the amide proton, but also on the attenuation of this magnetization during the NOE mixing time, in other words, on the selective longitudinal relaxation time which is dominated by J (0) spectral density terms. The latter dominate the differences in diagonal intensity seen in the various cross sections. For example, the helical amides of L232 and M235 rapidly lose their magnetization to their proximate sequential amide neighbors, separated by ca. 2.7 Å, that each are in close contact with other neighboring protons. By contrast, none of the L67, Q69, G195 and D197 amides are closer than 3.7 Å from any neighboring protonated amide in the 1.25 Å X-ray structure of M pro (Douangamath et al., 2020), causing their diagonal intensities to remain high.

Recording and analysis of the 4D NOESY-NOESY-TROSY spectrum
As highlighted by the work of Kaptein and co-workers, 3D NOE-NOE experiments provided an effective method for studying the 1 H-1 H cross-relaxation network in proteins in more detail. Here, we extend this powerful experiment to four dimensions, making it more straightforward to analyze such a spectrum while limiting the relaxation pathways by perdeuteration of the protein.
The pulse scheme of this 4D NOESY-NOESY-TROSY is shown in Fig. 1b. It represents a straightforward extension of the original NOE-NOE 3D experiment (Boelens et al., 1989) but with the detection period substituted by the gradient-enhanced 2D 1 H-15 N TROSY scheme (Pervushin et al., 1998). The latter enhances the attainable spectral resolution in the t 3 and t 4 dimensions, while dispersing the detected 1 H N resonances in the 15 N dimension. A number of minor technical considerations are also relevant in this respect.
(1) First, in order to maximize the number of (t 1 , t 2 , t 3 ) data points sampled, the phase cycling of the 4D experiment was reduced to four steps, and the observed spectral window was restricted to the region downfield of the H 2 O resonance. To prevent bleeding in of several weaker imperfectly deuterated aliphatic or exchangeable resonances present in the upfield spectral region, selective EBURP2 and reverse-EBURP2 pulses (Geen and Freeman, 1991) were used to also restrict the regions where 1 H resonances were excited to those resonating downfield from the water resonance. As a result, no NOE peaks from a few amide protons resonating near water or upfield from water were observed.
(2) Recording of a 4D NMR spectrum at adequate resolution requires the use of non-uniform sampling. High quality NUS reconstruction of a 4D NMR spectrum can be accomplished by the SMILE program (Ying et al., 2017) but this as well as most other NUS reconstruction software performs better if the various time domains are acquired in a manner that results in either a 0° or a 180° linear phase correction across the spectrum. For this purpose, and to ensure that the non-suppressed axial peaks can be optimally reconstructed, which requires 0° linear phase correction, it was preferable to insert a non-selective 90°x 207°y 90°x composite 1 H inversion pulse (highlighted as the green open bar in Fig. 1b), followed by a second such pulse that reverses any phase imperfections introduced by the first composite pulse (Hwang et al., 1997). Specifically, the ϕ 1 phase cycling serves to eliminate axial peaks in the t 1 dimension caused by pulse imperfection as well as T 1 relaxation and amide exchange with solvent during T m1 , while also suppressing axial peaks in the t 2 dimension resulting from T 1 relaxation and water exchange during T m2 . To minimize the number of phase cycling steps, ϕ 2 was not phase cycled. 
However, this resulted in small residual axial peaks along the F 2 dimension caused by pulse imperfections. To ensure that these residual axial peaks were absorptive in the final spectrum, thereby simplifying SMILE NUS reconstruction, an echo is generated by the application of two composite 1 H 180° pulses in order to suppress initial chemical shift evolution at t 2 = 0, thereby eliminating the need for a linear phase correction. Considering that the real and imaginary components of the residual axial signals have the same amplitude, they result in a 45° phase error for the axial peaks in the F 2 dimension. Shifting the ϕ 2 phase by −45° ensures that the NOE and axial peaks both can be phased absorptive using the same phase correction, thus facilitating NUS processing.
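The origin of the 45° figure is simple complex arithmetic: a signal whose real and imaginary components have equal amplitude has a phase of arctan(1) = 45°, and a zero-order phase correction of −45° makes it purely real (absorptive). The short Python sketch below illustrates only this arithmetic; it does not model the pulse sequence, and the unit amplitude and variable names are assumptions chosen for the example.

```python
import cmath
import math


def phase_degrees(signal):
    """Phase angle of a complex signal amplitude, in degrees."""
    return math.degrees(cmath.phase(signal))


# Hypothetical axial signal with equal real and imaginary amplitudes.
axial = complex(1.0, 1.0)
axial_phase = phase_degrees(axial)

# Zero-order phase correction that rotates the signal onto the real axis.
correction = cmath.exp(-1j * math.radians(axial_phase))
axial_corrected = axial * correction

print(f"phase before correction: {axial_phase:.1f} degrees")
print(f"imaginary part after correction: {axial_corrected.imag:.1e}")
```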
Compared to the 4D TROSY-NOESY-TROSY pulse scheme, the 4D NOESY-NOESY-TROSY experiment avoids the lossy magnetization transfer step from 1 H to 15 N and back (leading to a slightly larger number of 241 diagonal peaks on the 15 N-1 H (F 3 , F 4 ) projected plane, compared to 231 for TROSY-NOESY-TROSY). Instead, its magnetization is simply transferred, in part, to its nearest neighbors by cross-relaxation during the first NOE mixing period of duration T m1 = 50 ms. There is virtually no loss in total spin polarization summed over the initial "starting spin", whose t 1 evolution is monitored, and those of its immediate neighbors that are within cross-relaxation contact. As a result, the intrinsic sensitivity of such NOESY-NOESY-TROSY measurements is quite high, allowing the choice of a long duration of 300 ms for the second NOE mixing time, T m2 . During this second, much longer mixing time, the z magnetization distributes over considerable distances due to indirect transfers (Fig. 3). Even in this extensively perdeuterated protein, NOEs to nearly a dozen neighboring protons are observed on the diagonals of the (F 1 , F 2 ) cross sections, taken at the same ( 15 N, 1 H) frequencies used for illustrating the utility of the 4D TROSY-NOESY-TROSY spectrum of Fig. 2. However, as pointed out by Boelens et al. (1989) and Breg et al. (1990), the NOE-NOE combination offers a wealth of new information on the cross-relaxation pathways that led to the long-distance NOEs, substantially aiding both the assignment and analysis of distance information. Below, we briefly highlight a few examples.
As expected, α-helical residue L232 shows intense cross peaks to both of its sequential neighbors, N231 and V233, as well as a weaker cross peak to F230. Despite the relatively short mixing time of only 50 ms that separates t 1 and t 2 evolution, the latter must result mostly from indirect transfer through N231, because N231 and F230 share an intense cross peak. In effect, each cross section through the 4D spectrum shown in Fig. 3 corresponds to a 2D NOESY spectrum of a small, localized region within the protein structure, which makes its analysis far simpler. For residues with few neighbors, direct NOE contacts between neighbors separated by as much as 4.5 Å give rise to quite intense cross peaks after 50 ms NOE mixing, as exemplified by the contacts between G195 and its A194 and T196 neighbors (Fig. 3g). A weaker cross peak between G195 and D197, at an interproton distance of 6.4 Å, appears not to be mediated by spin diffusion because the G195 and D197 panels (Fig. 3g) show no common strong NOE to any visible resonance. However, the possibility that the hydroxyl proton of T196 serves as a relay partner cannot be excluded.
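The distinction between direct and relayed transfer drawn above can be illustrated with a textbook relaxation-matrix calculation. The Python sketch below is not the analysis used in this work. It assumes a rigid, isotropically tumbling molecule with only 1 H-1 H dipolar relaxation, takes the 27 ns correlation time, 800 MHz field and 50 ms mixing time from the text, and uses hypothetical interproton distances (2.7 Å between sequential amides and 4.2 Å between amides two residues apart). It compares the cross peak between the two outer spins of a three-spin chain with the cross peak expected if the middle spin were absent.

```python
import math

MU0_OVER_4PI = 1.0e-7        # T m / A
HBAR = 1.054571817e-34       # J s
GAMMA_H = 2.6752218744e8     # rad s^-1 T^-1


def spectral_density(omega, tau_c):
    """Lorentzian spectral density for rigid isotropic tumbling."""
    return tau_c / (1.0 + (omega * tau_c) ** 2)


def pair_rates(r_angstrom, tau_c, larmor_hz):
    """Return (rho, sigma) in s^-1 for one 1H-1H pair at distance r."""
    omega = 2.0 * math.pi * larmor_hz
    r = r_angstrom * 1.0e-10
    k2 = (MU0_OVER_4PI * HBAR * GAMMA_H ** 2) ** 2 / r ** 6
    j0 = spectral_density(0.0, tau_c)
    j1 = spectral_density(omega, tau_c)
    j2 = spectral_density(2.0 * omega, tau_c)
    rho = 0.1 * k2 * (j0 + 3.0 * j1 + 6.0 * j2)   # auto-relaxation contribution
    sigma = 0.1 * k2 * (6.0 * j2 - j0)            # cross-relaxation rate
    return rho, sigma


def relaxation_matrix(distances, tau_c, larmor_hz):
    """Build the relaxation matrix from a symmetric distance table (in Å)."""
    n = len(distances)
    rates = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                rho, sigma = pair_rates(distances[i][j], tau_c, larmor_hz)
                rates[i][i] += rho
                rates[i][j] = sigma
    return rates


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def noe_matrix(rates, t_mix, terms=30):
    """Normalized NOE intensities exp(-R * t_mix), summed as a Taylor series."""
    n = len(rates)
    step = [[-rates[i][j] * t_mix for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


TAU_C = 27e-9       # s, from the text
LARMOR_HZ = 800e6   # Hz, from the text
T_MIX = 0.050       # s, first mixing period from the text

# Hypothetical three-spin chain (distances in Å): spins 0-1-2.
chain = [[0.0, 2.7, 4.2], [2.7, 0.0, 2.7], [4.2, 2.7, 0.0]]
full = noe_matrix(relaxation_matrix(chain, TAU_C, LARMOR_HZ), T_MIX)

# Same outer pair with the relay spin removed.
pair_only = [[0.0, 4.2], [4.2, 0.0]]
direct = noe_matrix(relaxation_matrix(pair_only, TAU_C, LARMOR_HZ), T_MIX)

print(f"outer-pair cross peak with relay spin:    {full[0][2]:.4f}")
print(f"outer-pair cross peak without relay spin: {direct[0][1]:.4f}")
```

In this idealized model the row sums of the NOE matrix stay very close to one, which mirrors the remark that total polarization is essentially conserved among spins in cross-relaxation contact. Real amides have additional neighbors and leakage pathways, so the numbers are indicative only.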
The NOESY-NOESY-TROSY spectrum also shows multiple NOEs to sidechain amide protons that are not visible in the TROSY-NOESY-TROSY spectrum because the TROSY element does not select magnetization transfer for NH 2 groups. For example, D197 shows long-range NOEs to the N133 carboxamide protons, whereas Q69 shows NOEs to both its own carboxamide protons and to those of Q74. The non-equivalent NH 2 pairs are readily recognized by cross peak to diagonal peak intensity ratios that are close to one, owing to their short interproton distance.

Concluding remarks
The spectra shown in this study were recorded during the summer of 2020, when access to campus facilities was strongly restricted due to COVID-19 pandemic mitigation efforts. These restrictions allowed for much lengthier acquisition of spectra than commonly used, for a total of 8 d for the two 4D spectra. As a benefit of NUS reconstruction, it is possible to generate spectra of the same resolution recorded in any fraction of that time. Alternatively, we can discard the data recorded at the longest values of t 1 , t 2 , and t 3 . Indeed, when the same time-domain data sets are processed after shortening the time domains with a previously described protocol that considers the total normalized length of the 3D (t 1 , t 2 , t 3 ) time-domain vector, so that only one-third of the acquired time-domain data are used, the resulting spectra are very similar to the ones shown in Figs. 2 and 3, albeit at slightly lower resolution and signal to noise. Nevertheless, the quality of the resulting spectra remains excellent, with near-identical information content (Figs. S2 and S3 in the Supplement).
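One way to picture such a truncation is to rank every sampled (t 1 , t 2 , t 3 ) point by the length of its time-domain vector after each index has been normalized to the size of its dimension, and to keep only the shortest third. The Python sketch below follows that reading. It is an assumption about what "total normalized length" means and is not the protocol referred to above; the function names, the random schedule, the 60 × 60 × 90 grid and the count of 5482 sampled points are all illustrative.

```python
import math
import random


def normalized_length(point, sizes):
    """Length of the index vector after scaling each axis to the range 0..1."""
    return math.sqrt(sum((idx / (size - 1)) ** 2 for idx, size in zip(point, sizes)))


def keep_shortest(schedule, sizes, fraction):
    """Keep the given fraction of sampled points with the shortest vectors."""
    ranked = sorted(schedule, key=lambda p: normalized_length(p, sizes))
    return ranked[: int(len(ranked) * fraction)]


# Hypothetical unweighted random schedule on a 60 x 60 x 90 grid.
SIZES = (60, 60, 90)
rng = random.Random(0)
flat = rng.sample(range(SIZES[0] * SIZES[1] * SIZES[2]), 5482)
schedule = [(f // (SIZES[1] * SIZES[2]), (f // SIZES[2]) % SIZES[1], f % SIZES[2])
            for f in flat]

kept = keep_shortest(schedule, SIZES, 1.0 / 3.0)
print(f"kept {len(kept)} of {len(schedule)} sampled points")
```

Because long evolution times carry the weakest signal, discarding the points with the longest normalized vectors costs mainly resolution rather than sensitivity per unit time, which is in line with the modest losses described above.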
Use of the lengthy data acquisition times needed to collect the 4D spectra requires a high stability sample, which in our case benefited from the C145A active site mutation, protecting the sample from auto-proteolysis. As with all NMR experiments, S/N is approximately proportional to sample concentration. Therefore, working at high concentrations benefits S/N of these experiments that involve multiple magnetization transfer steps, an issue that is particularly important for NOE experiments where magnetization from a single nucleus is distributed over many neighbors.
We note that the TROSY-NOESY-TROSY experiment used a long NOE mixing time of 200 ms, so as to increase the number of observed connectivities by adding indirect NOE effects, including spin diffusion through hydroxyl protons (Koharudin et al., 2003), thereby aiding the assignment process. The use of a 50 ms NOE mixing period in the subsequent 4D NOESY-NOESY-TROSY experiment then provided a semi-quantitative measure of distance between these protons and their neighbors. Indeed, as pointed out by Kaptein and co-workers, recording of NOE-NOE spectra provides important experimental data on the pathway of magnetization transfer during NOE mixing. Such information could be used to convert these data into more quantitative distance information than the typical qualitative analysis of NOE intensities, potentially leading to the generation of higher resolution structures (Vogeli et al., 2009, 2012). Quantitative NOE interpretation traditionally relied on the recording of a series of NOE buildup data, which can become as time-consuming as the recording of 4D NMR spectra if resonance overlap is a limiting factor, as typically is the case for NOE spectra. This problem is further exacerbated by the spectral crowding of large proteins, particularly in the 1 H dimension, and while 3D spectra may give higher signal-to-noise ratios than 4D spectra, downstream analysis frequently requires extensive disambiguation of overlapped peaks. Our study of M pro C145A shows that a large number of semi-quantitative NOE distances become accessible by recording of 4D NMR spectra on a perdeuterated larger protein with little or no ambiguity about the nuclei involved. While the high signal to noise and spectral simplicity of working with perdeuterated proteins have long been recognized (Torchia et al., 1988; Lemaster and Richards, 1988; Tugarinov et al., 2004), the number of structural restraints accessible used to be small. 
Our present study demonstrates that a much larger number of NOE interactions becomes available by the recording of 4D NOE spectra. Moreover, it highlights the exquisite detail and value of NOE-NOE interaction analysis explored by the Kaptein group and it demonstrates that this approach is highly suitable for the larger biomolecules and biomolecular complexes being explored today, in particular when using extensive perdeuteration. Therefore, we believe that the recording of high quality 4D NMR spectra of the type presented in this study is entirely practical and invaluable for the structural and functional analysis of large proteins and their complexes, with possible extension to the study of nucleic acids. We note, however, that in the absence of extensive deuteration the dilution of nuclear magnetization over sidechain resonances will strongly lower the sensitivity of the experiment, which is further exacerbated by decreased effectiveness of TROSY-based line narrowing in such samples. On the other hand, adaptations of the NOESY-NOESY-TROSY experiment to methyl-protonated but otherwise perdeuterated proteins (Tugarinov et al., 2005b) are expected to be readily feasible.
Data availability. The raw Bruker NMR data sets including the acquisition parameters and NUS sampling lists, pulse programs, include file, and NMRPipe processing scripts are available for download from Zenodo: https://zenodo.org/record/4625615 (Robertson et al., 2021).
Author contributions. AJR expressed and purified protein samples, collected and analyzed the data, and edited the manuscript; JY optimized pulse sequence parameterization and processing and edited the manuscript; AB supervised the project and wrote the manuscript.
Competing interests. The authors declare that they have no conflict of interest.

Special issue statement.
This article is part of the special issue "Robert Kaptein Festschrift". It is not associated with a conference.