Insights into Protein Dynamics from 15N-1H HSQC

Protein dynamic information is customarily extracted from 15N NMR spin-relaxation experiments. These experiments can only be applied to (small) proteins that can be dissolved to high concentrations. However, most proteins of interest to the biochemical and biomedical community are large and relatively insoluble. These proteins often undergo functional conformational changes, and it is particularly regrettable that these processes cannot be supplemented by dynamical information from NMR. We ask here whether (some) dynamic information can be obtained from the 1H line widths in 15N-1H HSQC spectra. Such spectra are widely available, also for larger proteins. We developed computer programs to predict amide proton line widths from (crystal) structures. We aim to answer the following basic questions: is the 1H linewidth of an HSQC cross peak smaller than average because its 1H nucleus has few dipolar neighbors, or because the resonance is motionally narrowed? Is a broad line broad because of conformational exchange, or because the 1H nucleus resides in a dense proton environment? We calibrate our programs by comparing computational and experimental results for GB1 (56 residues). We deduce that GB1 has low average 1HN order parameters (0.8), in broad agreement with what was found by others from 15N relaxation experiments (Idiyatullin et al., 2003). We apply the program to the BPTI crystal structure and compare the results with a 15N-1H HSQC spectrum of BPTI (58 residues) and identify a cluster of conformationally broadened 1HN resonances that belong to an area for which millisecond dynamics has been previously reported from 15N relaxation data (Szyperski et al., J. Biomol. NMR 3, 151-164, 1993). We feel that our computational approach is useful to glean insights into the dynamical properties of larger biomolecules for which high-quality 15N relaxation data cannot be recorded.

https://doi.org/10.5194/mr-2021-30
Preprint (Discussions, Open Access). Discussion started: 9 April 2021. © Author(s) 2021. CC BY 4.0 License.


Let us take a look at the 15N-1H HSQC spectrum of GB1 (Figure 1). We take this protein as our "calibration" case.
The assignments are available at the Biological Magnetic Resonance Bank, while high-resolution crystal structures are available in the Protein Data Bank. The intensities of the cross peaks in the spectrum vary by a factor of 3 and the 1HN linewidths vary by a factor of 2 (see Table S1 in the Supplemental Materials). The spectrum shows several doublets (e.g. N8), due to the 3JHNHA coupling. What causes the variation in 1HN line width? There are several possibilities. First, (unresolved) 3JHNHA contributes to the measured linewidth. Second, the dipolar proton environment of each amide proton varies. This causes differences in the relaxation rates, R2 1HN-1HX. Third, anisotropic rotational diffusion may cause differences in the line widths. Fourth, the amide proton resonances could be life-time broadened by mass exchange with water. Fifth, last but not least, local dynamics could affect the line widths, either by narrowing (fast local dynamics) or by broadening (conformational exchange dynamics on the μs-ms timescale), e.g. disulfide isomerization, or general conformational flexibility.
We will address these points one by one. Can anisotropic rotational diffusion cause the line width differences? GB1 is an ellipsoid with a long/short axis ratio of 1.8 (Idiyatullin et al., 2003). We calculate from the classical Woessner equations (Woessner, 1962) that R2 for the 1HN-15N dipolar interaction varies by ±12% when considering different angles between a relaxation vector and the diffusion axes. But that is for individual relaxation vectors. The 1HN-1HX relaxation vectors contributing to the dipolar R2 relaxation of a particular 1HN point in different directions; so, in practice, the small orientational effects will mostly cancel.
The intrinsic (unprotected) amide proton exchange rate is given by the empirical relation (Englander et al., 1972):

$$k_{ex} = \frac{\ln 2}{200}\left[10^{\mathrm{pH}-3} + 10^{3-\mathrm{pH}}\right] \times 10^{0.05T} \qquad [1]$$

where T is in °C and kex in min⁻¹.
From the experimental parameters of the spectrum (3 °C, pH 6.5) we calculate from Eq. [1] a 0.26 s⁻¹ exchange rate, giving rise to a 0.1 Hz life-time broadening for unprotected amide proton resonances. Amide protons engaged in H-bonds within the protein will exchange much slower, with even less broadening. We find that variation in amide proton exchange is not significant for this spectrum.

Figure 1. Section of the 600 MHz 15N-1H HSQC spectrum of GB1 at 3 °C, pH 6.5. Processed with 1 Hz EM in t2, cos² in t1. The assignments were taken from BMRB entries 7280, 25909 and 26716. The numbering is as in PDB entry 6c9o.pdb. The cross sections show the Gaussian line shape fits as carried out by Sparky (Goddard and Kneller, 2000).
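The exchange-rate estimate from Eq. [1] can be reproduced with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the conversion from exchange rate to line broadening assumes a Lorentzian life-time broadening of k/π (full width at half height), which is an assumption not spelled out in the text.

```python
import math

def intrinsic_exchange_rate_per_min(ph, temp_c):
    """Eq. [1]: intrinsic amide proton exchange rate in min^-1 (T in deg C)."""
    acid_base_term = 10 ** (ph - 3) + 10 ** (3 - ph)
    return (math.log(2) / 200.0) * acid_base_term * 10 ** (0.05 * temp_c)

def lifetime_broadening_hz(k_per_s):
    """Assumed Lorentzian life-time broadening (FWHH) for a first-order rate k."""
    return k_per_s / math.pi

# GB1 conditions quoted in the text: 3 deg C, pH 6.5.
k_per_s = intrinsic_exchange_rate_per_min(ph=6.5, temp_c=3.0) / 60.0
broadening = lifetime_broadening_hz(k_per_s)
print(f"k_ex = {k_per_s:.2f} s^-1, broadening = {broadening:.2f} Hz")
```

With the GB1 conditions this gives an exchange rate close to the 0.26 s⁻¹ quoted above and a broadening of the order of 0.1 Hz.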
In principle, the specific proton environment of each amide proton is known from the (high-resolution) structure of GB1.
Hence, the dipolar R2 relaxation of 1HN due to its surrounding protons can be calculated, given the three-dimensional structure.
The (unresolved) scalar coupling is also knowable and can be obtained from the structure as well, using the following Karplus equation (Lee et al., 2015): [2] where φ is the dihedral angle spanned by C'-N-Cα-C'.
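To illustrate how a 3JHNHA coupling is obtained from a backbone dihedral angle, the sketch below evaluates a Karplus relation of the usual form J = A cos²(φ - 60°) + B cos(φ - 60°) + C. The coefficients are those of the widely used Vuister and Bax (1993) parameterization and are an assumption here; they are not necessarily the coefficients of equation [2] (Lee et al., 2015). The function name is illustrative.

```python
import math

# ASSUMPTION: Vuister and Bax (1993) coefficients in Hz, swapped in for
# illustration; the paper's equation [2] may use different values.
A_HZ, B_HZ, C_HZ = 6.51, -1.76, 1.60

def j_hnha(phi_deg, a=A_HZ, b=B_HZ, c=C_HZ):
    """3J(HN,HA) in Hz from the backbone dihedral phi (degrees)."""
    theta = math.radians(phi_deg - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c

# Typical backbone angles: extended strand (phi ~ -120) and helix (phi ~ -60).
for phi in (-120.0, -60.0):
    print(f"phi = {phi:6.1f} deg -> 3J(HN,HA) = {j_hnha(phi):.2f} Hz")
```

Extended conformations give large couplings and helical conformations small ones, which is why some doublets are resolved in the spectrum while others only add to the apparent linewidth.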

Summarizing, we have a handle on variables 1 through 4 that affect the 1 HN line width. Hence, making a calculation of these variables, and comparing the resulting calculated linewidth with the experimental (reduced, see below) linewidth, should uncover the presence (or not) of the dynamic properties of the protein, in a sequence-specific fashion.

Theory of R2 1HN-1HX relaxation
Measuring R2 relaxation in proteins has been taken on by Bodenhausen and co-workers (Boulat and Bodenhausen, 1993; Segawa and Bodenhausen, 2013). Their work has focused on obtaining "pure" R2 rates for resolved 1H resonances in 1D NMR spectra, using extremely selective pulses and selective spinlocks. As far as we know, they have not published a linewidth fit of a complete HSQC spectrum, as we are trying to do here.
As virtually all 1H resonances are resolved in the small proteins GB1 and BPTI, the pure 1HN-1HX R2 relaxation rate as measured from the cross peaks in a 15N-1H HSQC is given by the R2 relaxation rate for unlike spins (Goldman, 1988): [3] where μ0 is the permeability of free space, γ are the gyromagnetic ratios, ħ is Planck's constant divided by 2π, ω the resonance frequency, and τc the rotational correlation time.
Exceptions to equation [3] will arise when the interacting protons have identical chemical shifts (within the linewidth). This may happen for a few 1HN-1HN dipolar pairs. In that case one should use the "identical" equation (Goldman, 1988): [4]
An equation describing a smooth transition between identical and non-identical spins has been given by Goldman (1988).
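The difference between the "unlike" and "identical" cases can be made concrete with the textbook expressions for dipolar R2 relaxation of a proton pair in an isotropically tumbling rigid molecule. The sketch below is an assumption in that sense: it uses the standard forms with J(ω) = τc/(1 + ω²τc²), which may differ in notation and normalization from equations [3] and [4] as written by Goldman (1988). All names are illustrative.

```python
import math

MU0_OVER_4PI = 1.0e-7        # T m A^-1
HBAR = 1.054571817e-34       # J s
GAMMA_H = 2.6752218744e8     # rad s^-1 T^-1

def spectral_density(omega, tau_c):
    """Lorentzian spectral density for isotropic rigid tumbling."""
    return tau_c / (1.0 + (omega * tau_c) ** 2)

def r2_proton_pair(r_angstrom, tau_c, field_mhz=600.0, like=False):
    """Dipolar R2 (s^-1) of one proton due to one other proton.

    like=False: spins with resolved chemical shifts ("unlike").
    like=True:  spins with identical chemical shifts ("like").
    """
    r = r_angstrom * 1.0e-10
    b = MU0_OVER_4PI * HBAR * GAMMA_H ** 2 / r ** 3   # dipolar constant, rad/s
    omega = 2.0 * math.pi * field_mhz * 1.0e6
    j0 = spectral_density(0.0, tau_c)
    j1 = spectral_density(omega, tau_c)
    j2 = spectral_density(2.0 * omega, tau_c)
    if like:
        return b ** 2 / 20.0 * (9.0 * j0 + 15.0 * j1 + 6.0 * j2)
    return b ** 2 / 20.0 * (5.0 * j0 + 9.0 * j1 + 6.0 * j2)

# Example: a 2.2 Angstrom proton pair with a 7 ns correlation time at 600 MHz.
unlike = r2_proton_pair(2.2, 7.0e-9)
like = r2_proton_pair(2.2, 7.0e-9, like=True)
print(f"unlike: {unlike:.1f} s^-1, like: {like:.1f} s^-1, ratio {like / unlike:.2f}")
```

In the slow-tumbling limit the J(0) term dominates, so the "like" rate approaches 9/5 of the "unlike" rate; this is why it matters which equation is applied to which spin pair.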
However, it is often difficult to measure a "pure" R2 for HN, because it is affected by anti-phase relaxation due to the scalar-coupled 1HA (Peng and Wagner, 1992): the anti-phase coherence relaxes with an additional contribution from the R1 rate of the 1HA, and therefore one measures a weighted combination of the in-phase and anti-phase rates, where fIP is the fraction of time when the coherence is in-phase. In the case of the GB1 HSQC spectrum, where we used a 227 ms t2 acquisition period, we calculate that for 3JHNHA > 3 Hz, (1 - fIP) is close to 0.5. Hence the R1 rate of the 1HA does come into play up to 50% and affects the 1HN linewidth. The key question is now whether the HA R1 rate is the "selective" or "unselective" R1. For macromolecules, the difference between the rates is very large (~10 s⁻¹ for selective, ~0.2 s⁻¹ for unselective).
The literature unanimously states that it is the fast "selective" rate. In the careful 1H R2 measurements of Boulat and Bodenhausen (1993), this is certainly the case: they employ extremely selective pulses that excite a single 1H resonance in the NMR spectrum. The scalar coupling, if allowed to develop, then affects the z-magnetization of just one other proton.
All other 1H spins are unperturbed and serve as a cross-relaxation bath for that latter proton, driven by the large spectral density term J(0) (actually J(ωA - ωX)) (Goldman, 1988): [7]
But our case is different. For the 1HN linewidth in an HSQC, we need to consider the magnetic environment of the scalar-coupled 1HA during the FID. In fact, all 1H magnetizations are in the xy plane, and thus saturated, by the last 1H 90 degree pulse. Hence the relaxation bath is "hot" and cross relaxation between the 1HA and other protein protons is expected to be slow. We should thus expect that the unselective, slow, 1HA R1 rate is at play in this case (Goldman, 1988): [8]
This argument may need refinement. After all, the fast-HSQC experiment we employed was designed to return the water magnetization to +z during the FID. Therefore, most of the 1HA magnetizations will also be in +z and their R1 rate will be affected by other aliphatic protons that are saturated. All in all, the situation can become dependent on the details of the experiments and chemical shift distributions. In the Results section we will employ a "see what fits best" approach.
The entire in-phase / anti-phase and ensuing "selective" / "unselective" issue is of course moot when one studies perdeuterated proteins. But there are other methods to avoid the problem. Methods to measure pure R2 in the presence of scalar couplings have been developed by Bodenhausen and co-workers (Boulat and Bodenhausen, 1993; Segawa and Bodenhausen, 2013), and by Morris and co-workers (Aguilar et al., 2012). In the original Bodenhausen approach, one obtains a pure HN R2 when selectively exciting a single amide proton resonance, and selectively spin locking it with a very weak r.f. field.
The relaxation rate of that spin-locked resonance is unambiguously described by (Brüschweiler, 1991): [9], which depends on the offset between the r.f. carrier and the resonance frequency.
Offset issues also lead to (Massi et al., 2004): [10] (In the Bodenhausen experiment, the r.f. carrier is on the locked resonance, and off-resonance effects do not come into play.)
However, this experiment is rather impractical for proteins; there is not enough amide proton resolution in the spectrum of even a small protein such as GB1 to select more than a few resonances individually. We are therefore using a variation of this approach. As suggested by Dr. G. Morris (Manchester), we selectively excite all amides, and spin lock them at high power. This avoids the issues with the scalar coupling. Also, under these conditions, equation [9] becomes identical to equation [3]. Regrettably, it brings in a new issue: are the spin-locked protons relaxing according to the "like" or "unlike" equations? Common wisdom is that since the aliphatic protons are not spin-locked, amides and aliphatics are "unlike" and follow equation [9]. We had some doubt and elected to test this experimentally. We compared the apparent R1rho rate for the amide of F52 (9.5 ppm) between two spin-lock fields, at 5 kHz and 500 Hz. According to the structure, the dipolar relaxation of this amide is dominated by the HA of residues 52 and 51. Hence the 5 kHz r.f. field "hits" those HA whereas the 500 Hz r.f. field does not. And yet, the T1rho is identical for both locking fields (see Figure 2), with the same factor of 2.0 drop in intensity between 50 and 75 ms of spinlocking (serendipitous). This convinced us that the excitation profile, and not the locking field strength, determines what are "like" or "unlike" spins. For those amide protons for which dipolar partners such as other amide protons, aromatic ring protons, and Gln, Asn and Arg sidechain protons are also selected and properly spin-locked, the relaxation will be given by a "like" spin equation (Bothner-By et al., 1984): [11] In the corresponding limit, this reduces to the "like" spin relaxation rate [5]. Hence, one can predict with great certainty which equation should be used for which spin pair.
Furthermore, the spin-lock virtually eliminates broadening by conformational exchange, unless the spin-lock field, being in the kHz range, comes into "resonance" with conformational/chemical exchange processes at the same timescale.
All in all, the theory behind the T1rho experiment is more robust than the theory behind the cross peak linewidth. Therefore, we will use the T1rho experiment and calculations as our departure point for further calculations.
Even in small proteins, many protons interact magnetically. In our programs, described in the appendix, we find typically that 40 protons are present in a 6 Å sphere around an amide proton. All R2 or R1rho relaxation rates of these N other protons j for an amide proton i will co-add if the relaxation vectors ij diffuse independently from each other:

$$R_{2,i} = \sum_{j=1}^{N} R_{2,ij} \qquad [12]$$

Taking a larger sphere of interacting protons does not significantly change the summation (see Table S3 in the Supplemental Materials).
Obviously, the assumption underlying equation [12] cannot be correct, because the interacting protons in a protein are not diffusing independently. One has to consider dipole-dipole cross-correlated R2 relaxation (also called relaxation interference).
However, we can show that relaxation interference is almost completely canceled in multi-spin systems, and can be neglected as a source of large deviations from equation [12] (see Appendix).

3. Results and Discussion
Our experimental / computational approach is the following. First we analyze the T1rho experiments for GB1, and extract the effective rotational correlation time; then we use that correlation time to analyze the GB1 amide proton linewidth and decide, experimentally, whether the R1 relaxation rate for 1HA contributing to the effective R2 relaxation rate of 1HN in the HSQC (Equation [6]) is closer to the "selective" or the "unselective" R1.
Subsequently, we use what we have learned from GB1 to calculate the HSQC linewidths for BPTI, and analyze the results for dynamical content, and compare with the literature.

Calibration: GB1 T1rho
For GB1, we collected not only the HSQC spectrum of Figure 1, but also a series of spectra in which we aim to measure the 1HN R1rho rate by using an amide-selective excitation pulse followed by a fairly strong spinlock field (5.3 kHz) of varying duration, followed by an HSQC read-out. In this experiment, called semi-selective T1rho-HSQC, the 3JHNHA scalar coupling is suppressed. The pulse sequence and the relaxation data obtained are shown in the Supplemental Materials.
With few exceptions, the relaxation data can be fitted with a single exponential with R² > 0.95.
For the computations, there are several high-resolution crystal structures: 6c9o (V29SeM; 1.2 Å resolution), 6che (A34SeM; 1.1 Å), 6cne (V29SeM; 1.2 Å) and 6cpz (I6SeM; 1.12 Å). SeM is seleno-methionine. Inspection of the structures suggests that the mutation at I6 is the least intrusive on the structure. This mutant is a dimer in the crystal. We use only chain "A" for our calculations. The results for chain "B" are not significantly different. The proton coordinates were added with MolProbity (Williams et al., 2018).
The effective rotational correlation time for GB1, which is a prolate ellipsoid, was determined to be 6.5 ns from an extensive analysis of 15N relaxation data acquired at 5 °C (Idiyatullin et al., 2003). Using the empirical relation from (Daragan and Mayo, 1997) we extrapolate from the experimental data at 5 °C that τc = 7 ns at 3 °C. When using the same empirical equation directly, we find for GB1 that τc = 8.9 ns at 3 °C (see also Appendix).
At the outset, we note that by fitting the experimental and calculated T1rho data, we can only determine an effective rotational correlation time, which is the product ⟨S²⟩τc, where the brackets indicate the average over residues, and S² is an average order parameter for each 1HN describing the motions of the 1HN-1HX relaxation vectors (in terms of both distance and angular fluctuations).
As a start, we used the 7 ns effective correlation time to compute R1rho for GB1. The 1HN R1rho rates due to the 6 Å sphere of protons around each amide proton were calculated from equation [9] or [11], depending on whether the other spin was spin-locked or not. We did not need to concern ourselves with offset effects (see Eq. [10]). The comparison between experimental and calculated R1rho rates is shown in Figure 3. One sees that the range and median of the computed and experimental R1rho correspond very well, indicating that we have chosen the correct effective rotational correlation time for the calculations. Actually, we optimized the effective correlation time by minimizing the RMSD between measured and calculated R1rho. At this point, it is important to recall that we can only optimize the effective correlation time; it may well be the theoretical 8.9 ns multiplied by an ⟨S²⟩ of 0.79. Indeed, one could do more experiments, e.g. the analysis of 15N relaxation, to get a better handle on that correlation time, but, as explained in the introduction, we would like to find a way to obtain dynamical information without such experiments. While the ranges of calculated and experimental R1rho rates correspond well, the actual correlation is poor. Calculated data points larger than the experimental ones can be explained by low order parameters. Experimental data points larger than computed ones are harder to explain. In T1rho the latter cannot, in general, be caused by exchange broadening, as it is suppressed by the spin lock.
For now we chose not to be concerned by these issues as we want to use the experiment to help us decide the (effective) rotational correlation time.

Calibration: GB1 HSQC
With the R1rho data analyzed and the effective correlation time estimated, we turn our attention to the HSQC spectrum itself. The HSQC spectrum in Figure 1, processed with a 1 Hz exponential window in t2, shows many of the 3JHNHA scalar couplings, which closely correspond to the scalar couplings we compute from the crystal structure using the Karplus equation [2]. The Sparky software (Goddard and Kneller, 2000) does not fit resolved doublets as a pair, but as a single resonance, as shown in the cross sections in Figure 1. The fits, using a Gaussian lineshape, were individually inspected and found to be excellent, with an estimated uncertainty of less than 1 Hz.
Before we can make comparisons between experimental R2 and computed R2 data, we have to correct the measured 1HN line widths for several effects. First, we subtract the computed scalar couplings. Second, we need to consider 1HN dipolar relaxation due to the amide nitrogen, chemical shift anisotropy relaxation and field inhomogeneity. We calculate that for τc = 7 ns, the 1HN dipolar interaction with 15N accounts for ~2 Hz and that the 1H CSA contributes ~0.2 Hz at 600 MHz (using CSA values from (Loth et al., 2005)), while field inhomogeneity typically is limited to 1 Hz. We decoupled 13CO during data acquisition, but the 2JHNCA of 2 Hz (Schmidt et al., 2011) should also contribute to the 1HN linewidth. We thus are inclined to subtract 5 Hz from the apparent experimental 1HN line widths in addition to the 1 Hz due to the window function (total 6 Hz).
What is left is what we call the "reduced experimental line width", which should consist of just the sum of the 1HN-1HX dipolar line widths, affected by antiphase relaxation due to the 3JHNHA and potentially affected by the fast and/or slow dynamics we try to uncover. However, we find, by simulation, that unresolved scalar couplings add a full 1 Hz less to the linewidth than expected. When assuming that the instrument was well-shimmed, it is thus reasonable to estimate that one should subtract just 2-3 Hz from the observed linewidths. We will determine what is best from the calculations.

In Figure 4A we show the results of these calculations. For panel A, we subtracted 3 Hz from the experimental linewidth (in addition to 3JHNHA) as outlined above. It is clear that, when doing that, the linewidths calculated taking the unselective HA R1 into account (filled circles) are on average too large (RMSD 1.91 Hz). It is also clear that the calculations taking the selective HA R1 into account turn out worse (open circles; RMSD 3.00 Hz). For Figure 4B, we subtracted just 2 Hz from the experimental linewidth (in addition to 3JHNHA). Now the median value of the "unselective" calculation corresponds better to that of the reduced experimental values and we obtain an RMSD of 1.75 Hz. For the "selective" calculation we obtain an RMSD of 2.29 Hz. This is what we set out to resolve. The theory (and my consultations with several colleagues) does not establish unambiguously which HA R1 rate is to be used; but the comparison between experiment and calculation does. Clearly, we need the "unselective" HA R1 in this experiment. In practice, this rate is so small that the anti-phase relaxation rate is within a few tenths of s⁻¹ of the in-phase rate. Hence, we do not need to worry about in-phase / anti-phase issues. The situation will be different when using an HSQC pulse sequence that uses selective amide proton pulses throughout, not exciting anything else (Gal et al., 2007). In that case, the scalar-coupling-induced HA z-magnetization perturbation during the FID would be in an unperturbed aliphatic spin bath, and the fast "selective" HA R1 would be in effect.
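The comparison described above, subtracting the scalar coupling and a fixed instrumental contribution from each measured linewidth and then scoring the calculation by RMSD, can be sketched as follows. The function names and all numerical values in the example are hypothetical and purely illustrative; they are not data from the paper.

```python
import math

def reduce_linewidths(measured_hz, j_hnha_hz, offset_hz):
    """Reduced experimental linewidths: measured - 3J(HN,HA) - fixed offset."""
    return [m - j - offset_hz for m, j in zip(measured_hz, j_hnha_hz)]

def rmsd(a, b):
    """Root-mean-square deviation between two equally long sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical example values in Hz (NOT taken from the paper's tables).
measured = [16.0, 18.5, 14.0, 20.0]
j_coupling = [9.0, 8.0, 4.5, 9.5]
calculated = [5.0, 8.0, 7.0, 8.0]

for offset in (2.0, 3.0):
    reduced = reduce_linewidths(measured, j_coupling, offset)
    print(f"offset {offset:.0f} Hz: RMSD = {rmsd(reduced, calculated):.2f} Hz")
```

Running the same comparison for the "selective" and "unselective" sets of calculated linewidths, and for each offset, is the kind of scan that lets the data decide which model fits best.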

As explained before, we cannot determine a real correlation time from this fitting procedure. Rather, we determine the product ⟨S²⟩τc to be 7 ns. If we take τc = 8.9 ns at 3 °C as calculated from the empirical relation (Daragan and Mayo, 1997), we would obtain that ⟨S²⟩ equals 0.79. Parenthetically, we note that we have independently carried out an analysis of protein rotational correlation times as available in the literature, and found those to closely follow the empirical relation (Daragan and Mayo, 1997) (see appendix).
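As a quick check of the arithmetic, assuming that the fitted effective correlation time is simply the product of an average order parameter and the rotational correlation time (the symbols below are an assumed notation), the numbers quoted above give:

```latex
\langle S^2 \rangle \approx \frac{\tau_{\mathrm{eff}}}{\tau_c}
  = \frac{7\ \mathrm{ns}}{8.9\ \mathrm{ns}} \approx 0.79
```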
How does the estimated ⟨S²⟩ compare with literature values? Obviously, there are no such values determined, but there is a comprehensive paper by (Idiyatullin et al., 2003) calculating 15N order parameters using several different approaches.
Using the extended Modelfree method (Clore et al., 1990), they obtain an average order parameter of 0.70, while using their own method, they obtain an average of 0.62. This suggests that GB1 is a rather dynamical molecule, and one may argue that this is reflected in the 1HN relaxation as well. However, we note that (Idiyatullin et al., 2003) obtain a rotational correlation time for the GB1 prolate ellipsoid of 6.5 ns from the 15N relaxation data acquired at 5 °C, while the empirical relation from the same lab would predict 8.6 ns. Even so, both methods point to a quite dynamic GB1 protein, even at temperatures as low as 3 °C.
In summary, for GB1, we have established by comparing T1rho and HSQC linewidth data that one can compute a reasonable range of 1HN R2 rates using just a single equation for unlike spins. When assuming a rotational correlation time as predicted by the literature, we predict from comparing the calculated and experimental 1HN linewidths that the average 1HN-1HA order parameter must be 0.79, indicating much internal motion. Others have come to similar conclusions analyzing 15N relaxation data (Idiyatullin et al., 2003). With this, we have arrived at our goal: we show that we can extract motional information from the 1HN linewidths in an HSQC spectrum by making simple calculations based on a crystal structure. In the case of GB1 this is overall motional narrowing. In BPTI, as we will see below, we can also extract conformational exchange line broadening that is not immediately apparent from the spectrum itself.

... is available in the Protein Data Bank. We processed the time-domain data with a 1 Hz exponential window in t2. At the contour level of Figure 5, several peaks are missing, indicating intensity dispersion. Table S2 shows a 30-fold range for S/N and an 8-fold range for the linewidth, much more than for GB1. Significantly, and in contrast to GB1 at 3 °C, this processed spectrum does not show any resolved 3JHNHA couplings at 30 °C. Nevertheless, the protein is not perdeuterated. According to three sources (Beeser et al., 1997; Daragan and Mayo, 1997; Sareth et al., 2000), the correlation time should be between 2.5 and 3.5 ns, much less than the 7 ns correlation time of GB1 at 3 °C. Why are these doublets missing? From Equation [1] we calculate a 1.15 s⁻¹ mass exchange rate, giving rise to a maximum broadening of ~0.4 Hz for unprotected amide proton resonances, so that cannot be the reason. We can, a priori, already assume that the sample must have been aggregated.
Indeed, MRD studies suggest that, at high concentrations, BPTI can form decamers in solution (Gottschalk et al., 2003). As it turns out, this sample of BPTI mimics a much larger molecule, and provides an excellent opportunity to test our method on a "large" protein.
In Figure 6, we show the comparison of experimental and calculated 1HN linewidths for BPTI. We subtracted, besides the calculated scalar couplings, 2 Hz from the linewidths as reported by Sparky (15N-1H dipolar, 1HN CSA and shimming).
Three of the five excessively broadened resonances belong to a region at the bottom left of the protein in Figure 7A. This protein area comprises two anti-parallel beta strands with residues 10-15 and 36-40, and harbors the Cys14-Cys38 disulfide. Returning to Figure 6, one would be hard-pressed to declare the orange points as exchange broadened or not. They are on the edge of the bulk of the distribution. But the calculated linewidth for these points is just 6 Hz; this helps in deciding the matter, suggesting that the experimental data are exchange broadened. The two orange points correspond to A16 and A40, which, significantly, are in the same area where the red points are in Figure 7A. Of the five yellow points in Figure 6, three map to the same area again (10, 12, 38), and one is residue Phe4, next to the excessively broad Asp3 (bottom right of Figure 7A). This broadening is likely amine-catalyzed amide proton mass exchange life-time broadening, which is so often seen for the N-terminal 3 to 4 residues in proteins. The last yellow point, at 6.8 ppm, belongs to N44.
The majority of the broadened 1HN resonances belonging to the red, orange and yellow data points in Figure 6B ... to a useable result. But there is more. In early work, (Szyperski et al., 1993) detected 15N exchange broadening for residues 14-16 and 38-39 in BPTI. Our current calculations (Figures 6 and 7) point to exactly the same area. (Szyperski et al., 1993) suggest that the 15N exchange broadening is a result of the Cys14-Cys38 disulfide isomerization at a stochastic rate of 500 s⁻¹ and a superposed conformational process of the entire area with a stochastic rate > 10,000 s⁻¹. If we assume that the changes in chemical shift associated with these conformational changes are the same for 15NH and 1HN in terms of ppm, we would expect the 500 s⁻¹ process to give rise to slow exchange or resonance doubling in 1H, which is not observed. Hence it is likely that the 1HN line widths are sensitive to the faster process.
The effect of mutations on the 15N relaxation of BPTI has also been studied. According to (Beeser et al., 1997), fast and slow dynamics are mostly absent in wt-BPTI, with order parameters between 0.8 and 0.9 (except for the C-terminus) and very little exchange broadening (~1 Hz), except for two areas (again) 14-15 and 38-40 (see Figure 4A of (Beeser et al., 1997)). This is in agreement with the data of (Szyperski et al., 1993). The effect of the mutation Tyr35Gly on the 15N relaxation parameters was also studied; it exacerbates the broadening in magnitude (up to ~3 Hz) and extent (comprising residues 10-20 and 32-43) (see Figure 4B in (Beeser et al., 1997)). We show Tyr35 in Figure 7. Interestingly, our 1H dynamic results also comprise that same extended area; it thus seems that the 1HN resonances are sensitive to extended dynamical processes present in the wild-type protein that are only observable by 15N relaxation after a (predictably) destabilizing mutation.
There are several points that are calculated to be significantly broader than the experiment; these data points are shown in magenta. They comprise I9, L29, R42 and A58. Such outliers would indicate resonances that according to the crystal structure coordinates should be broad (a dense proton environment), but are not broad in the experimental data. That would suggest fast local motion. The 15N order parameter for Ala58 is small (Beeser et al., 1997), which would corroborate the finding here. However, the 15N relaxation data do not show reduced order parameters for I9, L29 and R42. This may be a genuine difference in 15N and 1HN order parameters, or noise. We also have no quick rationale for the experimental broadening for Lys46 (red, at the top-middle of Figure 7A) and the calculated rate for Asn44 (yellow, just below it). Previous 15N relaxation work does not show anything particular for these residues.
But the resonance of K46 is actually missing at the contour level of the spectrum in Figure 5; so there is no question that something is going on there. We may speculate that small motions of the ring of Phe45, which hovers above amides 44 and 46, can translate into changing ring-current shifts, causing broadening for these resonances that is not due to an actual spatial change at the level of the amides themselves. A ring-current-driven mechanism could be consistent with the lack of broadening effects in the 15N spectral data: ring current effects are, as expressed in Hz, 10 times larger for 1H than for 15N. Hence, varying ring current shifts are apt to cause much more "conformational" exchange broadening for 1HN than for 15NH. Last, there are three points, Y21, G36 and G57, colored green.
The calculations identify them as broad as well, suggesting that the broad linewidth is intrinsic. According to the 15N data, no broadening is occurring for those residues either. It is significant for assessing the value of our calculations that the two rightmost green points in Figure 6B (G36 and G57) would be identified as exchange broadened from the experimental 1HN linewidth distribution, but that the calculations indicate that they are not.
But how do we do with our calculations within the bulk of the distribution? Figure 6B shows that it is not good at all, although not much worse than for GB1 (Figure 4B). There is hardly a correlation between experiment and calculation: the calculated values all lie around 7 Hz, while the experimental values vary by almost a factor of two. At the moment we have no explanation for this, merely suggesting that there is much room for improvement, likely in measuring / calculating the 1HN-1HX order parameters. We note that we used a BPTI crystal structure with 1.2 Å resolution, which has a coordinate precision of ~0.2 Å (DePristo et al., 2004). This can give rise to considerable errors in R2 calculations. For example, an HN(i) to HA(i-1) distance of nominally 2.2 Å in a beta sheet structure (Wüthrich, 1986) may be incorrect by 9%, and produce a 68% error for the R2 relaxation contribution. There is work to be done here, for sure. Nevertheless, just considering the outliers of the distribution appears to result in relevant dynamical information.
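The size of this error follows from the steep distance dependence of dipolar relaxation. As a worked check, assuming the pairwise R2 contribution scales as r⁻⁶ and comparing two distances that differ by 9%:

```latex
\frac{R_2(r)}{R_2(1.09\,r)} = \left(\frac{1.09\,r}{r}\right)^{6} = 1.09^{6} \approx 1.68
```

That is, a 0.2 Å uncertainty on a 2.2 Å distance changes the corresponding pairwise contribution by roughly 68%.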

Conclusion
We developed computer programs to predict amide proton line widths from (crystal) structures. We calibrate our programs by comparing computational and experimental results for GB1, using 15N-1H HSQC and semi-selective T1rho experiments. We find that we can predict the correct range of 1HN R2 relaxation rates from a crystal structure using a Karplus equation and a program based on just one relaxation equation. We deduce that GB1 has fairly low average 1HN order parameters (0.8), in broad agreement with what was found by others from 15N relaxation experiments. We apply the program to the BPTI crystal structure and compare the results with a 15N-1H HSQC spectrum of BPTI. After adjusting the correlation time, we find from the outliers in the distribution a cluster of conformationally broadened 1HN resonances that belong to an area for which broadened 15NH resonances have been previously reported. Thus, our approach can yield important dynamical data. We feel that this approach may be useful to glean insights into the dynamical properties of larger biomolecules for which high-quality 15N relaxation data cannot be recorded. The semi-selective T1rho experiments are also not difficult to perform and are also suitable for application to larger molecules. Comparing these relaxation data (T1rho and the 1HN linewidth) for proteins in different states or complexation forms is likely much more interesting. Perhaps the dynamical differences can be tied to functional properties, as has been carried out before for small proteins, but much less so for the larger ones. The theory of 1HN R2 for proteins is not iron-clad; issues such as "like" and "unlike" R2, "in-phase/antiphase" relaxation, "selective" and "unselective" R1 rates and cross-correlated R2 relaxation all play roles. As a by-product of our "calibration" work for GB1, we help resolve most of those (sometimes contentious) issues.

Acknowledgements
Dr. Gottfried Otting has graciously provided me with remote instrument time at the Canberra NMR facility, and the use of his sample of 13C, 15N-labeled GB1. I thank Professor Arno Kentgens (Nijmegen) for helpful discussions on solid state NMR powder patterns. I have taken advice from Professors Raphael Bruschweiler (Ohio State), Gareth Morris (Manchester) and Drs. Ad Bax (Bethesda), Bernhard Brutscher (Grenoble) and Anaya Majumdar (Baltimore).

Code/Data availability.
The Fortran90 computer codes are available from the author and will be deposited at https://github.com.
The (Bruker) data directories for the GB1 experiments will be deposited at the Biological Magnetic Resonance Bank.

Author contributions.
ERPZ conceived and wrote the paper. He wrote all computer codes and remotely carried out the GB1 NMR experiments.

Conflicting interests
None.
The program requests a PDB file in which the protons are present. The program makes an internal copy of the PDB file. It requests the radius of the sphere of protons to consider, the rotational correlation time, and the spectrometer frequency. The code consists of two loops; the outer loop advances over the amide protons one by one. The inner loop scans the copy of the coordinates and finds all protons (including HN) around the HN at hand within the defined radius. It co-adds all R2 rates according to Equations 3 and 12.
The programs calculating the 1HA R1 and 1HN R1 rates are almost identical to the R2 program, with changed equations.
The linear combinations of R2(HN) and R1(HA), as governed by Equation 6, were carried out in a spreadsheet, taking into account different amounts of in-phase / anti-phase admixtures based on the integration of model computations with different scalar couplings and acquisition times.
All programs are written in Fortran90 and contain no references to outside libraries. The source codes are available from the author and will be deposited at https://github.com.
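The two-loop structure described above can be sketched in a few lines. This is an illustration in Python, not the author's Fortran90 code: the function name `summed_pair_rates` and the `pair_rate` callback are hypothetical, the coordinates are assumed to be already extracted from the PDB file, and the per-pair relaxation expression (Equations 3 and 12 of the paper) is deliberately left to the caller.

```python
import math


def summed_pair_rates(amide_protons, all_protons, radius, pair_rate):
    """For each amide proton, co-add pair_rate(r) over all protons within `radius` (Å).

    `pair_rate` is a caller-supplied function of the distance r; it stands in
    for the per-pair relaxation expression, which is not reproduced here.
    """
    totals = []
    for hn in amide_protons:              # outer loop: amide protons, one by one
        total = 0.0
        for h in all_protons:             # inner loop: scan the copy of the coordinates
            r = math.dist(hn, h)
            if 0.0 < r <= radius:         # skip the amide proton itself (r == 0)
                total += pair_rate(r)
        totals.append(total)
    return totals


# Usage with a bare r**-6 weighting as a placeholder for the real rate expression.
amides = [(0.0, 0.0, 0.0)]
protons = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
print(summed_pair_rates(amides, protons, 6.0, lambda r: r ** -6))  # [0.015625]
```

Passing the rate expression as a function mirrors the remark that the R1 programs differ from the R2 program only in the equations used.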

Computer program based on Eqs. 12 and 13 (below).
Proton-proton cross-correlated R2 relaxation between just two dipolar vectors ij and ik is, adapted from Goldman (1984) and Fischer et al. (1998), given by [13], in which the relevant angle is the one between the two vectors ij and ik. The total R2 relaxation for proton i is then given by [14]. However, these individual line widths can only be observed if the transitions for the Hi multiplet are resolved by J-coupling (and/or residual static dipolar coupling). For amide protons, this will not be the case, and one expects an inhomogeneous line consisting of the superposition of many narrow and broader Lorentzian lines corresponding to a multi-spin expansion of Eq. To my knowledge, there is no closed equation describing R2 cross-correlated relaxation for more than two dipolar vectors.
To arrive at an estimate of the effects in a multi-proton spin system, we start from a "solid state NMR" point of view. We calculate Bloc, the net local magnetic field at center proton i due to the surrounding protons j (Slichter, 1992), for a given orientation of the magnetic field with respect to the molecule: [15] The angle in this expression is the one between the internuclear vector ij and the magnetic field direction in the molecular frame.
The configuration label in this expression represents a certain configuration of the signs of the surrounding dipoles j. For instance, for 10 protons one has 2^10 = 1024 different configurations. If one varies the magnetic field direction over a spherical distribution and adds the results, one obtains the cross-correlated powder pattern for that configuration. Subsequently one co-adds all powder patterns for the different configurations, and normalizes, to arrive at the "cross correlated" dipolar powder pattern for the 1HN under consideration.
It is the time dependence of Bloc, as caused by molecular motion, that drives the solution NMR dipolar relaxation. The R2 relaxation is then obtained as the second moment of the (cross-correlated) powder pattern (Slichter, 1992): [16] where the brackets indicate an average.
The computer program requires as input a "protonated" PDB file (HN for amides), the radius of the sphere of protons around the amide protons, the rotational correlation time and the spectrometer frequency. Basically, the program consists of four nested loops: amides, protons around amides, permutation of dipole signs of these surrounding protons, and rotation of the magnetic field vector in the molecular frame.
A set of 10 nested loops permutes the dipolar signs of the closest 10 hydrogens (1024 distributions). The more remote hydrogens in the sphere (if any) have their dipolar signs assigned according to a 50% random chance.
At the innermost level, a closed loop generates an isotropic spherical distribution (5000 orientations) for the (unit) "magnetic field" vector (http://corysimon.github.io/articles/uniformdistn-on-sphere/).
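The sign permutations and the isotropic set of field directions can be sketched as follows. This is a Python illustration under our own assumptions, not the author's Fortran90 code: `itertools.product` replaces the 10 nested loops, and the sphere sampling draws cos(theta) and phi uniformly, which is one of several equivalent ways to obtain an isotropic distribution. The function names and the fixed seed are ours.

```python
import itertools
import math
import random


def sign_configurations(n_protons):
    """All 2**n assignments of +1/-1 to the dipoles of the n closest protons."""
    return list(itertools.product((1, -1), repeat=n_protons))


def random_unit_vectors(count, seed=0):
    """Isotropically distributed unit vectors: uniform in cos(theta) and in phi."""
    rng = random.Random(seed)
    vectors = []
    for _ in range(count):
        z = rng.uniform(-1.0, 1.0)              # cos(theta), uniform on [-1, 1]
        phi = rng.uniform(0.0, 2.0 * math.pi)   # azimuth, uniform on [0, 2*pi]
        s = math.sqrt(1.0 - z * z)
        vectors.append((s * math.cos(phi), s * math.sin(phi), z))
    return vectors


configs = sign_configurations(10)
fields = random_unit_vectors(5000)
print(len(configs), len(fields))  # 1024 5000
```

Sampling theta itself uniformly would crowd the directions toward the poles, which is why cos(theta) is drawn uniformly instead.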
The angle between the (unit) magnetic field vector and the dipolar vector between HN and the surrounding proton is computed as the arccosine of the (normalized) dot product of those vectors.
The local dipolar field of an individual surrounding proton at 1HN is then calculated according to Equation 14. This is repeated for all surrounding protons in the shell and co-added.
At this stage the program has the local field for a certain 1HN, in a certain orientation, for a certain permutation of surrounding dipole signs. This is repeated for all 5000 orientations, so that it obtains the 1HN powder pattern for a certain permutation of the surrounding dipole signs.
Subsequently the corresponding solution NMR line width is computed from this distribution by the method of second moments (Equation 15). The next step is to repeat this for all 1024 permutations. The line widths are all added and normalized, yielding the inhomogeneous linewidth. The inverse line widths, which are proportional to the peak height, are also added.
After that the outer loop advances to the next HN.
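The steps for a single 1HN can be sketched as below. This is a reduced-unit Python illustration under stated assumptions, not the author's Fortran90 code: the local field is taken as the sum over neighbours of sign × (3cos²θ − 1)/r³ with all physical prefactors omitted, a deterministic equal-area grid replaces the 5000 sampled orientations, the cosine is used directly instead of going through the arccosine, and the second moments (rather than converted line widths) are averaged over the sign permutations. All function names are ours.

```python
import itertools
import math


def sphere_grid(n_polar=60, n_azimuth=120):
    """Equal-area grid of unit vectors (midpoint rule in cos(theta) and phi)."""
    points = []
    for i in range(n_polar):
        z = -1.0 + (2 * i + 1) / n_polar
        s = math.sqrt(1.0 - z * z)
        for k in range(n_azimuth):
            phi = 2.0 * math.pi * (k + 0.5) / n_azimuth
            points.append((s * math.cos(phi), s * math.sin(phi), z))
    return points


def local_field(b, neighbours, signs):
    """Reduced local field at the amide proton (at the origin) for unit field direction b."""
    total = 0.0
    for (x, y, z), sign in zip(neighbours, signs):
        r = math.sqrt(x * x + y * y + z * z)
        cos_t = (b[0] * x + b[1] * y + b[2] * z) / r   # normalized dot product
        total += sign * (3.0 * cos_t * cos_t - 1.0) / r ** 3
    return total


def second_moment(neighbours, signs, grid):
    """Orientation average of the squared local field (powder-pattern second moment)."""
    return sum(local_field(b, neighbours, signs) ** 2 for b in grid) / len(grid)


def mean_second_moment(neighbours, grid):
    """Average of the second moments over all sign permutations of the neighbours."""
    configs = list(itertools.product((1, -1), repeat=len(neighbours)))
    return sum(second_moment(neighbours, c, grid) for c in configs) / len(configs)


grid = sphere_grid()
neighbours = [(0.0, 0.0, 1.0), (2.0, 0.0, 0.0), (0.0, 1.5, 1.5)]  # positions relative to HN, Å
print(mean_second_moment(neighbours, grid))
```

For a single neighbour at unit distance the orientation average of (3cos²θ − 1)² is 4/5, which gives a quick sanity check on the grid. Individual sign permutations differ through the cross terms between neighbours; in this reduced sketch those terms cancel when the second moments are averaged over all permutations.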
The program is written in Fortran90 and contains no references to outside libraries. The source code is available from the author and will be deposited at https://github.com.
Figure A1 shows a comparison between the line widths computed for GB1 with and without cross correlation. The data with cross correlations were computed using the "solid state" approach above, taking into account all protons within a sphere of 6 Å. For the non-cross-correlated data, we used the same approach, but instead of calculating 1024 different specific permutations, we used 1024 random distributions of surrounding dipoles, and averaged those. The Figure shows that there can be up to 1.5 Hz differences between the two methods, but there is no systematic change, and the difference is of no relevance at our current level of computational precision. But if future calculations ask for refinement, one must include the cross correlations.

Rotational correlation times
Daragan and Mayo (1997) noted a deviation between the τc values calculated for proteins from the Stokes-Einstein relation and the experimental values, which were then known for proteins smaller than 18 kDa. Fitting to that data, they obtained the empirical Mr vs. τc relationship:
[17]
with T in K. We collected several more experimental rotational correlation times (see Table A1), and find that Equation 17 also holds outside the range for which it was developed.
Figure A2. Experimentally determined rotational correlation times for different protein masses (dots) as collected from the literature, measured at or corrected to 298 K using the known temperature dependency of the water viscosity (see Table A1). The drawn line was computed using Equation 17, also at 298 K.
Legend to Table A1: (a) converted using (Weast, 1973) (b) Values listed in (de la Torre et al., 2000) (c) Experimental values from the North East Structural Genomics initiative listed at