Localising nuclear spins by pseudocontact shifts from a single tagging site
- 1ARC Centre of Excellence for Innovations in Peptide & Protein Science, Research School of Chemistry, Australian National University, Canberra, ACT 2601, Australia
- 2Department of Chemistry, Loughborough University, Epinal Way, Loughborough, LE11 3TU, United Kingdom
Correspondence: Gottfried Otting (firstname.lastname@example.org)
Ligating a protein at a specific site with a tag molecule containing a paramagnetic metal ion provides a versatile way of generating pseudocontact shifts (PCSs) in nuclear magnetic resonance (NMR) spectra. PCSs can be observed for nuclear spins far from the tagging site, and PCSs generated from multiple tagging sites have been shown to enable highly accurate structure determinations at specific sites of interest, even when using flexible tags, provided the fitted effective magnetic susceptibility anisotropy (Δχ) tensors accurately back-calculate the experimental PCSs measured in the immediate vicinity of the site of interest. The present work investigates the situation where only the local structure of a protein region or bound ligand is to be determined rather than the structure of the entire molecular system. In this case, the need for gathering structural information from tags deployed at multiple sites may be queried. Our study presents a computational simulation of the structural information available from samples produced with single tags attached at up to six different sites, up to six different tags attached to a single site, and in-between scenarios. The results indicate that the number of tags is more important than the number of tagging sites. This has important practical implications, as it is much easier to identify a single site that is suitable for tagging than multiple ones. In an initial experimental demonstration with the ubiquitin mutant S57C, PCSs generated with four different tags at a single site are shown to accurately pinpoint the location of amide protons in different segments of the protein.
Pseudocontact shifts (PCSs), which are generated by paramagnetic metal ions with fast-relaxing electron spins, provide outstanding restraints for the structure refinement of biological macromolecules (Luchinat et al., 2018). To make full use of PCS restraints, a large number of metal tags have been developed in recent years with the express purpose to install a paramagnetic centre on proteins, measure PCSs, and gain structural information (Liu et al., 2014; Nitsche and Otting, 2017; Su and Chen, 2019; Joss and Häussinger, 2019; Saio and Ishimori, 2020; Miao et al., 2022; Müntener et al., 2022). With restraints from multiple tagging sites, PCSs of backbone amide protons have been shown to be sufficient for 3D structure determinations of proteins (Schmitz et al., 2012; Yagi et al., 2013; Pilla et al., 2016, 2017; Cucuzza et al., 2021). Particularly appealing applications of PCSs have been the structure refinement of polypeptide segments in proteins of known 3D fold (Banci et al., 1996; Crick et al., 2015; Lescanne et al., 2018; Müntener et al., 2020) and the structural characterisation of the interfaces of protein–protein (Pintacuda et al., 2006; Keizers et al., 2010; Kobashigawa et al., 2012; Hass and Ubbink, 2014; de la Cruz et al., 2011; Brewer et al., 2015; Ubbink and Di Savino, 2018) and protein–ligand complexes (Pintacuda et al., 2007; Guan et al., 2013; Chen et al., 2016; Lescanne et al., 2018; Zimmermann et al., 2019), as the long-range nature of PCSs allows the attachment of the metal tags at a distance that is sufficiently far from the site of interest to avoid structural perturbations by the tag.
The present work investigates the optimal tagging strategy for determining structural detail at a selected local site in a protein (site of interest – SoI), assuming that the 3D structure of the rest of the protein is known. In this scenario, the PCSs observed for the structurally known part can be used to determine the parameters of the magnetic susceptibility anisotropy (Δχ) tensor associated with the paramagnetic tag, and the remaining PCSs can be used to determine the 3D structure of the SoI. Fundamentally, the same scenario applies when the binding mode of a ligand or the structure of a protein–protein interface is to be determined.
The structure determination problem can be approached in two principally different ways. (i) Different sites can be selected in the protein for tagging. Most metal tags are designed for ligation to the thiol groups of one or two cysteine residues, which can be introduced into the target protein by mutagenesis of each of the target sites. This is the most common strategy (Keizers et al., 2010; Yagi et al., 2013; Guan et al., 2013; Crick et al., 2015; Chen et al., 2016; Lescanne et al., 2017; Pearce et al., 2017; Zimmermann et al., 2019; Orton et al., 2022a). (ii) A single site is selected in the protein and several different samples are prepared with different paramagnetic metal ions. This approach was applied to the N-terminal domain of the E. coli DNA polymerase subunit ϵ to elucidate the binding mode of a small ligand (John et al., 2006) and the subunit θ (Pintacuda et al., 2006). The approach proved successful, even though the precision of the solutions was affected by a fairly close alignment of the (Δχ) tensors. This strategy, however, can be generalised by the use of different tags capable of delivering significantly different orientations of the magnetic susceptibility tensors relative to the target protein. To the best of our knowledge, the relative performance of these tagging strategies in terms of defining local structure has not been explored systematically.
The simulations performed in the present work predict that the chances of obtaining good structural restraints at the SoI are almost equally good when the same tag is deployed at multiple sites compared to when a single site is furnished with different tags. To gain an understanding of this effect, we consider the way in which PCSs determine the positions of nuclear spins, develop metrics for determining the precision with which the nuclear spins at the SoI can be localised, and calculate the chances of obtaining good localisation precision by assuming different numbers of tags and tagging sites and that the relative orientations of the Δχ tensors cannot be predicted and arise by chance.
In a practical demonstration, we use PCSs generated by four different tags attached at residue 57 of ubiquitin S57C to determine the positions of amide protons in different polypeptide segments of the protein.
1.1 Localisation spaces
The position of a nuclear spin can be defined as the 3D space in which it is to be found according to the measured PCS values and their associated uncertainties. We refer to this space as the localisation space. The localisation space is defined by multiple PCSs associated with different Δχ tensors.
The PCS of a nuclear spin is manifested in nuclear magnetic resonance (NMR) spectra as the change in chemical shift observed in the paramagnetic state compared to a diamagnetic reference, which is the same compound produced with a diamagnetic metal ion. Equation (1) describes the PCS of the nuclear spin, where r is the electron–nuclear distance, x, y, and z are the nuclear coordinates relative to the paramagnetic centre which defines the origin of the coordinate system, and Δχij denote the components of the Δχ tensor in matrix representation.
The Δχ tensor is characterised by eight parameters which define its position, anisotropy, and orientation. These parameters can be determined by fitting to experimentally measured PCSs of nuclear spins located in the structurally known part of the protein. Most often, the PCSs of backbone amide protons are used to determine the parameters of the Δχ tensor.
For each PCS value, the Δχ tensor defines a non-spherical isosurface, which describes the coordinates where this particular PCS value is observed. The PCS isosurfaces are point symmetrical about the metal ion. The PCS measured of a nuclear spin at the SoI thus pins its location to the associated PCS isosurface. A second PCS measured with a second tag introduces a second PCS isosurface as a restraint. Disregarding rare borderline cases, the intersection between two isosurfaces describes a line, the intersection with a third isosurface defines two points, and the intersection with a fourth PCS isosurface restricts the position of the nuclear spin to a single point. Taking into account the experimental uncertainties, the isosurfaces expand to shells, and the common point expands to a 3D localisation space. In this way, multiple known Δχ tensors allow the positions of individual atoms in a molecule to be determined.
1.2 Δχ tensor orthogonality
The determination of localisation spaces requires that PCS isosurfaces intersect, which is readily achieved by positioning metal tags at different sites of the protein. It is not always straightforward, however, to identify multiple tagging sites that are at an optimal distance from the SoI and suitable for chemical modification without significant impact on the structure and function of the target protein. The alternative strategy of varying the isosurfaces by exchanging the paramagnetic ion in the same metal binding site also produces largely unsatisfactory results. For example, it has commonly been observed that the coordinate systems of Δχ tensors associated with two different paramagnetic lanthanoid ions such as Tb(III) and Tm(III) are closely aligned if the tagging site and chemical structure of the tag are the same (Bertini et al., 2001; Su et al., 2008; Keizers et al., 2008; Man et al., 2010; Graham et al., 2011; Joss et al., 2018). Even if the respective Δχ tensors differ slightly in their relative orientation, their PCS isosurfaces are likely to intersect at a shallow angle. In this situation, the intersection line between the two tensors can shift a lot in response to small inaccuracies in the Δχ tensors, limiting the precision with which the localisation space can be determined. The situation is different, however, if different chemical tags are used to install the paramagnetic centres, as the coordinate systems of the associated Δχ tensors are more likely to be oriented differently than to align. Furthermore, the exact positions of the different tags tend to differ to some extent, even when they are attached at the same site.
This work aims to quantify the precision with which localisation spaces can be determined by PCSs, assuming a variable number of tagging sites on the protein, which are arranged in different geometries. To address the difficulty in predicting the relative orientations of the respective Δχ tensors, the analysis employed computational Monte Carlo methods to arrive at statistical likelihoods of successful determinations of precise localisation spaces.
The boundaries of an experimentally determined localisation space are determined by uncertainties in the Δχ tensor fits and the error associated with each individual PCS measurement. In the following, the problem is simplified by assuming that the protein coordinates underpinning the Δχ fits are correct and the Δχ tensors free of uncertainties.
We consider two different metrics for measuring the precision with which the localisation space of a nuclear spin can be determined from PCSs arising from multiple Δχ tensors. The first metric involves a discrete integration on a grid to capture a volume which fulfils a root mean square deviation (RMSD) criterion of experimental versus calculated PCS values. The second metric considers the relative geometry of PCS gradient vectors at the site of interest.
2.1 RMSD volume metric
Given n different Δχ tensors, Eq. (2) calculates the RMSD between the experimentally measured PCS, δexp,i, and the calculated PCS, δcal,i, for the ith Δχ tensor. δRMSD can be evaluated at any position of the SoI by calculating δcal,i from Eq. (1).
By evaluating Eq. (2) over a 3-dimensional grid of points and identifying the total number of grid points with an RMSD below a given threshold (), a volume is obtained that encapsulates all solutions below the RMSD threshold. This definition is encompassed by Eq. (3), where the PCS RMSD volume VRMSD is calculated over a grid of points with the coordinates x, y, and z, and ΔV is the volume occupied by a single grid point.
The threshold RMSD can be chosen to reflect the experimental uncertainty in the measured PCS value. The RMSD volume VRMSD then describes the localisation space of the nuclear spin, where a small value means its position can be determined precisely.
2.2 PCS gradient metric
Considering a single Δχ tensor, the space of solutions, where the experimental and theoretical PCSs are the same, is defined by an isosurface. In the immediate vicinity of a point on the isosurface, the local environment is approximated by a plane which can be described by a vector normal to the surface. The vector is defined by the PCS gradient field ∇δPCS, which can be written as shown in Eq. (5). The gradient vector also describes the direction in which the position of the nuclear spin can be determined with the greatest precision as it corresponds to the direction of maximum change in the predicted PCS value.
Considering two or three Δχ tensors and using the intersection between the associated PCS isosurfaces to localise a nuclear spin, the most precise localisation is achieved when the PCS isosurfaces intersect in an orthogonal manner. A quantitative measure of orthogonality can be defined in different ways. For example, orthogonality can be characterised by the dot product of vectors, which is akin to the angle score developed by Zimmermann et al. (2019). Equation (6) defines the parallel metric δ∥ as the sum of the absolute values of the pairwise dot products of the normalised PCS gradient vectors v. Equation (7) is equivalent, where θij is the angle between the gradient vectors vi and vj. The parallel metric can assume a value between 0 and 1 and is small for nearly orthogonal PCS gradients. Note that a value of zero can only be obtained when the number of Δχ tensors considered is three or less.
As the PCS scales with r−3, the steepest PCS gradients occur close to the paramagnetic centre. This information is contained in the PCS gradient vector ∇δPCS but is discarded in Eq. (6) due to the normalisation of the vectors. To account for proximity to the paramagnetic centre, the perpendicular metric δ⟂ is proposed as defined by Eq. (8), which preserves the gradient vector magnitudes through a pairwise cross-product (denoted by the ∧ symbol). Equation (9) is equivalent, where θij is the angle between the gradient vectors vi and vj.
The perpendicular metric awards a large value to Δχ tensor geometries that produce large and orthogonal PCS gradient vectors. The metric has a lower bound of 0 and no upper bound.
2.3 Model of tagging geometries
To compare the accuracy with which localisation spaces can be determined by PCSs generated by a different number of tags positioned at a different number of sites, we assumed a simple model of a globular macromolecule with metal tags attached to its surface and performed Monte Carlo simulations to sample Δχ tensors (referred to in the following as tags) with fixed positions and anisotropy but random orientations. The tagging sites (identical to the location of each paramagnetic centre) were located on the surface of a sphere of radius 25 Å, chosen to represent the macromolecule. Tagging sites were placed at even divisions of a circle on the sphere, each at the same distance from the nucleus of interest (Fig. 1). Calculations were performed for different numbers of tagging sites ranging from 1 to 6. In addition, the multiplicity of tags at a given site was varied such that the tags were distributed between the tagging sites as evenly as possible. For example, with five tags and three tagging sites, one tag was at site 1 and two tags each at sites 2 and 3.
2.4 Sampling tensor orientations
For any given geometry of sites and tag multiplicity, the tensor orientations were sampled from a random uniform distribution. The PCS RMSD volume was then calculated at the SoI using Eq. (3). This process was repeated 10 000 times to obtain a histogram depicting the frequencies with which different localisation volumes VRMSD were obtained. Figure 2 shows smoothed representations of these distributions for a nuclear spin located 20 Å from each of the paramagnetic centres, calculated for of 0.03 ppm (parts per million), where the value of 0.03 ppm was chosen as representative of typical experimental uncertainties in PCS measurements of protein NMR signals.
Figure 2 summarises the results of the calculations for different numbers of tags and tagging sites. As expected, a greater number of tags improves the chances that the PCSs of the nuclear spin define a small localisation space. Furthermore, distributing a given number of tags over more tagging sites generally decreases the localisation volumes. Importantly, the calculations produced some non-intuitive results. (i) The advantage gained by a large number of tags is small. While the use of four instead of three tags may still deliver a significant reduction in the localisation space, the advantage gained by using more than four tags is likely to be very small. (ii) In general, using more tags is of greater benefit than using more tagging sites. This is evidenced by consistently smaller RMSD volumes associated with, e.g., the 95th percentile when using n+1 rather than n tags. (iii) Multiple different tags attached at the same site deliver localisation spaces only a little less confined than when the same number of tags is distributed over different sites. These findings have the potential to greatly simplify the structural characterisation of a specific SoI by PCSs, as it is much easier to install different tags in the same protein mutant than to identify a multitude of suitable tagging sites.
Closer inspection of Fig. 2b reveals a few irregularities arising from the specific geometries chosen and limitations associated with a finite grid. For example, the two-site geometry, which corresponds to a linear arrangement of tag–SoI–tag, was occasionally predicted to perform less well than the same number of tags located at a single site. The irregularity is particularly pronounced for the three distributions associated with the three-tag scenario in Fig. 2b. In this case, the largest localisation spaces (approximately the top 30 %) extended beyond the boundary of the grid so that their true volume sum VRMSD was larger than calculated and plotted. Although very large volumes can arise when PCS isosurfaces intersect at a shallow angle, the resulting localisation spaces feature long extensions, most of which may well be incompatible with the chemical structure of a SoI. Therefore, such extended localisation spaces may still present very useful structure restraints. More concerning is that the three-tag scenario also indicated that the median of the localisation space distribution was larger for two tagging sites than one. Closer inspection revealed that this anomaly was again caused by the limited grid volume used, which failed to capture every point below the RMSD threshold. The expected correlation between VRMSD and the number of tagging sites was re-established when the calculations were repeated with a larger grid volume, as shown in Fig. 2a.
While a finite grid size can adequately probe the environment of a SoI, anomalies were encountered when the PCS isosurfaces defined more than a single common intersection point. In 3D space, the PCS isosurfaces of four different Δχ tensors intersect at a single point, whereas the restraints posed by three PCSs can generally be fulfilled by two separate localisation spaces. Figure 3 illustrates this well-known phenomenon (Kobashigawa et al., 2012; Guan et al., 2013) by an example in 2D space, showing how two Δχ tensors produce two separate intersection points. Importantly, the intersection points are much better separated from each other if the tags are placed at the same location rather than at two different sites. As the calculations of Fig. 2 used a finite grid size (symbolised by a grey-shaded box in Fig. 3), the three-tag calculations were more likely to capture both solutions in the two-site versus one-site simulations. Although this effect artificially disadvantages the two-site results in the calculations, it is nonetheless of practical relevancy, as a large separation between two possible localisation spaces more readily distinguishes one of the solutions as incompatible with the covalent structure of the molecule.
2.5 Assessing the quality of localisation spaces
The metrics of Eqs. (8) and (6) are designed to characterise the quality of a localisation space determination by a single number. Figure 4 shows that the perpendicular metric tends to correlate with the localisation volume more closely than the parallel metric. In particular, the largest values of δ⟂ correlate very well with the smallest localisation volumes. Therefore, in the following, we mainly consider the perpendicular metric, but also note its limited information content, as the correlation between large δ⟂ values and small localisation volume breaks for small values of the metric. Notably, localisation spaces can assume very irregular shapes, which include points entirely incompatible with the covalent structure of the molecule. In many cases, the attempt to characterise the quality of a localisation space by a single one of the metrics above very much underestimates the quality of the actual localisation space.
To verify the performance of four different tags attached at a single site experimentally, we produced samples of the uniformly 15N-labelled ubiquitin mutant S57C and tagged the cysteine residue with four different thulium tags. The tags were the C1-Tm3+ (Graham et al., 2011), C2-Tm3+ (de la Cruz et al., 2011), C12-Tm3+ (Herath et al., 2021), and C13-Tm3+ tags, where the C2 and C13 tags are the opposite enantiomers of the C1 and C12 tags, respectively (Fig. 5). The C13 tag was synthesised for this work, following the general protocol published previously for the C12 tag. In addition, diamagnetic references were generated by tagging the protein with the respective diamagnetic tags loaded with Y3+ ions. The programme Paramagpy (Orton et al., 2020) was used to determine the Δχ tensors from the PCSs measured for backbone amide protons in [15N,1H]-HSQC spectra by fitting to the crystal structure 1UBQ (Vijay et al., 1987). The PCSs and Δχ tensors are reported in Tables S1 and S2, respectively, in the Supplement. The quality factors of the Δχ tensor fits were very good, ranging between 0.014 and 0.019. The paramagnetic centres of all four tags were positioned between 7.6 and 8.3 Å from the Cβ atom of the residue in position 57, which is in good agreement with expectations based on the covalent structures of the tags.
Figure 6 illustrates the Δχ tensors obtained, depicting the associated PCS isosurfaces for PCSs of ±1 ppm. The tensors associated with the C12 and C13 tags are larger than those obtained with the C1 and C2 tags, which can be attributed to the conformationally more rigid tether between protein and metal complex (Fig. 5). The z axes of the Δχ tensors (indicated by the blue isosurfaces) are oriented differently, but, except for the C12 tag, their relative orientations bear a degree of similarity. In particular, the angle between the z axes of the tensors associated with the C2 and C13 tags happened to be quite small (7∘), but, with tensor origins differing by 4.8 Å and different rhombicities, the respective PCS isosurfaces would intersect anyway. Similarly, despite their attachment to the same position in the amino acid sequence of ubiquitin, the Δχ tensor fits positioned the paramagnetic centres of different tags somewhat differently, with a distance of 3.7 Å between the lanthanoid ions in the C1 and C2 tags and 7.2 Å in the C12 and C13 tags.
To assess the capacity of the PCSs to define the correct localisation spaces, we determined the localisation spaces of backbone amide protons in three selected polypeptide segments A–C of ubiquitin, comprising residues 12–17, 29–36, and 64–68, respectively. They were selected because most of the amide protons in these segments were characterised by PCS data measured with all four different paramagnetic tags. All three segments featured δ⟂ metrics below 0.06 (Fig. 7; Table S3), i.e. lower than required for predicting tight localisation spaces with confidence. Nonetheless, the amide protons with perpendicular metrics >0.025 generally were associated with localisation spaces that closely trace the crystal structure (Fig. 8).
Among the segments A–C, the match between localisation spaces and crystal structure is least satisfying for residues 32–35. To assess the possibility of an artefact arising from PCS isosurfaces intersecting at particularly shallow angles, we calculated the pairwise intersection angles of the isosurfaces for all segments shown in Fig. 8 (Table S4). The smallest intersection angle in segment B was found for Ile36 (11∘), but even smaller intersection angles occurred in segments A and C. Furthermore, as the loop region with residues 32–35 at the C-terminus of the α helix is far from the tagging site, it is unlikely that the presence of the tags affected the structure of the loop. Interestingly, however, this loop is known to be flexible and low order parameters have been reported (Lange et al., 2008). In agreement with this view of conformational disorder, the localisation spaces suggest that the loop is slightly more open than in the crystal structure.
In general, the uncertainties associated with the localisation spaces depend on the reliability of the Δχ tensors, which depend on the accuracy with which PCSs can be measured, the reliability of the protein structure used to fit the Δχ tensors, and the validity of the assumption that each set of PCS data can be fitted reliably by a single effective Δχ tensor. As a first step towards estimating the boundaries of the localisation spaces, a PCS RMSD value can be defined which corresponds to the root mean square deviations (RMSDs) between the experimental PCS values and the PCS values back-calculated at different points in space. The localisation spaces shown in Fig. 8 comprise the points below a certain PCS RMSD threshold which, for illustration purposes, was chosen individually for each amide proton to display localisation spaces of similar volume. These plots therefore trace the general shapes of the localisation spaces and indicate the points of best fit, rather than attempting to give a quantitative account of the respective uncertainties. In each polypeptide segment, the shapes of the localisation spaces are elongated in more or less the same direction, which can be attributed to PCS isosurfaces intersecting at a relatively shallow angle and must not be confused with the shapes of B-factors of crystal structures.
Most of the tags designed for the generation of PCSs in proteins are based on complexes of lanthanoid ions that can be attached to single cysteine residues. Many variants of such single-arm tags have been produced with the aim of generating large PCSs, avoiding multiple tag conformations, producing stable bonds with cysteines, and keeping structural perturbations of the tagged proteins to a minimum (Nitsche and Otting, 2017; Joss and Häussinger, 2019; Su and Chen, 2019; Miao et al., 2022; Müntener et al., 2022). As shown in the present work, the availability of a variety of tags turns out to be of great benefit for structure determinations of specific sites of interest, as they allow the generation of Δχ tensors of different relative orientations with a single tagging site. The example of the C1 and C12 tags shows that very different Δχ tensor orientations can be obtained if the same lanthanoid complex is attached to the protein via different chemical tethers. In addition, very different Δχ tensor orientations can be obtained simply by changing the chirality of the pendants in the lanthanoid complex, as illustrated by the C12 and C13 tags (Fig. 6).
The tethers of all single-arm tags are invariably flexible, leading to variable positions of the lanthanoid ions relative to the target molecule and also variable orientations of the lanthanoid ion complexes. In this situation, the PCSs generated by a tag in the target molecule reflect the average of a multitude of different Δχ tensors and Δχ tensor orientations. Interpreting the average PCSs by a single effective Δχ tensor is an approximation that nonetheless delivers meaningful predictions of PCSs of nuclear spins that are located not too far from the nuclear spins used for fitting the effective Δχ tensor (Shishmarev and Otting, 2013). The orientation of an effective Δχ tensor is predominantly governed by the steric and chemical environment of the tag and therefore difficult to predict. It has been observed many times, however, that different paramagnetic metal ions generally produce very similar Δχ tensor orientations, if the chemical structure of the tag and tagging site are the same (Su et al., 2008; Keizers et al., 2008; Man et al., 2010; Graham et al., 2011; Joss et al., 2018; Zimmermann et al., 2019; Orton et al., 2022a). Similar Δχ tensor orientations lead to shallow intersection angles of the respective PCS isosurfaces. In this situation, which can be identified by a very small δ⟂ metric, relatively large localisation volumes are obtained when PCS isosurfaces are expanded into shells to take uncertainties in PCSs and tensor orientations into account. Shallow intersection angles are commonly observed when the same tag loaded with a different paramagnetic lanthanoid ion (such as Tb3+ versus Tm3+) is deployed at the same tagging site. Previous work, where we used the PCSs from three different tagging sites to determine the location of a solvent-exposed protein loop, showed that the localisation coordinates determined by PCSs were more accurate when all data encompassed PCSs measured with only a single paramagnetic metal ion (Orton et al., 2022a). Therefore, we conducted the experiments with ubiquitin using Tm3+ tags only.
The difficulty in predicting Δχ tensor orientations means that two tags may accidentally produce Δχ tensors of a very similar orientation. In the case of ubiquitin S57C, this situation arose for the Δχ tensors of the C1 and C2 tags. On the other hand, the Δχ tensor orientations can be significantly different between two tags that differ only in the chirality of the lanthanoid coordination sphere, as in the case of the C12/C13 pair. Our results also suggest that the perpendicular metric of Eq. (8) is a very conservative predictor of the accuracy with which localisation spaces are defined by the PCSs. As demonstrated by the example of ubiquitin, a poor perpendicular metric does not exclude the determination of fairly accurate localisation spaces.
The results of the present work are of great practical value, as they demonstrate that a single cysteine mutant of a target protein can be sufficient to determine accurate localisation spaces from PCSs. Most importantly, the accuracy with which the localisation spaces at a site of interest can be determined by PCSs in a multiple-tag/single-site scenario is statistically comparable to that obtained with the same number of tags deployed at different tagging sites. Not all solvent-exposed residues of a protein are suitable tagging sites after mutation to cysteine because of structural importance (e.g. salt bridges, hydrogen bonds, and secondary structure propensities) or functional impacts (such as allosteric effects and solubility), and the effect of mutations is often hard to predict. The ability to work with a single site for tagging can thus be of critical importance, aside from the convenience of working with a single protein construct rather than multiple mutants. In a fortuitous development, an increasing number of different lanthanoid tags have recently been published that are designed for attachment to cysteine residues and deliver different Δχ tensor orientations (Miao et al., 2022; Müntener et al., 2022). The present work shows that different enantiomers of the same tag can produce quite different Δχ tensor orientations. Furthermore, the average positions of the lanthanoid ion can differ between different enantiomeric forms of the tag, which increases the likelihood of Δχ tensors intersecting at a steeper angle.
The statistical analysis of the present work suggests that four different tags have a high chance of defining a small localisation space. In the rare situation where four different tags at a single site fail to deliver a sufficiently well-defined localisation space, chances are that a fifth tag deployed at the same site substantially reduces the localisation space and does this better than if it were attached at an alternative tagging site, which may be available only at a less optimal distance from the SoI.
Attaching four paramagnetic tags still requires four different tagging reactions, and in principle, the production of diamagnetic references requires another set of four tagging reactions with the corresponding tags loaded with diamagnetic metal ions. A further simplification may be possible in rigid proteins, however, if the Δχ tensor fits use only PCSs from regions of the protein that are sufficiently far from the tagging site to display conserved chemical shifts with and without a tag. In this case, the protein without a tag may serve as the diamagnetic reference, removing the need to ligate the protein with diamagnetic tags. It is commonly observed that the chemical shift changes introduced by diamagnetic tags are negligible for most of the protein, even when the protein is small (Ma et al., 2021). An optimal tagging site is well separated from the SoI to minimise any structural perturbation and paramagnetic relaxation enhancements, while generating sizeable PCSs at the SoI that are easily measured and assigned.
The uncertainties of the localisation spaces depend on a number of different factors, some of which are difficult to capture in a rigorous manner. As mentioned above, most tags are flexible, and explaining the PCSs by a single effective Δχ tensor introduces errors for tags undergoing large translational movements of the metal ion relative to the protein. The magnitude of these errors can be controlled, if the PCSs of nuclei near the SoI are well explained by the fit of the effective Δχ tensor (Shishmarev and Otting, 2013).
Another source of uncertainty is associated with the uncertainty of the protein structure in solution. The structural flexibility of ubiquitin has been documented in great detail (Fenwick et al., 2011; Maltsev et al., 2014). Previous work using PCSs and residual dipolar couplings (RDCs) generated by the C1-Tb3+ tag at different sites established that the different structures of ubiquitin yielded similar quality factors in the Δχ tensor fits, although, by a small margin, the dynamic structure 2KOX fitted best (Pearce et al., 2017). In the present work, we settled on using the crystal structure 1UBQ (Vijay et al., 1987) for simplicity.
Spectral overlap and paramagnetic peak broadening present two additional sources of uncertainties, which are easier to quantify. In the case of the ubiquitin data used in this work, the uncertainty of the PCS data associated with the amide protons of segments A–C was small (estimated to be less than ±0.01 ppm). A straightforward approach that indirectly captures the effect of most of the sources of uncertainty on the localisation spaces relies on performing the Δχ tensor fits multiple times with random omission of a certain percentage of the PCSs (Orton et al., 2022a).
In the present work, we did not attempt to investigate the impact of variable Δχ tensor fits on the localisation spaces for the polypeptide segments of ubiquitin shown in Fig. 8, as the protein was only used as a model system to evaluate the potential of using a single tagging site with four different tags. The C1, C2, C12, and C13 tags were the only tags we tested with this ubiquitin mutant, and this set of tags delivered Δχ tensors with suboptimal δ⟂ metrics and some shallow intersections between PCS isosurfaces. Nonetheless, the localisation spaces agreed remarkably well with the crystal structure, as indicated by their close proximity to the coordinates of the respective amide protons (Fig. 8). More experience with different protein targets and tags will be required to establish the general validity of these observations.
It is well-known that additional data can be extracted from [15N,1H]-HSQC spectra, such as residual dipolar couplings (RDCs) arising from paramagnetic molecular alignment in the magnetic field (Prestegard et al., 2000) and cross-correlated relaxation effects between dipolar interactions and Curie spin relaxation, which also contain structural information that depends on the orientation of the Δχ tensor (Pintacuda et al., 2003). Furthermore, the [15N,1H]-HSQC spectra detect PCSs of 15N spins, which can be used to determine localisation spaces, provided the effects from residual anisotropic chemical shifts are taken into account (John et al., 2005). The present work focused on PCS data alone, as chemical shifts can be measured with greater sensitivity than other NMR data.
While the orientations of the Δχ tensors associated with different tag molecules vary and most often cannot be predicted, the statistical predictions of the current work provide helpful guidance for choosing the optimal numbers of tagging sites and tags. In the case of a single selected site of interest, such as the conformation of a loop region, the binding site of a ligand or a protein–protein interface, our results indicate that PCSs generated with a single tagging site and four different tags most likely yield structural information that is nearly as good as that obtained with a single tag deployed at four different sites. As mutations and chemical modifications can alter the structure and function of a protein and also the NMR resonance assignments at a SoI, it is of great practical advantage if only a single suitable tagging site needs to be identified rather than multiple sites. The benefit applies equally to the use of PCSs for protein resonance assignments based on a known 3D structure of the protein (Pintacuda et al., 2004; John et al., 2007; Lescanne et al., 2017). As many different tags have been published in recent years, we believe that the present findings will greatly increase the appeal of strategies based on paramagnetic lanthanoid tags.
The supplement related to this article is available online at: https://doi.org/10.5194/mr-3-65-2022-supplement.
HWO initiated the project, wrote the software, performed the calculations, and wrote the initial version of the paper. EHA produced the ubiquitin samples and measured and assigned the HSQC spectra. GO edited the final version of the paper. LT and SJB synthesised the C12 and C13 tags with different metal ions.
At least one of the (co-)authors is a member of the editorial board of Magnetic Resonance. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Gottfried Otting thanks the Australian Research Council, for a Laureate Fellowship (grant no. FL170100019).
This research has been supported by the Australian Research Council (grant no. FL170100019) and the Australian Research Council Centre of Excellence for Innovations in Peptide and Protein Science (grant no. CE200100012).
This paper was edited by Mehdi Mobli and reviewed by Daniel Häussinger and one anonymous referee.
Banci, L., Bertini, I., Bren, K. L., Cremonini, M. A., Gray, H. B., Luchinat, C., and Turano, P.: The use of pseudocontact shifts to refine solution structures of paramagnetic metalloproteins: Met80Ala cyano-cytochrome c as an example, J. Biol. Inorg. Chem., 1, 117–126, https://doi.org/10.1007/s007750050030, 1996. a
Bertini, I., Janik, M. B. L., Lee, Y. M., Luchinat, C., and Rosato, A.: Magnetic susceptibility tensor anisotropies for a lanthanide ion series in a fixed protein matrix, J. Am. Chem. Soc., 123, 4181–4188, https://doi.org/10.1021/ja0028626, 2001. a
Brewer, K. D., Bacaj, T., Cavalli, A., Camilloni, C., Swarbrick, J. D., Liu, J., Zhou, A., Zhou, P., Barlow, N., Xu, J., Seven, A. B., Prinslow, E. A., Voleti, R., Häussinger, D., Bonvin, A. M. J. J., Tomchick, D. R., Vendruscolo, M., Graham, B., Südhof, T. C., and Rizo, J.: Dynamic binding mode of a synaptotagmin-1-SNARE complex in solution, Nat. Struct. Mol. Biol., 22, 555–564, https://doi.org/10.1038/nsmb.3035, 2015. a
Chen, W.-N., Nitsche, C., Pilla, K. B., Graham, B., Huber, T., Klein, C. D., and Otting, G.: Sensitive NMR approach for determining the binding mode of tightly binding ligand molecules to protein targets, J. Am. Chem. Soc., 138, 4539–4546, https://doi.org/10.1021/jacs.6b00416, 2016. a, b
Crick, D. J., Wang, J. X., Graham, B., Swarbrick, J. D., Mott, H. R., and Nietlispach, D.: Integral membrane protein structure determination using pseudocontact shifts, J. Biomol. NMR, 61, 197–207, https://doi.org/10.1007/s10858-015-9899-6, 2015. a, b
Cucuzza, S., Güntert, P., Plückthun, A., and Zerbe, O.: An automated iterative approach for protein structure refinement using pseudocontact shifts, J. Biomol. NMR, 75, 319–334, https://doi.org/10.1007/s10858-021-00376-8, 2021. a
de la Cruz, L., Nguyen, T. H. D., Ozawa, K., Shin, J., Graham, B., Huber, T., and Otting, G.: Binding of low molecular weight inhibitors promotes large conformational changes in the dengue virus NS2b-NS3 protease: fold analysis by pseudocontact shifts, J. Am. Chem. Soc., 133, 19205–19215, https://doi.org/10.1021/ja208435s, 2011. a, b
Fenwick, R. B., Esteban-Martín, S., Richter, B., Lee, D., Walter, K. F., Milovanovic, D., Becker, S., Lakomek, N. A., Griesinger, C., and Salvatella, X.: Weak long-range correlated motions in a surface patch of ubiquitin involved in molecular recognition, J. Am. Chem. Soc., 133, 10336–10339, https://doi.org/10.1021/ja200461n, 2011. a
Graham, B., Loh, C. T., Swarbrick, J. D., Ung, P., Shin, J., Yagi, H., Jia, X., Chhabra, S., Barlow, N., Pintacuda, G., Huber, T., and Otting, G.: DOTA-amide lanthanide tag for reliable generation of pseudocontact shifts in protein NMR spectra, Bioconjug. Chem., 22, 2118–2125, https://doi.org/10.1021/bc200353c, 2011. a, b, c, d
Guan, J.-Y., Keizers, P. H. J., Liu, W.-M., Löhr, F., Skinner, S. P., Heeneman, E. A., Schwalbe, H., Ubbink, M., and Siegal, G.: Small-molecule binding sites on proteins established by paramagnetic NMR spectroscopy, J. Am. Chem. Soc., 135, 5859–5868, https://doi.org/10.1021/ja401323m, 2013. a, b, c
Hass, M. A. S. and Ubbink, M.: Structure determination of protein–protein complexes with long-range anisotropic paramagnetic NMR restraints, Curr. Opin. Struct. Biol., 24, 45–53, https://doi.org/10.1016/j.sbi.2013.11.010, 2014. a
Herath, I. D., Breen, C., Hewitt, S. H., Berki, T. R., Kassir, A. F., Dodson, C., Judd, M., Jabar, S., Cox, N., Otting, G., and Butler, S. J.: A chiral lanthanide tag for stable and rigid attachment to single cysteine residues in proteins for NMR, EPR and time-resolved luminescence studies, Chem. Eur. J. 27, 13009–13023, https://doi.org/10.1002/chem.202101143, 2021. a, b
John, M., Park, A. Y., Pintacuda, G., Dixon, N. E., and Otting, G.: Weak alignment of paramagnetic proteins warrants correction for residual CSA effects in measurements of pseudocontact shifts, J. Am. Chem. Soc., 127, 17190–17191, https://doi.org/10.1021/ja0564259, 2005. a
John, M., Pintacuda, G., Park, A. Y., Dixon, N. E., and Otting, G.: Structure determination of protein-ligand complexes by transferred paramagnetic shifts, J. Am. Chem. Soc., 128, 12910–12916, https://doi.org/10.1021/ja063584z, 2006. a
John, M., Schmitz, C., Park, A. Y., Dixon, N. E., Huber, T., and Otting, G.: Sequence- and stereospecific assignment of methyl groups using paramagnetic lanthanides, J. Am. Chem. Soc., 129, 13749–13757, https://doi.org/10.1021/ja0744753, 2007. a
Joss, D. and Häussinger, D.: Design and applications of lanthanide chelating tags for pseudocontact shift NMR spectroscopy with biomacromolecules, Prog. Nucl. Mag. Res. Sp., 114–115, 284–312, https://doi.org/10.1016/j.pnmrs.2019.08.002, 2019. a, b
Joss, D., Walliser, R. M., Zimmermann, K., and Häussinger, D.: Conformationally locked lanthanide chelating tags for convenient pseudocontact shift protein nuclear magnetic resonance spectroscopy, J. Biomol. NMR, 72, 29–38, https://doi.org/10.1007/s10858-018-0203-4, 2018. a, b
Keizers, P. H. J., Saragliadis, A., Hiruma, Y., Overhand, M., and Ubbink, M.: Design, synthesis, and evaluation of a lanthanide chelating protein probe: CLaNP-5 yields predictable paramagnetic effects independent of environment, J. Am. Chem. Soc., 130, 14802–14812, https://doi.org/10.1021/ja8054832, 2008. a, b
Keizers, P. H. J., Mersinli, B., Reinle, W., Donauer, J., Hiruma, Y., Hannemann, F., Overhand, M., Bernhardt, R., and Ubbink, M.: A solution model of the complex formed by adrenodoxin and adrenodoxin reductase determined by paramagnetic NMR spectroscopy, Biochemistry, 49, 6846–6855, https://doi.org/10.1021/bi100598f, 2010. a, b
Kobashigawa, Y., Saio, T., Ushio, M., Sekiguchi, M., Yokochi, M., Ogura, K., and Inagaki, F.: Convenient method for resolving degeneracies due to symmetry of the magnetic susceptibility tensor and its application to pseudo contact shift-based protein-protein complex structure determination, J. Biomol. NMR, 53, 53–63, https://doi.org/10.1007/s10858-012-9623-8, 2012. a, b
Lange, O. F., Lakomek, N.-A., Farès, C., Schröder, G. F., Walter, K. F. A., Becker, S., Meiler, S., Grubmüller, H., Griesinger, G., and de Groot, B. L.: Recognition dynamics up to microseconds revealed from an RDC-derived ubiquitin ensemble in solution, Science, 320, 1471–1475, https://doi.org/10.1126/science.1157092, 2008. a
Lescanne, M., Skinner, S. P., Blok, A., Timmer, M., Cerofolini, L., Fragai, M., Luchinat, C., and Ubbink, M.: Methyl group assignment using pseudocontact shifts with PARAssign, J. Biomol. NMR, 69, 183–195, https://doi.org/10.1007/s10858-017-0136-3, 2017. a, b
Lescanne, M., Ahuja, P., Blok, A., Timmer, M., Akerud, T., and Ubbink, M.: Methyl group reorientation under ligand binding probed by pseudocontact shifts, J. Biomol. NMR, 71, 275–285, https://doi.org/10.1007/s10858-018-0190-5, 2018. a, b
Liu, W.-M., Overhand, M., and Ubbink, M.: The application of paramagnetic lanthanoid ions in NMR spectroscopy on proteins, Coord. Chem. Rev., 273–274, 2–12, https://doi.org/10.1016/j.ccr.2013.10.018, 2014. a
Ma, B., Chen, J.-L., Cui, C.-Y., Tang, F., Gong, Y.-J., and Su, X.-C.: Rigid, highly reactive and stable DOTA-like tags containing a thiol-specific phenylsulfonyl pyridine moiety for protein modification and NMR analysis, Chem. Eur. J., 27, 16145–16152, https://doi.org/10.1002/chem.202102495, 2021. a
Maltsev, A. S., Grishaev, A., Roche, J., Zasloff, M., and Bax, A.: Improved cross validation of a static ubiquitin structure derived from high precision residual dipolar couplings measured in a drug-based liquid crystalline phase, J. Am. Chem. Soc., 136, 3752–3755, https://doi.org/10.1021/ja4132642, 2014. a
Man, B., Su, X.-C., Liang, H., Simonsen, S., Huber, T., Messerle, B. A., and Otting, G.: 3-Mercapto-2,6-pyridinedicarboxylic acid, a small lanthanide-binding tag for protein studies by NMR spectroscopy, Chem. Eur. J., 16, 3827–3832, https://doi.org/10.1002/chem.200902904, 2010. a, b
Miao, Q., Nitsche, C., Orton, H. W., Overhand, M., Otting, G., and Ubbink, M.: Paramagnetic chemical probes for studying biological macromolecules, Chem. Rev., https://doi.org/10.1021/acs.chemrev.1c00708, 2022. a, b, c
Müntener, T., Böhm, R., Atz, K., Häussinger, D., and Hiller, S.: NMR pseudocontact shifts in a symmetric protein homotrimer, J. Biomol. NMR, 74, 413–419, https://doi.org/10.1007/s10858-020-00329-7, 2020. a
Orton, H. W., Huber, T., and Otting, G.: Paramagpy: software for fitting magnetic susceptibility tensors using paramagnetic effects measured in NMR spectra, Magn. Reson., 1, 1–12, https://doi.org/10.5194/mr-1-1-2020, 2020. a
Orton, H. W., Herath, I. D., Maleckis, A., Jabar, S., Szabo, M., Graham, B., Breen, C., Topping, L., Butler, S. J., and Otting, G.: Localising individual atoms of tryptophan side chains in the metallo-β-lactamase IMP-1 by pseudocontact shifts from paramagnetic lanthanoid tags at multiple sites, Magn. Reson., 3, 1–13, https://doi.org/10.5194/mr-3-1-2022, 2022a. a, b, c, d
Orton, H. W., Otting, G., Abdelkader, E., Topping, L., and Butler, S.: Supplementary data and code to: One site with multiple tags versus multiple sites with single tags: optimising the determination of localisation spaces by pseudocontact shifts, Zenodo [data set, code], https://doi.org/10.5281/zenodo.6059659, 2022b. a
Pearce, B. J. G., Jabar, S., Loh, C.-T., Szabo, M., Graham, B., and Otting, G.: Structure restraints from heteronuclear pseudocontact shifts generated by lanthanide tags at two different sites, J. Biomol. NMR, 68, 19–32, https://doi.org/10.1007/s10858-017-0111-z, 2017. a, b
Pilla, K. B., Otting, G., and Huber, T.: Pseudocontact shift-driven iterative resampling for 3D structure determinations of large proteins, J. Mol. Biol., 428, 522–532, https://doi.org/10.1016/j.jmb.2016.01.007, 2016. a
Pilla, K. B., Otting, G., and Huber, T.: Protein structure determination by assembling super-secondary structure motifs using pseudocontact shifts, Structure, 25, 559–568, https://doi.org/10.1016/j.str.2017.01.011, 2017. a
Pintacuda, G., Hohenthanner, K., Otting, G., and Müller, N.: Angular dependence of dipole-dipole-Curie-spin cross-correlation effects in high-spin and low-spin paramagnetic myoglobin, J. Biomol. NMR, 27, 115–132, https://doi.org/10.1023/A:1024926126239, 2003. a
Pintacuda, G., Keniry, M. A., Huber, T., Park, A. Y., Dixon, N. E., and Otting, G.: Fast structure-based assignment of 15N-HSQC spectra of selectively 15N-labeled paramagnetic proteins, J. Am. Chem. Soc., 126, 2963–2970, https://doi.org/10.1021/ja039339m, 2004. a
Pintacuda, G., Park, A. Y., Keniry, M. A., Dixon, N. E., and Otting, G.: Lanthanide labeling offers fast NMR approach to 3D structure determinations of protein-protein complexes, J. Am. Chem. Soc., 128, 3696–3702, https://doi.org/10.1021/ja057008z, 2006. a, b
Pintacuda, G., John, M., Su, X.-C., and Otting, G.: NMR structure determination of protein−ligand complexes by lanthanide labeling, Acc. Chem. Res., 40, 206–212, https://doi.org/10.1021/ar050087z, 2007. a
Prestegard, J. H., Al-Hashimi, H. M., and Tolman, J. R.: NMR structures of biomolecules using field oriented media and residual dipolar couplings, Quart. Rev. Biophys., 33, 371–424, https://doi.org/10.1017/S0033583500003656, 2000. a
Saio, T. and Ishimori, K.: Accelerating structural life science by paramagnetic lanthanide probe methods, Biochim. Biophys. Acta-Gen. Subj., 1864, 129332, https://doi.org/10.1016/j.bbagen.2019.03.018, 2020. a
Schmitz, C., Vernon, R., Otting, G., Baker, D., and Huber, T.: Protein structure determination from pseudocontact shifts using ROSETTA, J. Mol. Biol. 416, 668–677, https://doi.org/10.1016/j.jmb.2011.12.056, 2012. a
Shishmarev, D. and Otting, G.: How reliable are pseudocontact shifts induced in proteins and ligands by mobile paramagnetic tags? A modelling study, J. Biomol. NMR, 56, 203–216, https://doi.org/10.1007/s10858-013-9738-6, 2013. a, b
Su, X.-C. and Chen, J.-L.: Site-specific tagging of proteins with paramagnetic ions for determination of protein structures in solution and in cells, Acc. Chem. Res., 52, 1675–1686, https://doi.org/10.1021/acs.accounts.9b00132, 2019. a, b
Su, X. C., McAndrew, K., Huber, T., and Otting, G.: Lanthanide-binding peptides for NMR measurements of residual dipolar couplings and paramagnetic effects from multiple angles, J. Am. Chem. Soc., 130, 1681–1687, https://doi.org/10.1021/ja076564l, 2008. a, b
Ubbink, M. and Di Savino, A.: Chapter 5 Protein–Protein Interactions, in: Paramagnetism in Experimental Biomolecular NMR, edited by: Luchinat, C., Parigi, G., and Ravera, E., The Royal Society of Chemistry, Cambridge, 134–162, https://doi.org/10.1039/9781788013291-00107, 2018. a
Yagi, H., Pilla, K. B., Maleckis, A., Graham, B., Huber, T., and Otting, G.: Three-dimensional protein fold determination from backbone amide pseudocontact shifts generated by lanthanide tags at multiple sites, Structure, 21, 883–890, https://doi.org/10.1016/j.str.2013.04.001, 2013. a, b
Zimmermann, K., Joss, D., Müntener, T., Nogueira, E. S., Schäfer, M., Knörr, L., Monnard, F. W., and Häussinger, D.: Localization of ligands within human carbonic anhydrase II using 19F pseudocontact shift analysis, Chem. Sci., 10, 5064–5072, https://doi.org/10.1039/c8sc05683h, 2019. a, b, c, d