Phosphoserine for the generation of lanthanide-binding sites on proteins for paramagnetic nuclear magnetic resonance spectroscopy

Abstract Pseudocontact shifts (PCSs) generated by paramagnetic lanthanide ions provide valuable long-range structural information in nuclear magnetic resonance (NMR) spectroscopic analyses of biological macromolecules such as proteins, but labelling proteins site-specifically with a single lanthanide ion remains an ongoing challenge, especially for proteins that are not suitable for ligation with cysteine-reactive lanthanide complexes. We show that a specific lanthanide-binding site can be installed on proteins by incorporation of phosphoserine in conjunction with other negatively charged residues, such as aspartate, glutamate or a second phosphoserine residue. The close proximity of the binding sites to the protein backbone leads to good immobilization of the lanthanide ion, as evidenced by the excellent quality of fits between experimental PCSs and PCSs calculated with a single magnetic susceptibility anisotropy ( Δχ) tensor. An improved two-plasmid system was designed to enhance the yields of proteins with genetically encoded phosphoserine, and good lanthanide ion affinities were obtained when the side chains of the phosphoserine and aspartate residues are not engaged in salt bridges, although the presence of too many negatively charged residues in close proximity can also lead to unfolding of the protein. In view of the quality of the Δχ tensors that can be obtained from lanthanide-binding sites generated by site-specific incorporation of phosphoserine, this method presents an attractive tool for generating PCSs in stable proteins, particularly as it is independent of cysteine residues.


Introduction
Paramagnetic labels offer an attractive tool for the study of protein structure and function, as the magnetic moments of unpaired electrons generate long-range paramagnetic effects in nuclear magnetic resonance (NMR) spectra. Among the paramagnetic effects that can be observed in NMR spectra, pseudocontact shifts (PCSs) generated by paramagnetic metal ions stand out for their high information content and ease of observation (Otting, 2008;Parigi and Luchinat, 2018). Specifically, the PCSs provide information about the location of nuclear spins relative to the magnetic susceptibility anisotropy tensor ( χ tensor) associated with a paramagnetic metal ion, and this information can readily be obtained for nuclear spins as far as 40 Å from the paramagnetic centre (Bertini et al., 2001).
A common problem of lanthanide tags arises from mobility of the metal-ion complex relative to the target protein.
Paramagnetic lanthanide ions always generate paramagnetic relaxation enhancements (PREs) in the protein, which vary relatively little with minor movements of the metal ion. In contrast, PCSs can decrease dramatically if the lanthanide complex reorientates relative to the protein. With a limited degree of tag flexibility, the PCSs may still be explained by a single effective χ tensor, although, in principle, a family of χ tensors would be required to account for multiple tag conformations (Shishmarev and Otting, 2013). Wellimmobilized metal ions thus deliver not only larger PCSs, but also more reliable χ-tensor fits.
Different strategies have been devised to immobilize lanthanide ions on proteins. Tag motions can be restricted by short tethers and bulky lanthanide complexes to hem in the tag sterically (Nitsche and Otting, 2017). Double-arm tags provide two attachment points (Keizers and Ubbink, 2011), but even these designs have shown signs of tag mobility (Hass et al., 2010). A lanthanide-binding peptide (LBP) engineered into polypeptide loops of protein structures can deliver good metal immobilization but presents a major modification of the target protein (Barthelmes et al., 2011(Barthelmes et al., , 2017. Fusions of an LBP combined with disulfide bond formation have also been explored but do not necessarily achieve good immobilization of the lanthanide ion (Saio et al., 2009(Saio et al., , 2010(Saio et al., , 2011. A successful strategy has been a design where two neighbouring cysteine residues are furnished with metalchelating tags and a single lanthanide ion is coordinated by both chelating groups (Swarbrick et al., 2011;Welegedara et al., 2017), a design that has also proven successful for Co 2+ ions (Swarbrick et al., 2016). The most serious drawback of this design is its reliance on cysteine residues, which makes it incompatible with proteins that contain functionally important cysteine residues in their wild-type sequence. In fact, most of the currently available lanthanide tags target cysteines Keizers and Ubbink, 2011;Nitsche and Otting, 2017;Joss and Häussinger, 2019;Su and Chen, 2019;Saio and Ishimori, 2020), as thiol groups can readily undergo selective chemical reactions. To avoid the mobility of solvent-exposed cysteine side chains, tags have also been designed for attachment to the side chains of aromatic residues, which are more hindered sterically and thus discouraged from populating different rotamers (Loh et al., 2015;Abdelkader et al., 2016), but this approach results in long linkers between the lanthanide ion and the protein, increasing the chances that the lanthanide ion moves and reorientates relative to the protein backbone.
The most elegant strategy for generating a lanthanidebinding site in a protein would be to introduce a lanthanidebinding unnatural amino acid that can be site-specifically incorporated by genetic encoding. This approach would relieve any reliance on cysteine residues. Although systems for genetic encoding have been devised for over 100 different unnatural amino acids, only a few of these can bind metal ions (Dumas et al., 2014), and those that do were found to precipitate proteins upon binding lanthanide ions. For example, protein precipitation has been reported for 2amino-3-(8-hydroxyquinolin-3-yl) propanoic acid (HQ-Ala; Jones et al., 2009), and we found ourselves incapable of improving on these results. Similarly, bipyridyl-alanine (Bpa) was shown to allow binding of Co 2+ and the observation of PCSs (Nguyen et al., 2011), but subsequent experiments with Bpa incorporated in different proteins and at different sites showed that this system is also prone to precipitating proteins upon addition of metal ions.
In the present work, we explored the potential of a different unnatural amino acid, phosphoserine (Sep), to create a lanthanide-binding site. Lanthanide ions are known for their affinity for negatively charged oxygens and, with a pK a value of 5.6 for the equilibrium between monobasic and dibasic forms (Xie et al., 2005), a phosphoserine residue carries two negative charges under physiological conditions. Phosphorylation of serine residues is a well-known post-translational modification of proteins carried out by kinases, but this often is neither quantitative nor easily achievable for specific serine residues. Recently, however, an orthogonal phosphoseryl-tRNA-synthetase/tRNA pair has become available, which allows installation of a Sep residue in response to an amber stop codon (Lee et al., 2013;Pirman et al., 2015;Yang et al., 2016). In the following we show that the system is sufficiently effective at installing two Sep residues in the same protein. In addition, we explore the potential to create a lanthanide-binding site using a Sep residue in conjunction with other negatively charged residues, in particular an aspartate or a second Sep residue and demonstrate the exceptional quality of χ tensors that can be obtained with lanthanide ions in these sites.

Plasmid preparation for protein expression
The plasmid SepOTSλ, which contains the phosphoseryl-tRNA synthetase/tRNA pair and a suitable EF-Tu mutant for incorporation of Sep in response to an amber stop codon (Pirman et al., 2015), was obtained from Addgene. To create a T7 expression vector that is compatible with SepOTSλ, we subcloned the region containing the T7 promoter, ribosome binding site, multiple cloning site and T7 terminator from pETMCSIII (Neylon et al., 2000) into the plasmid pCDF (Lammers et al., 2014). The gene of interest was inserted into the multiple cloning site and furnished with a C-terminal His 6 -tag preceded by a TEV cleavage site. All plasmid constructions were conducted with a QuikChange protocol using mutant T4 DNA polymerase (Qi and Otting, 2019).

Protein expression
All proteins were expressed in the BL21 serB strain (Park et al., 2011), which lacks phosphoserine phosphatase and thus minimizes the dephosphorylation of phosphoserine to serine. The SepOTSλ and pCDF plasmids were co-transformed into electrocompetent BL21 serB cells. In order to minimize usage of amino acids and 15 NH 4 Cl, the following top-down expression method was used. Initially, 1 L of cell culture was grown in LB medium with 25 µM spectinomycin and 20 µM kanamycin at 37 • C until the OD 600 value reached 0.6-0.8. Next, the cells were pelleted and resuspended in 300 mL M9 medium (6 g L −1 Na 2 HPO 4 , 3 g L −1 KH 2 PO 4 , 0.5 g L −1 NaCl) and supplied with 1 g L −1 15 NH 4 Cl and 1 mM phosphoserine. Subsequently, the cells were incubated for 30 min at 37 • C and induced with IPTG. Protein expression was conducted at 25 • C overnight. Cells were harvested by centrifugation at 5000 g for 15 min and lysed by passing twice through a French press (SLM Amicon, USA) at 830 bar. The lysate was then centrifuged at 13 000 g for 60 min and the filtered supernatant was loaded onto a 5 mL Ni-NTA column (GE Healthcare, USA) equilibrated with binding buffer (50 mM Tris-HCl, pH 7.5, 300 mM NaCl, 5 % glycerol). The protein was eluted with elution buffer (binding buffer containing, in addition, 300 mM imidazole). For the doubleamber mutants, the protein was dialysed into TEV protease buffer (50 mM Tris-HCl, pH 8.0, 300 mM NaCl and 1 mM βmercaptoethanol) to remove the His 6 -tag by digestion with TEV protease overnight at 4 • C. His 6 -tagged TEV protease was added in a 0.1 molar ratio. The protease and cleaved His 6 -tag were removed by running the sample again over a Ni-NTA column. The resulting protein samples were then treated with 5 mM EDTA to remove any di-or tri-valent metal ions that could have been adsorbed during protein expression and purification. Finally, EDTA was removed by buffer exchange with the NMR buffer (20 mM HEPES-KOH, pH 7.0) using a HiPrep desalting column (GE Healthcare, USA). Mass-spectrometric analysis was conducted using an Elite Hybrid Ion Trap-Orbitrap mass spectrometer (Thermo Scientific, USA) coupled with an UltiMate S4 3000 UHPLC (Thermo Scientific, USA); 7.5 pmol of sample was injected into the mass analyser via an Agilent ZORBAX SB-C3 Rapid Resolution HT Threaded Column (Agilent, USA).

NMR spectroscopy
All NMR spectra were recorded at 25 • C, using an 800 MHz Bruker Avance NMR spectrometer for all mutants containing a single phosphoserine residue and a 600 MHz Bruker Avance NMR spectrometer for all mutants containing two phosphoserine residues. Samples were prepared in 20 mM HEPES buffer, pH 7.0, in 3 mm NMR tubes; 10 % D 2 O was added to provide a lock signal; 0.1-0.5 mM protein samples were used for 2D [ 15 N, 1 H]-HSQC experiments. Complexes with lanthanides were obtained by titration with 10 mM LnCl 3 stock solutions in water. The samples were titrated with paramagnetic lanthanides until the NMR spectra predominantly displayed cross-peaks of the paramagnetic species with minimal interference of cross-peaks from the diamagnetic species. Residual weak cross-peaks from the diamagnetic protein were accepted to avoid over-titration, which could lead to metal binding at additional sites. For the diamagnetic reference with Y 3+ ions, the reference samples were prepared with the same quantity of YCl 3 stock as the paramagnetic samples.

PCS measurements and ∆χ -tensor fitting
PCSs were measured in ppm as the difference in amide proton chemical shift between the paramagnetic and diamagnetic NMR spectra. PCSs were used to determine the position and orientation of the χ tensor of the paramagnetic ions relative to the protein structure. Fitting of χ tensors was performed using the program Paramagpy (Orton et al., 2020).

Isothermal titration calorimetry
Isothermal calorimetric titration experiments were performed using a Nano-ITC low-volume calorimeter (TA Instruments, USA) at 25 • C with stirring at 250 rpm. The protein mutant E18Sep and the titrants TbCl 3 and TmCl 3 were prepared in the same buffer (20 mM HEPES, pH 7.0) and degassed before use. Data were analysed using the programs NITPIC and SEDPHAT (Keller et al., 2012). The baseline-subtracted power peaks were integrated and the integrated heat values fitted to the single binding site model (A + B ↔ AB, heteroassociation) to obtain the dissociation constant (K d ). The global fitting was done by repeatedly cycling between Marquardt-Levenberg and simplex algorithms in SEDPHAT until modelling parameters converged; 68 % confidence intervals were calculated using the automatic confidence interval search with the projection method using F-statistics in SEDPHAT.

Phosphoserine incorporation
Simultaneous transfection of E. coli with the SepOTSλ plasmid and pET vectors containing the genes of proteins targeted for overexpression and Sep incorporation led to slow cell growth and variable colony sizes on plates as described earlier (Pirman et al., 2015). Noting that the SepOTSλ plasmid contains the origin of replication of pUC, which belongs to the same plasmid incompatibility group as pET vectors (Morgan, 2014), we constructed a new expression vector based on pCDF to include T7 promoter, ribosome binding site, multiple cloning site and T7 terminator. This modification restored the usual growth rates of the cells. Proteins containing phosphoserine were expressed from a two-plasmid  and Tm 3+ plotted in red and blue, respectively. Right panel: blue and red PCS isosurfaces plotted on the protein structure and indicating PCSs of ±1 ppm, respectively. The isosurfaces illustrate the χ tensors obtained with Tb 3+ (upper structure) and Tm 3+ (lower structure). The side chains of E16 and the phosphoserine residue in position 18 are shown in a stick representation. system containing SepOTSλ and a modified pCDF vector in E. coli BL21 serB. Expression yields of up to 3 mg purified protein per litre of growth medium were obtained.

Single phosphoserine residues for lanthanide binding
We used the proteins ubiquitin and GB1 to test whether a single phosphoserine residue is sufficient to create a lanthanidebinding site. We hypothesized that a phosphoserine residue assisted by an additional carboxyl group from a glutamate or aspartate residue (in the following referred to as "helper residue") could potentially be sufficient to generate a tridentate complex with a lanthanide ion, positioning the metal ion close to the protein backbone and compensating its positive charge. In the first example, we made the mutant E18Sep of ubiquitin, where Glu16 and Asp21 could act as potential helper residues. Subsequent titration with Tb 3+ ions suc-ceeded in generating PCSs of up to almost 1 ppm (Table S1). The paramagnetic peaks appeared at chemical shifts different from the diamagnetic parent peaks, indicating slow exchange between lanthanide-bound and free protein. Isothermal calorimetric experiments with Tb 3+ and Tm 3+ ions indicated dissociation constants of about 30-50 µM (Fig. S1). Figure 1a shows the PCSs observed with Tb 3+ and Tm 3+ ions. Using the NMR ensemble structure of ubiquitin (PDB ID: 2KOX; Fenwick et al., 2011) and the measured PCSs, the metal position was determined by fitting the χ tensor using the program Paramagpy (Table 1; Orton et al., 2020). The correlation between back-calculated and experimental PCSs was excellent (Fig. 2a), resulting in Q factors lower than 0.04. This indicated that the Sep residue and lanthanide complex did not alter the structure of the protein. Furthermore, the tensor fit positioned the lanthanide ion between the phosphoserine residue and Asp21, suggesting that Asp21 acts as a helper residue rather than Glu16. To verify this result, we prepared the two ubiquitin mutants E18Sep/E16Q and E18Sep/D21N. As expected, the former delivered similar PCSs (Fig. 1b, Table S1), a similarly good χ-tensor fit (Table 1) and a similar metal position, whereas the latter showed only very small chemical shift changes upon titration with lanthanides, indicating a faster exchange (Fig. 1c). The paramagnetic centre identified by the fits placed the lanthanide ions between the aspartate and Sep residues as expected (Fig. 2b).

Phosphoserine and aspartate for introducing a lanthanide-binding site into GB1
The scheme of combining a phosphoserine with an aspartate helper residue to create a lanthanide-binding site was also successful with the GB1 mutant K10D/T11Sep, where Tb 3+ ions generated PCSs as large as 0.55 ppm (Fig. 3a, Table S2) and, as for the ubiquitin mutants, the lanthanide complex was in slow exchange with the free protein. The χtensor fit produced an excellent correlation between backcalculated and experimental PCSs with a Q factor of 0.01, indicating good immobilization of the lanthanide ion (Fig. 3c, Table S2). The best fit of the χ tensor positioned the lanthanide between the phosphoserine and aspartic acid residues as expected (Fig. 3e).

Double-phosphoserine motifs in GB1
Next we assessed the possibility of generating a lanthanidebinding motif by the introduction of two phosphoserine residues. For comparison with the GB1 mutant K10D/T11Sep, a double-amber mutant of GB1 was made to replace both Lys10 and Thr11 with phosphoserine. The protein was obtained in good yield (1.5 mg from 1 L of cell culture) despite the presence of two amber stop codons. Successful double-amber suppression was confirmed by mass spectrometry (Fig. S2a). Following titration with Tb 3+ ions, we observed PCSs up to 1 ppm (Fig. 3a, Table S2). The χtensor fit indicated that the lanthanide ion binds between the Sep residues in positions 10 and 11 as expected and the agreement between back-calculated and experimental PCSs was excellent ( Fig. 3d and f). The very low Q factor associated with the χ-tensor fit (Table 2) demonstrates that the PCSs are adequately explained by a single χ tensor, indicating the absence of averaging between different tensors arising from translational movements of the paramagnetic centre.
In previous work, we reported that two nitrilotriacetic acid (NTA) tags attached to cysteine residues in positions i and i + 4 of an α-helix yielded larger PCSs with lanthanides than a single NTA tag combined with an acidic helper residue (Swarbrick et al., 2011). In view of this result, we also attempted to position two phosphoserine residues in positions i and i + 4 of the α-helix of GB1. About 1 mg of GB1 A24Sep/K28Sep was obtained from 300 mL cell culture, and the successful and complete incorporation of two Sep residues was confirmed by mass spectrometry (Fig. S2b).
Following titration with Tb 3+ ions, PCSs of up to 3 ppm were observed (Table S2). Figure 4a shows the PCSs observed with Tb 3+ and Tm 3+ ions following titration to lanthanide : protein ratios at which the diamagnetic peaks had completely or mostly disappeared. Excess lanthanide ions resulted in significant peak broadening, indicating weak binding of the excess lanthanide ions to less specific sites. The χ -tensor fits to the crystal structure of GB1 revealed relatively large χ tensors and a small Q factor ( Fig. 4b and Table 2), indicating good immobilization of the lanthanide ion. The paramagnetic centre identified by the fits placed the lanthanide ions between the two Sep residues as expected (Fig. 4c).

Double-phosphoserine incorporation into other proteins
To test the broader validity of double-phosphoserine motifs as lanthanide-binding sites, we generated double-amber mutants for double-phosphoserine incorporation in 16 different sites in four different proteins (Fig. S3). The doubleamber mutations were designed to position two phosphoserine residues in α-helices (positions i and i + 4), loops (positions i and i + 2), and β-strands (positions i and i + 2, as well as two positions located in parallel β-strands).
Among the constructs made of GB1, ubiquitin, E. coli PpiB, Zika virus NS2B-NS3 protease, and the N-terminal ATPbinding domain of Plasmodium falciparum Hsp90 (Hsp90-N), in vivo expression attempts produced protein only for two of the constructs, namely Hsp90-N S36Sep/D40Sep (where the phosphoserine residues are in an α-helix) and ubiquitin T66/H68 (where the phosphoserine residues are in a βstrand). All the other constructs failed to produce protein.
Disappointingly, neither Hsp90-N S36Sep/D40Sep nor ubiq-    uitin T66Sep/H68Sep displayed any PCSs upon titration with paramagnetic lanthanides. The difficulties in expressing most of the doublephosphoserine mutants were not due to expression into insoluble inclusion bodies, as we did not find the proteins in the insoluble fraction after cell lysis. As the read-through efficiency of amber stop codons has been reported to depend on neighbouring nucleotides (Pott et al., 2014), we tested the incorporation of Boc-lysine (BoK) to produce ubiquitin A28BoK/D32BoK, E. coli PpiB K25BoK/D29BoK and GB1 T51BoK/T53BoK, using a previously published pyrrolysyl-tRNA synthetase/tRNA pair (Bryson et al., 2017). All these proteins were expressed successfully (Fig. S4), demonstrating that the difficulty in expressing these mutants with two phosphoserine residues arises not simply from the difficulty in reading through two amber stop codons in the same gene. These observations suggest that too many negatively charged amino acids located in close proximity interfere with protein folding, making the protein prone to proteolytic degradation during overexpression. Likewise, the ubiquitin mutant A28Sep/D32Sep could not be overexpressed, whereas the single mutant A28Sep was produced in good yield. Unfortunately, ubiquitin A28Sep did not display PCSs following titration with TbCl 3 (data not shown).

Lanthanide binding by three amino acid side chains
The high failure rate of double-phosphoserine incorporation prompted us to carefully assess the two GB1 double-Sep mutants that did express and deliver PCSs. Notably, both constructs feature an additional glutamate residue near the lanthanide-binding site, which could potentially assist with the binding of the lanthanide. Specifically, Glu26 is near the lanthanide-binding site of GB1 A24Sep/K28Sep (Fig. 5a), and the side chain of Glu56 is near the loop region harbouring the K10Sep/T11Sep mutations and could point towards the two phosphoserines in the loop (Fig. 5b). Indeed, the lanthanide positions determined by the χtensor fits are not simply between the two phosphoserine side chains, but potentially also within reach of the sidechain carboxyl groups of the nearby glutamate residues. The excellent Q factors associated with the χ -tensor fits (Table 2) suggest that the metal positions are reliable. Notably, none of the other double-phosphoserine mutants investigated (Fig. S3) provided the possibility of additional lanthanide coordination by a negatively charged helper residue. To test the functional importance of Glu26, we produced the GB1 A24Sep/K28Sep/E26N triple mutant and probed for lanthanide binding. Indeed, this mutant produced no PCSs upon titration with TbCl 3 .

Effect of salt bridges
In wild-type proteins, most aspartate and glutamate residues are located sufficiently close to positively charged side chains that they can engage in salt bridges. This raises the question whether such salt bridges can affect the lanthanidebinding affinity of sites constructed with negatively charged residues by compensating some of the negative charge. For example, the ubiquitin mutant T22Sep/N25D features a lysine residue (Lys29) in the α-helix harbouring Asp25, with the potential to form a salt bridge (Fig. 6a). To test the effect of this interaction, we replaced Lys29 by glutamine in the mutant T22Sep/N25D/K29Q. Indeed, while the mutant T22Sep/N25D displayed only very small PCSs with Tb 3+ ions if any (Fig. 7a), the mutant T22Sep/N25D/K29Q displayed PCSs up to 0.3 ppm (Fig. 7b, Table S1). Using the NMR ensemble structure of ubiquitin (PDB ID: 2KOX) and the measured PCSs, we determined the metal position in the triple mutant by fitting the χ tensor. The correlation be-tween back-calculated and experimental PCSs was excellent, resulting in a Q factor of 0.03 (Fig. 7c, Table 1).
Similarly, fitting of a χ tensor to the small PCSs observed for the ubiquitin mutant Q2D/E64Sep (Fig. S5), which has a lysine residue in position 63, suggested metal coordinates far from the protein, which is a hallmark of a variable metal position (Shishmarev and Otting, 2013). Unfortunately, the attempt to remove the potential salt bridge between Lys63 and the Sep residue in position 64 in the triple mutant Q2D/K63Q/E64Sep resulted in a construct that failed to express. Attempts to express the ubiquitin mutant R54Sep and the GB1 mutant K50Sep failed. We speculate that this may be due to the destabilizing effect associated with the disruption of salt bridges involving these sites (Fig. S6a). Conversely, the ubiquitin mutant T55Sep and the GB1 mutant A24Sep expressed in high yield but did not display PCSs upon titration with paramagnetic lanthanides. The structure of ubiquitin indicates that a Sep residue in position 55 could form a salt bridge with Arg54, and the structure of GB1 suggests that a Sep residue in position 24 could form a salt bridge with Lys28 (Fig. S6b). These results suggest that the expression even of highly stable proteins like ubiquitin and GB1 can be affected by the presence of too many charges in close proximity, while compensating the negative charge density by salt bridges affects lanthanide binding.

Protein unfolding due to charge repulsion
Our failure to produce most of the proteins designed with two phosphoserine residues in close proximity led us to hypothesize that low expression yields could in part be caused by unfolding due to electrostatic repulsion, which would increase susceptibility to proteolytic degradation during expression in E. coli. Supporting evidence came from two observations. First, the GB1 mutant K4D/I6Sep displayed an NMR spectrum characteristic of an unfolded protein (Fig. S7). In wildtype GB1, Glu15 is in close proximity to Lys4 (Fig. 6c). By disrupting this salt bridge, the mutant K4D/I6Sep contains several uncompensated negative charges in close proximity (Fig. 6d). Alternatively, Glu15 could also form a salt bridge with Lys13. Therefore we attempted to reduce the number of Figure 6. Single-site phosphoserine mutants tested for the effect of positively charged residues nearby. The locations of selected residues are highlighted by displaying side-chain atoms, with phosphoserine in magenta, aspartate in red, glutamate in orange and lysine residues in blue. Side-chain conformations are those of the crystal structure, except for phosphoserine, which was modelled. negative charges by producing the mutant K4D/I6Sep/E15Q. Unfortunately, this mutant failed to express.
The second piece of evidence of charge-driven unfolding came from phosphoserine mutants of Hsp90-N. Although the wild-type protein can be produced in good yield, the single-phosphoserine mutants K70Q/T71Sep, K70Q/N72Sep, K70Q/N72Sep/D69, N72Sep, Q54D/S57Sep and R98D/S99Sep (Fig. S8a) failed to express, and the mutants D88Sep/N91D, E162D/T163Sep and K160Q/E162D/T163Sep (Fig. S8b) were produced only in very low yields. Only the mutant N91Sep (Fig. S8c) expressed in sufficient yield for isotope labelling. Its [ 15 N, 1 H]-HSQC spectrum showed evidence of partial unfolding, as the signals of many amides vanished, while new peaks appeared at chemical shifts characteristic of unfolded proteins. Assignment of the well-resolved cross-peaks by comparison with the wild-type protein showed that the β-sheet of Hsp90-N was conserved in the N91Sep mutant, whereas no evidence was found of structural conservation of the protein region near residue 91 (Fig. S9). Notably, Hsp90-N is a protein of limited stability that is prone to precipitation and degradation within a couple of days.

Discussion
The present study shows the potential of phosphoserine to generate lanthanide-binding sites on proteins. Using phosphoserine to construct lanthanide-binding sites in proteins is uniquely attractive for multiple reasons. (i) Systems are available to genetically encode phosphoserine as an unnatural amino acid for site-specific insertion into polypeptide chains (Pirman et al., 2015). This provides facile access to the requisite protein mutants. The main alternative way, in which lanthanide ions can be attached to an unnatural amino acid, relies on copper-catalysed click chemistry of alkyne tags with a site-specifically introduced p-azidophenylalanine residue (Loh et al., 2013(Loh et al., , 2015. In our hands, about half of the proteins have proven to precipitate quantitatively when exposed to the copper catalyst. (ii) Phosphoserine allows us to construct the lanthanide-binding site without the need for post-translational modification by a lanthanide-binding chemical tag. Without the need for chemical modification, the approach is independent of the presence or absence of cysteine residues or whether the target protein tolerates the chemicals needed for specific tagging. (iii) The side chain of phosphoserine is relatively short, leading to a lanthanide position close to the protein backbone. This makes it easier to predict the position of the lanthanide ion relative to the protein. While a single phosphoserine residue is not sufficient to bind a lanthanide ion with high affinity, this study shows that a nearby aspartate residue can assist in forming a good lanthanide-binding site, with the lanthanide ion coordinated by both the phosphoserine and aspartate residues. This delivers a better localization of the lanthanide ion than most of the chemical tags designed for binding to cysteine residues and, hence, χ-tensor fits with very small Q factors can be obtained. The small size of the Q factors also indicates that the introduction of a phosphoserine residue does not induce any significant conformational changes in the target protein. High-quality χ -tensor fits open the door for exploiting PCSs as accurate long-range restraints in structural biology.
Exceptionally low Q factors were obtained for a lanthanide-binding site in GB1, which was made of two phosphoserine residues in positions i and i + 4 of the α-helix together with Glu15. The site also generated relatively large χ tensors, indicating excellent immobilization of the metal ion relative to the protein (Shishmarev and Otting, 2013) as well as full conservation of the 3D structure of the protein. Two phosphoserine residues in a loop region of GB1 also produced a very small Q factor. It was disappointing, however, that attempts to produce other proteins with two phosphoserine residues met with a high failure rate. This may be explained by a failure to fold due to many negatively charged residues located in close proximity (Baneyx and Mujacic, 2004), resulting in degradation of the proteins during expression. We succeeded in producing double-phosphoserine mutants of only two proteins other than GB1. These were the Hsp90-N mutant S36Sep/D40Sep and the ubiquitin mutant T66Sep/H68Sep. Both expressed in good yield but failed to produce PCSs with lanthanides. Furthermore, the absence of paramagnetic relaxation enhancements upon titration with lanthanides indicated the failure to bind. Inspection of the 3D structures of these proteins indicated that nearby residues with positively charged side chains were in positions capable of at least partially compensating the negative charges of the phosphoserine residues. The fact that the ubiquitin mutant T22Sep/N25D/K29Q produced much better PCSs than the mutant T22Sep/N25D (Fig. 7) illustrates the potentially detrimental effect of salt bridges on lanthanide binding.
The χ tensors obtained with Tm 3+ instead of Tb 3+ ions were unexpectedly low for the single-Sep mutants but not for the GB1 mutant A24Sep/K28Sep. We observed previously that the ratio between the χ ax components of these two ions can vary between different tags and even for the same tag at different sites of a protein (Loh et al., 2015). These differences are not an artifact of fitting the tensors for Tm 3+ and Tb 3+ independently, as, with the exception of ubiquitin E18Sep, the fits converged to very similar metal positions (Tables 1 and 2). We do not understand the origin of different magnitudes of χ-tensor anisotropies for Tm 3+ and Tb 3+ ions. In addition, much larger χ tensors have been reported for sterically rigid cyclen tags (Joss and Häussinger, 2019), suggesting that a rigid ligand field promotes large χ tensors.
Using tags preloaded with metal ions saves subsequent titration of the protein samples after tagging. In our titration experiments, using calculated amounts of lanthanide ions did not always deliver the optimal ratio of metal to protein, resulting in over-or under-titration. The most successful approach to establishing the paramagnetic complexes was operational, namely by titrating until the original signal of the protein had vanished or, at least, substantially decreased (to avoid over-titration and potential binding to other sites). Under-titration was not problematic, as the paramagnetic peaks are well resolved due to PCSs. Incomplete satu-ration with diamagnetic yttrium would mainly affect signals of amide protons near the metal-binding site, which are unobservable in the paramagnetic samples due to PREs.
In summary, when designing lanthanide-binding sites with phosphoserine residues, a single phosphoserine residue in combination with an aspartate can deliver binding affinities in the micromolar range, but positively charged side chains near the designed lanthanide-binding site can compromise its ability to bind lanthanides. At the same time, the difficulty in producing proteins that contain many negatively charged residues in close proximity points to the importance of salt bridges to ensure the structural integrity of proteins.

Conclusions
The present study demonstrates, for the first time, that a lanthanide-binding motif can be introduced into a protein via genetically encoded unnatural amino acids without further chemical modification. It is particularly promising that the lanthanide-binding motif can be generated in either an α-helix or a loop region by a single phosphoserine residue combined with an aspartate, provided these residues are not engaged in salt bridges. While two phosphoserine residues potentially bind lanthanide ions even more strongly, too many negatively charged residues in close proximity tend to severely affect the in vivo expression yields as well as the folding of the target protein. For proteins, where lanthanidebinding sites can successfully be installed with the help of phosphoserine residues, however, χ tensors of extraordinary quality can be obtained.
Author contributions. GO initiated the project and edited the final version of the manuscript. SMT performed most experiments and wrote the first version of the manuscript. MCM produced and analysed all protein mutants with two Sep residues. CTL established the phosphoserine incorporation protocols and performed the first successful PCS measurements with lanthanide ions. IA produced wild-type and mutant samples of Hsp90-N, provided NMR resonance assignments and analysed partially unfolded mutants.
Competing interests. The authors declare that they have no conflict of interest.

Special issue statement.
This article is part of the special issue "Robert Kaptein Festschrift". It is not associated with a conference.
Financial support. This research has been supported by the Australian Research Council (grant no. FL170100019) and the Centre of Excellence for Innovations in Peptide and Protein Science, Australian Research Council (grant no. CE200100012).
Review statement. This paper was edited by Rolf Boelens and reviewed by Claudio Luchinat and Marcellus Ubbink.