Site-selective generation of lanthanoid binding sites on proteins using 4-fluoro-2,6-dicyanopyridine

Abstract The paramagnetism of a lanthanoid tag site-specifically installed on a protein provides a rich source of structural information accessible by nuclear magnetic resonance (NMR) and electron paramagnetic resonance (EPR) spectroscopy. Here we report a lanthanoid tag for selective reaction with cysteine or selenocysteine with formation of a (seleno)thioether bond and a short tether between the lanthanoid ion and the protein backbone. The tag is assembled on the protein in three steps, comprising (i) reaction with 4-fluoro-2,6-dicyanopyridine (FDCP); (ii) reaction of the cyano groups with α-cysteine, penicillamine or β-cysteine to complete the lanthanoid chelating moiety; and (iii) titration with a lanthanoid ion. FDCP reacts much faster with selenocysteine than with cysteine, opening a route for selective tagging in the presence of solvent-exposed cysteine residues. After loading with Tb3+ and Tm3+ ions, the tag produced pseudocontact shifts in protein NMR spectra, confirming that the tag delivers good immobilisation of the lanthanoid ion relative to the protein, which was also manifested in residual dipolar couplings. Completion of the tag with different 1,2-aminothiol compounds resulted in different magnetic susceptibility tensors. In addition, the tag proved suitable for measuring distance distributions in double electron–electron resonance experiments after titration with Gd3+ ions.

Correspondence to: Gottfried Otting (gottfried.otting@anu.edu.au)

The copyright of individual parts of the supplement might differ from the article licence.

Table S1. PCSs of backbone amide protons generated with TbCl3 and TmCl3 in GB1 Q32C with DCP-(Cys)2 tags of different chirality
Table S2. PCSs of backbone amide protons generated with TbCl3 and TmCl3 in GB1 Q32C with DCP tags reacted with penicillamines of different chirality
Table S3. 1DHN RDCs of backbone amides of GB1 Q32C with different DCP tags
Table S4. Nucleotide sequences of the genes of ERp29 with TEV site preceding a C-terminal His6 tag, GB1 preceded by a MASMTG tag and followed by a TEV site and C-terminal His6 tag, and list of mutation primers used

Synthesis of (S)-3-amino-4-mercaptobutanoic acid hydrochloride (β-cysteine, 5)
A synthesis of β-cysteine was described by Birkofer and Birkofer (1956) without spectroscopic characterisation data of the final compound. Below is a detailed description of the synthetic route chosen in the present work.
The residue was dissolved in EtOAc (50 mL) and anhydrous HCl (2 M in ether; 6 mL) was added. The volatiles were removed under reduced pressure and the HCl treatment was repeated 3 times. The glue-like residue was suspended in EtOAc (30 mL) and sonicated for 30 minutes, which promoted crystallization of the amorphous substance. The reaction product was collected by centrifugation and washed with additional EtOAc (2 × 5 mL). After drying under vacuum, 1.05 g (91 %) of white microcrystalline solid was obtained.

Synthesis of DCP-(L-Cys)2
2,6-Pyridinedicarbonitrile (41.0 mg, 0.317 mmol) and L-cysteine hydrochloride (100 mg, 0.634 mmol) were added to 2 mL MeOH/water (1:1). The pH was adjusted to 7 with KOH and the mixture was stirred for 3 days. The solvent was removed under reduced pressure and the crude product was purified using a 10 g silica gel cartridge (5-20 % water:EtOH) to yield the title compound (44.2 mg, 41 %) as an off-white crystalline powder.

Thr53 -13.9; Val54 -9.9; Thr55 -9.9; Glu56 -2.9

Table S4. Nucleotide sequences of the genes of ERp29 with TEV site preceding a C-terminal His6 tag, GB1 preceded by a MASMTG tag and followed by a TEV site and C-terminal His6 tag, and list of mutation primers used.
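
The amounts quoted for the DCP-(L-Cys)2 synthesis above can be cross-checked with a short calculation. The Python sketch below is illustrative only. The molecular formulae (C7H3N3 for 2,6-pyridinedicarbonitrile, C3H8ClNO2S for anhydrous L-cysteine hydrochloride, C13H11N3O4S2 for the bis-thiazoline product) and the rounded atomic weights are assumptions that are not stated in the text.

```python
import re

# Rounded standard atomic weights in g/mol (assumed values, not from the text).
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007,
                  "O": 15.999, "S": 32.06, "Cl": 35.45}

def molar_mass(formula):
    """Molar mass in g/mol of a simple formula such as 'C7H3N3' (no brackets)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[element] * (int(count) if count else 1)
    return total

def amount_mmol(mass_mg, formula):
    """Amount of substance in mmol for a mass given in mg."""
    return mass_mg / molar_mass(formula)

def percent_yield(product_mg, limiting_mmol, product_formula):
    """Isolated yield in percent relative to the limiting reagent."""
    theoretical_mg = limiting_mmol * molar_mass(product_formula)
    return 100.0 * product_mg / theoretical_mg

# Assumed formulae (hypothetical labels introduced for this example).
DINITRILE = "C7H3N3"
CYS_HCL = "C3H8ClNO2S"
PRODUCT = "C13H11N3O4S2"

n_dinitrile = amount_mmol(41.0, DINITRILE)
n_cys = amount_mmol(100.0, CYS_HCL)
yield_pct = percent_yield(44.2, n_dinitrile, PRODUCT)

print(f"dinitrile: {n_dinitrile:.3f} mmol, cysteine HCl: {n_cys:.3f} mmol")
print(f"yield: {yield_pct:.0f} %")
```

Under these assumptions the calculation is consistent with the quoted 1:2 stoichiometry and with a yield of about 41 %.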
Figure S2. NMR spectra of 0.6 mM DCP-(L-Cys)2 in D2O at 25 °C. Spectra recorded on a 600 MHz NMR spectrometer. (a) 1H-NMR spectrum with presaturation of the HDO signal. Small signals at about 7, 5.4 and 2.9 ppm are from a minor conformer that is in chemical exchange with the main species. (b) [13C,1H]-HSQC spectrum. Positive (black) and negative (red) cross-peaks are from CH2 and CH groups, respectively.

Figure S3. Deconvoluted intact protein mass spectra of uniformly 15N-labelled GB1 Q32C. Left panel: before tagging reaction with FDCP. Centre panel: after reaction with FDCP. Right panel: after the complete assembly of the DCP-(L-Cys)2 tag on the protein (expected mass increase 336 Da).

Figure S4. Deconvoluted intact protein mass spectra of calmodulin K148U. Left panel: before reaction with FDCP. The calculated mass is 16785.30 Da. Right panel: after the reaction. The reaction was carried out at 25 °C for 10 minutes. The expected mass increase is 128 Da.

Figure S5. Deconvoluted intact protein mass spectra illustrating the reactivity of the FDCP tag with cysteine residues of different solvent exposure. The spectra in the left and right panels show the whole-protein mass spectra before and after reaction with the tag, respectively. The expected mass increase upon addition of a single DCP tag is 128 Da. (a) E. coli PpiB. The protein contains two buried cysteine residues. The calculated mass of the untagged protein is 18976.32 Da. (b) N-terminal domain of P. falciparum Hsp90. The protein contains a single cysteine residue with limited solvent exposure. Calculated mass (without tag): 27016.40 Da. (c) Rat ERp29. The protein contains a single cysteine residue with partial solvent exposure. Calculated mass (without tag): 27415.20 Da. (d) Rat ERp29 G147C/C157S. The protein contains one highly solvent-exposed cysteine residue. Calculated mass (without tag): 26563.28 Da. (e) SARS-CoV-2 main protease (Mpro). The protein contains 12 cysteine residues, three of which are partially solvent exposed (including the active-site residue C145). Calculated mass (without tag): 33851.55 Da. (f) Intracellular domain of p75NTR. The protein contains two highly solvent-exposed cysteine residues. Calculated mass (without tag): 16606.23 Da.

Figure S6. Deconvoluted intact protein mass spectra showing the reaction of the FDCP tag with cysteine residues in the protein (expected mass increase per DCP tag is 128 Da) and subsequent reaction of each DCP tag with two cysteine molecules (expected mass increase per DCP tag is 208 Da). Left panel: before reaction with the tag. Calculated masses are 42648.53 and 26591.96 Da, respectively. Centre panel: after reaction with the FDCP tag. Right panel: after reaction with excess free cysteine to complete the metal binding tag. (a) MBP T237C/T345C. (b) ERp29 G147C/C157S.
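
The expected peak positions in mass spectra of this kind follow from simple addition of the quoted increments. The Python sketch below is a minimal illustration using the 128 Da and 208 Da increments from the caption. It assumes that the first calculated mass belongs to panel (a) and the second to panel (b), and that MBP T237C/T345C carries two reactive cysteine residues and ERp29 G147C/C157S one. The function names are hypothetical.

```python
# Mass increments quoted in the Fig. S6 caption (Da).
DCP_SHIFT_DA = 128.0         # per cysteine after reaction with FDCP
COMPLETION_SHIFT_DA = 208.0  # per DCP tag after reaction with two cysteine molecules

def expected_masses(untagged_da, n_sites):
    """Return (mass after FDCP reaction, mass after tag completion) in Da."""
    after_fdcp = untagged_da + n_sites * DCP_SHIFT_DA
    after_completion = after_fdcp + n_sites * COMPLETION_SHIFT_DA
    return after_fdcp, after_completion

def tags_from_shift(observed_da, untagged_da, shift_per_tag_da=DCP_SHIFT_DA):
    """Estimate the number of attached tags from an observed mass shift."""
    return round((observed_da - untagged_da) / shift_per_tag_da)

# Assumed assignment of calculated masses to the two proteins (see lead-in).
mbp_fdcp, mbp_complete = expected_masses(42648.53, n_sites=2)
erp29_fdcp, erp29_complete = expected_masses(26591.96, n_sites=1)

print(f"MBP T237C/T345C: {mbp_fdcp:.2f} Da, then {mbp_complete:.2f} Da")
print(f"ERp29 G147C/C157S: {erp29_fdcp:.2f} Da, then {erp29_complete:.2f} Da")
```

The two increments sum to 336 Da per site, which matches the mass increase quoted for the complete DCP-(L-Cys)2 tag in Fig. S3.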

Figure S7. UV/Vis absorption spectrum of DCP-(L-Cys)2. The spectrum of the DCP-(L-Cys)2 fraction was recorded during an HPLC-MS run, using an Agilent mass spectrometer equipped with a reverse-phase column, a gradient from 5 % MeOH:water to 90 % MeOH:water in the presence of 0.1 % TFA and a temperature of 30 °C. Using a separate sample of DCP reacted with excess L-cysteine, the molar extinction coefficient ε at 280 nm was determined to be 6850 M-1 cm-1. A sample of DCP reacted in the same way with excess L-penicillamine yielded ε280 = 5400 M-1 cm-1.
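
The extinction coefficients above can be used to estimate tag concentrations through the Beer-Lambert law, A = ε c l. The Python sketch below is a minimal illustration for the free tag. The absorbance value and the 1 cm path length are hypothetical, and any additional absorbance at 280 nm (for example from a protein) is ignored.

```python
# Molar extinction coefficients at 280 nm from the Fig. S7 caption (M^-1 cm^-1).
EPSILON_280 = {"L-cysteine": 6850.0, "L-penicillamine": 5400.0}

def tag_concentration_molar(a280, completed_with="L-cysteine", path_cm=1.0):
    """Concentration in mol/L from A280 via Beer-Lambert: c = A / (epsilon * l)."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return a280 / (EPSILON_280[completed_with] * path_cm)

# Hypothetical reading: A280 = 0.137 in a 1 cm cuvette.
conc = tag_concentration_molar(0.137)
print(f"{conc * 1e6:.1f} uM")
```

For a tagged protein, the absorbance of the protein itself would have to be accounted for separately.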

Figure S10. Distance distributions of Fig. 4 analysed by DeerNet as implemented in DeerAnalysis2022 (Worswick et al., 2018), with colour coding of the reliability regions as defined in DeerAnalysis (Jeschke et al., 2006), corresponding to the DEER evolution time used (green: the shape of the distance distribution is reliable; yellow: the mean distance and distribution width are reliable; orange: the mean distance is reliable; red: long-range distance contributions may be detectable but cannot be quantified). The solid lines represent the distributions with the best r.m.s.d. from the experimental data and the striped regions represent the variation of alternative distributions (±2 times the standard deviation) obtained by varying the parameters of the background correction and noise as calculated by the validation tool in the DeerAnalysis software package. The parameter ranges used for the validations were the default ones: white noise 0-1.5, background start 0.2*tmax-0.6*tmax, and background dimension 3-3.6.

Figure S12.Distance distribution of MBP T237C/T345C (Fig. 5 of the main text) analysed by DeerNet (Worswick et al., 2018) with colour coding of the reliability regions as defined in Fig. S10.

Figure S13. Conformation of the DCP-(β-Cys)2-Gd tag attached to cysteine as used for modelling distance distributions. Dihedral angles χ of rotatable bonds are labelled. Blue, red, yellow and magenta balls identify atoms of nitrogen, oxygen, sulfur and gadolinium, respectively. The conformation was modelled using ChemDraw, which placed the metal ion 1.9 Å from the pyridine nitrogen and 2.3 Å from the thiazoline nitrogens.

Figure S16. Diffusion experiment of DCP-(L-Cys)2 in the presence of YCl3. The spectrum was recorded of a 0.3 mM solution of DCP-(L-Cys)2 in 10 mM HEPES buffer (pH 7) in D2O with 0.5 equivalents of YCl3, using an 800 MHz NMR spectrometer. The pulse sequence used a stimulated echo with bipolar gradients (Bruker pulse program stebpggp1s191d). Each pulsed field gradient was 2 ms and the duration of the diffusion delay Δ was limited to 10 ms to prevent chemical exchange from equilibrating the peak intensities. The spectra plotted with solid and dashed lines were recorded with weak (0.5 Gauss/cm) and strong (50 Gauss/cm) gradients, the latter resulting in about 11-fold signal attenuation. The loss in sensitivity was compensated by recording the attenuated spectrum with 8192 instead of 1024 scans. After scaling the peaks D1 and D2 for closest superimposition, the signals of the free ligand (F1 and F2) and the 1:1 complex (M1 and M2) are smaller with strong than with weak gradients, indicating more rapid diffusion, as expected if D1 and D2 correspond to the 2:1 ligand-to-metal complex. The signal labelled with a star showed no exchange cross-peaks (Fig. S17) and is unassigned. A corresponding diffusion experiment conducted with GB1 Q32C tagged with DCP-(L-Cys)2 and titrated with 0.6 equivalents TmCl3 yielded no measurable difference in peak attenuation between diamagnetic and paramagnetic signals.
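
The link between peak attenuation and diffusion can be made explicit with the Stejskal-Tanner relation, I = I0 exp(-D b), where b depends only on the gradient strength and timing and is therefore the same for all signals in one spectrum. The Python sketch below is a minimal illustration under that mono-exponential assumption and neglects chemical exchange during the diffusion delay. The 15-fold attenuation is a hypothetical value chosen for illustration, and the function name is not from the original work.

```python
import math

def relative_diffusion(fold_attenuation, fold_attenuation_ref):
    """Ratio D / D_ref of two species measured with identical gradient settings.

    With I = I0 * exp(-D * b) and a common b, ln(fold attenuation) is
    proportional to D, so the ratio of logarithms gives the ratio of
    diffusion coefficients without knowledge of b.
    """
    if fold_attenuation <= 1.0 or fold_attenuation_ref <= 1.0:
        raise ValueError("fold attenuations must be greater than 1")
    return math.log(fold_attenuation) / math.log(fold_attenuation_ref)

# Reference: about 11-fold attenuation (caption value).
# Hypothetical faster-diffusing signal: 15-fold attenuation.
ratio = relative_diffusion(15.0, 11.0)
print(f"D / D_ref = {ratio:.2f}")
```

A signal that is attenuated more strongly than the reference gives a ratio above 1, which corresponds to the qualitative argument made in the caption.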

Figure S17. EXSY spectrum of DCP-(L-Cys)2 in the presence of YCl3. The spectrum was recorded using the sample of Fig. S16. The 1D NMR spectrum (see Fig. 6 of the main text) is shown at the top. The spectrum was recorded at 800 MHz with a mixing time of 25 ms, t1max = 40 ms, t2max = 160 ms. Negative contour levels associated with zero-quantum cross-peaks are plotted with dashed lines. Dividing the cross-peak intensities by the diagonal intensities and the mixing time yields an exchange rate between free and bound metal ion of about 10 s-1.
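
The rate estimate described in the caption amounts to the initial-rate approximation k ≈ (Icross / Idiag) / τm. The Python sketch below is a minimal illustration. The intensity ratio of 0.25 is a hypothetical value chosen so that the result matches the quoted order of magnitude, and the function name is not from the original work.

```python
def exchange_rate(cross_peak, diagonal_peak, mixing_time_s):
    """Initial-rate estimate of the exchange rate in s^-1.

    k ~ (I_cross / I_diag) / tau_m, a first-order approximation that
    neglects relaxation and back-transfer during the mixing time.
    """
    if mixing_time_s <= 0 or diagonal_peak <= 0:
        raise ValueError("mixing time and diagonal intensity must be positive")
    return (cross_peak / diagonal_peak) / mixing_time_s

# Mixing time of 25 ms from the caption; intensities are hypothetical.
k_ex = exchange_rate(cross_peak=0.25, diagonal_peak=1.0, mixing_time_s=0.025)
print(f"k_ex = {k_ex:.0f} s^-1")
```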

Table S1. PCSs of backbone amide protons generated with TbCl3 and TmCl3 in GB1 Q32C with DCP-(Cys)2 tags of different chirality.
Column headings: Residue, PCSexp/ppm (four column pairs).

Table S2. PCSs of backbone amide protons generated with TbCl3 and TmCl3 in GB1 Q32C with DCP tags reacted with penicillamines of different chirality.
Column headings: Residue, PCSexp/ppm (four column pairs).

Table S3. 1DHN RDCs of backbone amides of GB1 Q32C with different DCP tags loaded with