Development of Protein Based Bioremediation and Drugs for Heavy Metal Toxicity.

Figure 13: Current state of the structural biology of the bacterial mercury detoxification system. The three-dimensional structures of mercuric reductase, MerA (Schiering et al, 1991), and MerP (Steele and Opella, 1997) are now known. This proposal is concerned with completion of the structure determination of MerT, the membrane transport protein, and a naturally occurring variant, MerF, which are responsible for transferring Hg(II) from MerP to the cell cytoplasm, the location of MerA.


The research supported by DE-FG02-97ER62423 focused on the structural biology of proteins that bind and transport heavy metals. In combination with the ongoing structural studies on the MerR repressor (Ansari et al, 1992), the structures of MerP, MerF, and MerT provide considerable insight into the chemistry and structural biology of the bacterial mercury detoxification system. The structural details of these proteins can provide insights into the entire detection, binding, and transport process. The ability to pose specific questions about chemical interactions and structural changes, made possible by the initial structural studies in this field, itself represents a major advance in understanding these systems. The potential for using these proteins as agents for bioremediation is already being explored, including through their expression in plants for removal of heavy metals from soils (Rugh et al, 1996, 1998).
The signature CXXC amino acid sequence is found in a wide variety of metal-binding proteins, including rubredoxins, ferredoxins, and metallothioneins (Stillman, 1992). Further, it is part of the highly conserved GMTCXXC metal binding motif that occurs in many heavy metal binding and transport proteins, notably: MerP and MerA of the bacterial mercury detoxification system; HAH1, the human copper chaperone; mbd1-6 and wbd1-6, the six metal binding domains of the human copper-transporting ATPases associated with Menkes (Vulpe et al, 1993) and Wilson's (Bull et al, 1993) diseases, respectively; CadA, a cadmium-transporting ATPase (Nucifora et al, 1989); Atx1, a yeast copper-transporting protein, and its analog Ccc2 (Dancis et al, 1994); CopA, a bacterial copper transporter (Solioz and Vulpe, 1996); and superoxide dismutase chaperone, a yeast copper and zinc transporting protein.
These proteins adopt a compact βαββαβ fold and belong to the 'ferredoxin-like' structural family, which includes acylphosphatases, ferredoxins, and small DNA and RNA binding domains (Hubbard et al, 1997). Since the primary sequences and tertiary structures of all of these proteins are nearly identical, it is likely that subtle differences in the three-dimensional structures of the metal-binding sites are responsible for the substantial variations in affinities and selectivities that are observed.

NMR Structural Studies of MerP and homologous domains
We determined the structure of MerP, the prototypical metal binding protein with the characteristic GMTCXXC sequence in its metal binding site. The results have been published (Steele and Opella, 1997). MerP is a compact globular protein with 72 residues in a βαββαβ fold. As shown in Figure 3, the two α-helices lie on top of a four-strand antiparallel β-sheet. Significantly, the structural and dynamic differences between the reduced and mercury-bound forms of MerP are found only in residues proximate to the metal binding site. Sequences representative of the proteins and peptides homologous to MerP investigated in this research are compared in Figure 1. Subsequently, we extended the structural studies to a second "MerP-like" domain. The two-dimensional ¹H/¹⁵N heteronuclear multiple quantum correlation (HMQC) spectrum of a polypeptide corresponding to the first metal binding domain of the protein associated with Menkes disease (MNK1) is shown in Figure 2. This polypeptide corresponds to the circle next to the N-terminus of the ATPase shown in Figure 12. This spectrum displays essentially complete resolution among all amide resonances. The line widths in both dimensions are narrow, and there is excellent chemical shift dispersion because of the substantial amount of β-sheet in the protein. Since the resonances from nearly all backbone sites have been resolved and assigned for both the reduced and metal-bound forms of the protein, it is possible to identify those residues affected by metal binding by inspection of the chemical shift changes (DeSilva and Opella, 2001).
The results of several double- and triple-resonance three-dimensional experiments provided the resonance assignments noted in the spectrum in Figure 2. The triple-resonance HNCA, HNCOCA, and CBCACONH experiments assigned essentially all of the backbone resonances, and gave many chemical shift measurements. The Cβ chemical shifts were particularly valuable in identifying the most probable type of amino acid associated with the Cα and Cβ resonances. The summaries of the short- and medium-range NOEs and other measurements clearly show that the MNK1 domain (DeSilva and Opella, 2001), like MerP (Steele and Opella, 1997) and MNK4 (Gitschier et al, 1998), has two α-helices and four β-strands.
The average structure of MNK1 is compared to that of MerP in Figure 3. We are in the process of adding orientational constraints from residual dipolar couplings measured from weakly aligned samples; this should be particularly useful for increasing the resolution in the metal binding loop, where the NOEs are relatively sparse. The most notable features of the structure of this class of proteins are that the two helices lie above the plane of a four-strand β-sheet, which has a slight left-handed twist common to many antiparallel β-sheets. This twist is more pronounced in the last short strand. A β bulge breaks the sheet at the beginning of the second strand. The helices are roughly parallel and are oriented at an angle of about 15° relative to the axis of the β-sheet. The interhelical angle is about 50°. This follows the twist of the β-sheet as the helices are packed against the sheet. The long loop connecting B1 and the first helix contains the metal binding site with the GMTCXXC sequence. In the mercury-bound form the two Cys residues lie above the loop toward the surface of the protein.

Figure 3: The three-dimensional structures of MerP-like proteins that we determined in solution by NMR spectroscopy. A. MNK1 (DeSilva and Opella, 2001). B. MerP (Steele and Opella, 1997).
As mentioned above, only a limited number of resonances undergo substantial shifts when mercury is added to either of these proteins. It is notable that the residues associated with these resonances are near the metal binding loop in the three-dimensional structure of the proteins, although not necessarily in the sequence. The largest changes occur in the loop connecting strand B1 and helix H1, and part of the way into helix H1. Chemical shift changes also occur in the region between strands B2 and B3, directly below the metal binding loop. Smaller, but significant, shifts are observed for the residues connecting helix H2 and strand B4, which are near the binding site. The essential conclusion to be drawn from the two protein structures we determined (Steele and Opella, 1997; DeSilva and Opella, 2001) is that the tertiary structure provides the framework for the metal binding loop with the GMTCXXC sequence. This finding led to the experimental studies on peptides containing only those residues that constitute the metal binding loop (Veglia et al, 2000).

Peptides Corresponding to the Metal Binding Loop of MerP-like Proteins
We synthesized the 18-residue peptide TLAVPGMTCAACPITVKK (Veglia et al, 2000), which corresponds to residues 6 through 23 that constitute the metal binding loop of MerP, characterized its metal binding properties, and determined its three-dimensional structure. As expected, in the absence of metal ions, the 18-residue peptide does not appear to have a preferred conformation in solution; its CD spectrum has substantial negative intensity at 200 nm, characteristic of an unstructured polypeptide, and its two-dimensional ¹H/¹H NOESY NMR spectrum contains only a few weak cross-peaks. However, dramatic spectral changes occur upon addition of mercury, indicating that the peptide folds into a highly stable, unique conformation when it binds a metal. The negative intensity in the CD spectrum is reduced; many ¹H NMR resonances shift and become noticeably broader; and there is a substantial increase in the number and intensity of cross-peaks in the two-dimensional ¹H/¹H NOESY NMR spectrum. These NOEs provide the distance constraints used for determination of the three-dimensional structure of the peptide in solution.
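The correspondence between peptide numbering and MerP numbering can be checked with a short script. The following is only an illustrative sketch: the function name `locate_motif` and the constant `MERP_START` are hypothetical, and the only inputs taken from the text are the peptide sequence and the statement that the peptide begins at residue 6 of MerP.

```python
import re

PEPTIDE = "TLAVPGMTCAACPITVKK"  # 18-residue peptide quoted in the text
MERP_START = 6                   # peptide residue 1 corresponds to MerP residue 6

# GMTCXXC: Gly-Met-Thr-Cys, any two residues, then Cys
MOTIF = re.compile(r"GMTC..C")


def locate_motif(sequence, first_protein_residue):
    """Return (peptide_range, protein_range, matched_sequence), or None.

    Ranges are 1-based and inclusive. The protein range is obtained by
    shifting the peptide numbering so that peptide residue 1 lines up
    with `first_protein_residue`.
    """
    match = MOTIF.search(sequence)
    if match is None:
        return None
    pep_start, pep_end = match.start() + 1, match.end()
    offset = first_protein_residue - 1
    return (
        (pep_start, pep_end),
        (pep_start + offset, pep_end + offset),
        match.group(),
    )


def cysteine_positions(sequence, first_protein_residue):
    """Protein-numbered positions of every Cys in the sequence."""
    offset = first_protein_residue - 1
    return [i + 1 + offset for i, aa in enumerate(sequence) if aa == "C"]


print(locate_motif(PEPTIDE, MERP_START))
print(cysteine_positions(PEPTIDE, MERP_START))
```

Under these assumptions the motif spans residues 6 through 12 of the peptide, which map onto residues 11 through 17 of MerP, with the two Cys residues at positions 14 and 17.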

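Figure 4: A. Average structure of residues 6-12 of the 18-residue peptide. B. Structure of the same sequence (residues 11-17) in MerP.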
The average structure of residues 6-12 of the peptide (corresponding to residues 11-17 of MerP) is shown in Figure 4A. The structure of the same sequence in MerP is shown in Figure 4B for comparison. The backbone structure of the peptide is relatively well defined for those residues shown in Figure 4A, with an RMSD to the mean of 0.41 Å for their alpha carbons. The N-terminal five residues of the peptide are unstructured in solution, although the corresponding residues (6-10) are highly structured in MerP since they are sandwiched between the third and fourth β-strands. The three-dimensional structures of the crucial GMTCAAC residues bound to Hg(II) are remarkably similar in the 18-residue peptide and the 72-residue protein. Notably, in both structures the two Cys side chains point towards the interior of the loop and coordinate the metal ion. In addition, the methyl group of Ala16 points toward the metal and the methyl group of Ala15 is located on the outside. As in MerP, the side chain of Met12 is not close enough to be involved in metal binding. This is also the case for the proteins MNK4 (Gitschier et al, 1998) and Atx1 (Rosenzweig et al, 1999), whose metal binding loops have sequences virtually identical to that of MerP. In the structures in Figures 4A and 4B, Thr13 points away from the metal, ruling out possible interactions with its side chain oxygen. In contrast, the Thr14 and Thr13 side chains of Atx1 and MNK4 are significantly closer to the metal binding site, approaching a distance consistent with secondary bonding interactions that may affect discrimination among metal ions. The metal binding loop of the peptide appears less extended than in MerP, possibly because the conserved hydrophobic residues proximate to the metal binding loop in the protein are absent in the peptide.
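For readers unfamiliar with the "RMSD to the mean" statistic quoted above, the sketch below shows one common way to compute it for alpha carbons. It is an illustration only: the text does not state the exact protocol used, so the sketch assumes the models have already been superimposed and that the reported value is the average over models of each model's RMSD to the mean coordinates. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd_to_mean(ensemble):
    """Average RMSD of each model to the mean structure.

    `ensemble` is a list of models; each model is a list of (x, y, z)
    alpha-carbon coordinates in the same atom order. The models are
    assumed to be superimposed already (no fitting is done here).
    """
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])

    # Mean position of every atom across the ensemble.
    mean = [
        tuple(sum(model[i][k] for model in ensemble) / n_models for k in range(3))
        for i in range(n_atoms)
    ]

    # RMSD of each model to that mean structure.
    per_model = []
    for model in ensemble:
        squared = sum(
            (model[i][k] - mean[i][k]) ** 2
            for i in range(n_atoms)
            for k in range(3)
        )
        per_model.append(math.sqrt(squared / n_atoms))

    return sum(per_model) / n_models


# Hypothetical two-model, two-atom ensemble, used only to exercise the function.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
]
print(rmsd_to_mean(toy))
```

A pooled definition (one RMSD over all models and atoms together) gives a slightly different number in general, which is why the averaging convention is stated as an assumption here.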
¹⁹⁹Hg NMR spectroscopy is ideal for elucidating the coordination geometry of peptides and proteins that bind Hg(II), since ¹⁹⁹Hg is a spin S = 1/2 nucleus with a chemical shift that is highly sensitive to its ligands and has an enormous range (>3000 ppm). Figure 5 contains directly detected ¹⁹⁹Hg NMR spectra of Hg(II) bound to several of the proteins being investigated. The chemical shift of Hg(II) bound to MerP in aqueous solution is -816 ppm, well within the range observed for linear bicoordinate aliphatic thiolate compounds (Kubicki et al, 1981). In contrast, the ¹⁹⁹Hg chemical shift observed for Hg(II) bound to MerR, which is tricoordinate (3 Cys residues), is -106 ppm (Utschig et al, 1995), and this result correlates well with those for structurally characterized tricoordinate aliphatic thiolate compounds. Thus, the resonances in Figure 5 are from bicoordinate mercury bound to MerP, MNK1, MerT, and the 18-residue peptide corresponding to the MerP metal binding loop. Remarkably, the GMTCAAC sequence retains metal-binding specificity in an 18-residue linear peptide. There are only 1-2 orders of magnitude difference in the metal binding affinities between the peptide and the protein from which the sequence was derived. This demonstrates that peptides with high affinities and specificities for metal ions can be designed based on the structures of the native metal binding proteins in solution. Details of the influence of more distant residues will be described in the proposed research, which will add to the sophistication of the design process. Nonetheless, based on our preliminary results, it is likely that only a limited number of residues are needed for a selective chelating agent. This bodes well for the design of reagents capable of removing heavy metals from the environment as well as from humans exposed to these toxins.
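To put the 1-2 orders of magnitude affinity difference on an energy scale, the standard relation ΔΔG = RT ln(K_protein/K_peptide) can be evaluated directly. The sketch below is illustrative only; the temperature of 298.15 K is an assumption, since the text does not state the temperature at which the affinities were measured, and the function name is hypothetical.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def ddg_kj_per_mol(orders_of_magnitude, temperature_k=298.15):
    """Free energy difference for a given ratio of binding constants.

    ddG = R * T * ln(10 ** orders_of_magnitude), returned in kJ/mol.
    The default temperature is an assumed value, not taken from the text.
    """
    return R * temperature_k * math.log(10.0) * orders_of_magnitude / 1000.0


for orders in (1, 2):
    print(orders, round(ddg_kj_per_mol(orders), 2))
```

At the assumed temperature each order of magnitude corresponds to roughly 5.7 kJ/mol (about 1.4 kcal/mol), so the peptide would retain all but about 6 to 11 kJ/mol of the binding free energy of the intact protein.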

The structures of the membrane proteins MerF and MerT responsible for transporting Hg(II) into the cytoplasm.
We applied the general approach that we are developing for determining the structures of membrane proteins to MerF and MerT (Opella et al, 1987; Opella, 1997). Micelle samples were used for solution NMR spectroscopy, and bilayer and bicelle samples for solid-state NMR spectroscopy. First, we obtained a general idea of the protein organization from the hydropathy plot and molecular dynamics simulations performed according to the simple protocol we developed (Tobias et al, 1993). Second, we expressed and purified the polypeptide as a fusion protein in E. coli. We monitored the quality of the material with two-dimensional HSQC spectra in micelles. Third, we prepared samples for solid-state NMR experiments in bicelles and bilayers. Fourth, we resolved, assigned, and made spectroscopic measurements for structure determination by solution NMR in micelles. We also made a full set of relaxation measurements for analysis of molecular dynamics. Fifth, we performed multidimensional solid-state NMR experiments on the oriented samples to resolve and assign the resonances, and to measure the orientationally dependent spectral parameters. We then interpreted these results in terms of the protein structure.
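The hydropathy plot mentioned in the first step can be sketched as a sliding-window average over a per-residue hydrophobicity scale. The text does not say which scale or window was used, so the Kyte-Doolittle scale, the 19-residue window, and the 1.6 threshold below are assumptions taken from common practice; the demonstration sequence is hypothetical and is not MerF or MerT.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every full window along the sequence.

    Entry i of the result covers residues i+1 through i+window (1-based).
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_windows(profile, threshold=1.6):
    """1-based start positions of windows whose mean exceeds the threshold."""
    return [i + 1 for i, value in enumerate(profile) if value > threshold]


# Hypothetical sequence: a 19-residue leucine stretch flanked by charged residues.
demo = "KKKK" + "L" * 19 + "DDDD"
profile = hydropathy_profile(demo)
best = max(range(len(profile)), key=profile.__getitem__)
print("most hydrophobic window starts at residue", best + 1)
print("candidate window starts:", candidate_windows(profile))
```

Such a profile only suggests where trans-membrane segments might lie; as the text notes, the working models it produces are refined against the NMR data.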

1.) Express, characterize, and prepare samples of isotopically labeled MerF and MerT in lipids.
Many isolates of the bacterial mercury detoxification system have been examined (Osborn et al, 1997; Hobman and Brown, 1997). Morby et al (1995) found only conservative differences among MerT proteins from Gram-negative bacteria. However, MerT-like proteins from Gram-positive bacteria, including MerF, appear to have one fewer trans-membrane helix and significant differences in the second set of Cys residues. We are particularly interested in the protein referred to as MerF from the plasmid pMJ100 (Hobman et al, 1994). There are strong homologies among the N-terminal sequences of all of the proteins, including the first pair of Cys residues. The locations of the Cys residues are believed to be crucial for all aspects of metal binding in these proteins (Morby et al, 1995; Hobman and Brown, 1996; Opella et al, 1997).
We have prepared milligram quantities of isotopically labeled MerF and MerT proteins, including uniform ¹⁵N; uniform ¹³C and ¹⁵N; uniform ²H and ¹⁵N; uniform ²H, ¹³C, and ¹⁵N; and selective ¹⁵N labeling. The successful preparation of these samples is a direct result of intensive efforts we have devoted to the molecular biology and preparative biochemistry of small membrane proteins over the past five years. As in other cases, we found that it is essential to express these hydrophobic membrane proteins as fusion proteins, relying on the properties of the fusion partner to keep them away from the cell membrane. All attempts at direct expression failed, even though these are bacterial proteins. The plasmids constructed for over-expression of MerF and MerT in E. coli are diagrammed in Figure 6. Because of the need to prepare many different isotopically labeled samples by growing bacteria on minimal media, our approach to the overexpression of MerT is quite different from that of Hobman and Brown (1996). We prepared a fusion of maltose binding protein and MerT using the pMal-c2 vector (New England Biolabs). The gene for MerT was amplified by PCR from the plasmid pHN6, which was kindly provided by N. Hamlett (Harvey Mudd College) and contains the entire mer operon. The resulting DNA fragment was purified from an agarose gel by electroelution and ethanol precipitation. Ligations were performed with T4 DNA ligase. The recombinant plasmid was transformed into competent E. coli DH5α cells. The DNA sequence was confirmed by sequencing, then supercoiled plasmid was isolated and retransformed into E. coli BL21 cells, which grow well in the minimal media used for isotopic labeling. The cells were grown at 37 °C until the OD600 reached 0.8, then expression of the fusion protein was induced by the addition of IPTG. Cell growth was monitored for 3 hours until the OD600 leveled off around 2.0 for rich media and 1.6 for minimal media.
The cells were disrupted using a French press and the lysate cleared by ultracentrifugation. The supernatant was diluted and applied to an amylose resin column. The column was washed and the fusion protein eluted with buffer made 10 mM in maltose and 0.1% in SDS. The cleavage reaction with factor Xa took 1-3 hours. The polypeptides were then separated using size-exclusion gel chromatography with Sephacryl S-200. The successful purification of MerT protein is demonstrated with the analytical gel shown in Figure 18. Mass spectrometry, amino acid analysis, and limited N-terminal sequencing demonstrated that the single band corresponds to the correct protein.
We constructed the gene for MerF by synthesizing four overlapping single-stranded oligonucleotides followed by amplification with PCR. The PCR product was digested using the appropriate restriction enzymes, then ligated using T4 DNA ligase into a similarly digested pHLV-ML vector. The recombinant plasmid was transformed into competent E. coli DH5α cells. After the sequence was confirmed, the correct plasmid was isolated and transformed into E. coli BL21 cells for growth on minimal media. The cultures were induced for 4 hours with IPTG after the cells were grown to an OD550 of 0.8. MerF protein was expressed with an attached His-Tag sequence (with 9 histidines) and a leader peptide that forms inclusion bodies. The cell lysate was cleared by ultracentrifugation; the inclusion bodies were separated from the cells, then resuspended in 6 M guanidine hydrochloride and loaded onto a nickel affinity column. The column was washed with 40 mM imidazole buffer, and then the fusion protein was eluted with 500 mM imidazole buffer. After dialysis against water and lyophilization, the fusion protein was redissolved in formic acid and cleaved using CNBr. After dialysis the cleaved fragments were redissolved in SDS buffer and separated using size-exclusion gel chromatography with Sephacryl S-200 resin on a Pharmacia FPLC system. MerF protein was then concentrated in the presence of 40 mM β-mercaptoethanol and 1 mM PMSF; protein purity is demonstrated with the gel in Figure 7 and mass spectrometry.

2.) NMR Structural Studies of the Membrane Proteins MerT and MerF
The amino acid sequences of MerT from Tn501 and MerF from pMJ100 are compared in Figure 8.

The development of a general NMR approach to structure determination of membrane proteins is a major part of the research program. This effort is strongly supported through the Resource for Solid-state NMR of Proteins. The proposed research is concerned with applying these emerging NMR methods to MerF and MerT. We are determining the structures of several membrane proteins in parallel by solution NMR spectroscopy of lipid micelle samples and solid-state NMR spectroscopy of lipid bilayer samples (Opella, 1997). We have found that membrane proteins with fewer than about 150 residues can be studied by solution NMR methods in micelle samples. They are somewhat difficult to study because of the long correlation times of the protein-lipid complexes; however, with extensive use of triple labeling with ²H, ¹³C, and ¹⁵N and a high field spectrometer on highly optimized lipid micelle samples, nearly complete resolution and assignments can be achieved. Two-dimensional ¹H/¹⁵N HSQC spectra of uniformly ¹⁵N labeled MerF and MerT in lipid micelles are shown in Figure 9. These spectra are of very high quality for membrane proteins in micelles and show essentially complete resolution of all amide resonances, in spite of the relatively broad line widths and limited chemical shift dispersion in these highly helical proteins. These spectra enable relaxation measurements and three-dimensional experiments. Extensive efforts were required during the initial funding period to optimize the expression and purification of the proteins and other sample conditions so that these spectra could be obtained reproducibly. These results are also important because they demonstrate the integrity of the polypeptides before they are placed in phospholipids for solid-state NMR experiments. Solution NMR spectroscopy is considerably more difficult for membrane proteins in lipid micelles than it is for globular proteins in aqueous solution.
The use of triple-resonance three-dimensional experiments was essential to obtain the resonance assignments shown in Figure 9. Because of the problems resulting from the relatively long correlation times, ²H, ¹³C, and ¹⁵N labeled proteins and highly optimized sample conditions were required. The backbone assignments enabled the secondary structure of MerF and MerT to be determined experimentally. The parameters available for characterizing the secondary structure are shown in Figure 10. Taken together, all of the parameters are consistent in showing that MerF has three and MerT has four trans-membrane helical segments. Hydrophobic membrane proteins like MerT and MerF can be reconstituted into phospholipid bilayers and bicelles. Unoriented bilayer samples are useful in characterizing the dynamics of backbone and side-chain sites. The principal approach we are developing for determining the structures of membrane proteins relies on the spectroscopic properties of oriented lipid bilayer samples (Opella, 1997; Marassi et al, 1997; Opella et al, 1999).

A. MerF
The solid-state NMR spectra in Figure 9 demonstrate that we have been able to prepare uniformly ¹⁵N labeled samples of MerF and to orient the protein both mechanically in phospholipid bilayers between glass plates and magnetically in bicelles. The results in Figures 10 and 11 serve to characterize the general features of the mercury membrane transport proteins. The combination of sequence analysis, relevant site-directed mutagenesis results indicating the importance of the Cys residues (Morby et al, 1995), hydropathy analysis, and preliminary NMR results gives working models for MerT and MerF, shown in Figure 10. These models differ in several major features from those discussed in the literature (Hobman and Brown, 1997). However, we do not feel that it is worthwhile to examine these differences in detail now, since the main purpose of the proposed research is to determine the three-dimensional structures of these proteins and eliminate the need to rely on modeling. The correct structures and topology in the lipid bilayers will become apparent during the course of this research.

Structural Biology of the Bacterial Mercury Detoxification System
The current state of the structural biology of the bacterial mercury detoxification system is illustrated in Figure 13. The planned research is designed to complete the structures of the membrane transport proteins and to probe the metal binding sites of all of the proteins in great detail. This will enable us to describe the interactions between the protein that first binds the heavy metal on the outside of the cell and the membrane transport protein that takes it into the cell cytoplasm for reduction.