NMR studies of protein-DNA interactions

N. Jamin a,*, F. Toma b

a CEA/INSTN, 91191 Gif sur Yvette Cedex, France
b Département de Biologie, Université d'Evry, bld F. Mitterand, 91025 Evry Cedex, France

Received 1 June 2000

Progress in Nuclear Magnetic Resonance Spectroscopy 38 (2001) 83-114
0079-6565/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0079-6565(00)00024-8
www.elsevier.nl/locate/pnmrs
* Corresponding author. Tel.: +33-1-69-08-96-38; fax: +33-1-69-08-57-53. E-mail address: nadege.jamin@cea.fr (N. Jamin)

Contents

1. Introduction
2. Overview of techniques
2.1. Labeling of DNA
2.2. Chemical shift changes
2.3. Hydrogen exchange rates
2.4. Isotope editing and isotope filtering
2.5. Deuteration
2.6. Transverse relaxation-optimized spectroscopy (TROSY)
2.7. Long-range distance constraints
2.8. Dynamics
2.9. Hydration
3. Selected applications
3.1. The helix-turn-helix motif
3.1.1. Homeodomain
3.1.2. Lac repressor headpiece
3.1.3. Trp repressor
3.1.4. Ets
3.1.5. Myb
3.2. Zinc fingers
3.2.1. TFIIIA
3.2.2. ADR1
3.2.3. GATA-1
3.2.4. GAGA
3.3. Minor groove-binding architectural proteins
3.3.1. SRY
3.3.2. LEF-1
3.3.3. HMG-I(Y)
3.4. Recognition using β-sheet
3.4.1. Tn916 integrase
3.4.2. GCC-box binding domain
4. Perspectives
References

1. Introduction

Understanding, at a molecular level, the mechanisms that control genetic information and its replication, packaging and repair requires elucidating the detailed interactions between proteins and DNA. The last ten years have produced a large amount of structural information about protein-DNA complexes from both X-ray crystallography and NMR. These data reveal the complexity of the DNA recognition process. The absence of a 'recognition code' is particularly evident among the three zinc fingers of the transcription factor TFIIIA, as homologous residues in different complexes do not always contact corresponding base pairs. Direct interactions between protein side-chains and DNA bases involve not only secondary structures such as α-helix or β-sheet but also flexible loops and arms. Moreover, residues not involved in specific interactions, such as the linker residues of the three-zinc-finger domain of TFIIIA, can be as important for the protein-DNA interaction as residues making contact with DNA bases.
NMR makes its unique contribution to the understanding of protein-DNA interactions by highlighting their dynamic aspects: dynamics of disorder-to-order transitions upon DNA binding, dynamics at the protein-DNA interface, dynamics of opening and closing of base-pairs, and measurements of the lifetimes of water molecules at the protein-DNA interface.

During the last 10 years, more than 20 structures of specific protein-DNA complexes and numerous data on protein-DNA interactions have been obtained by NMR thanks to developments in protein and nucleic acid synthesis, in isotopic labeling techniques and in heteronuclear magnetic resonance spectroscopy. The first 3D NMR structures of protein-DNA complexes were obtained in 1993: the Drosophila antennapedia mutant homeodomain (Antp(C39S)) bound to a 14-mer duplex DNA containing the BS2 site [1] and the lac repressor headpiece (residues 1-56, HP56) complexed with an 11-mer operator [2].

This review describes the use of NMR to obtain information on complexes of proteins with their specific DNA targets. Most of the NMR techniques used to study protein-DNA interactions are also employed for other types of protein complexes. Therefore, for a detailed description of the NMR techniques, the reader is referred to recent reviews [3-5] or to specific papers referenced in the text.

This review is divided into three parts. The first part is an overview of the NMR techniques commonly used to get information on protein-DNA interactions.
It includes a brief description of DNA labeling techniques, the use of chemical shift or hydrogen exchange changes to find the binding site, the use of hydrogen exchange or relaxation data to get dynamics information on the binding process, the use of the main isotope filtering and editing techniques as well as transverse relaxation-optimized spectroscopy to assign the NMR signals, and newly developed techniques to deal with large complexes or to obtain long-range distance restraints. The second part comprises applications of these techniques to different protein-DNA complexes. Protein-DNA complexes are classified according to the protein recognition motif: helix-turn-helix (HTH), zinc finger, minor groove binding motif and β-sheet. Finally, the third part presents the future perspectives that can be inferred from the emerging NMR techniques.

2. Overview of techniques

Protein-nucleic acid complexes are large entities, and the availability of 13C- and 15N-labeled proteins has made the determination of their solution structures attainable. Double and triple resonance spectroscopy facilitates the resonance assignments, the measurement
of coupling constants and of relaxation parameters not accessible by proton resonance spectroscopy. Efficient methods for labeling DNA have been published only recently [6-9], opening applications of heteronuclear spectroscopy to DNA. We will briefly present the new labeling methods proposed for DNA. We will also give an overview of the NMR techniques used to extract structural information about protein-DNA complexes, including chemical shift changes, hydrogen exchange rates, isotope editing and filtering techniques, and methods for measuring protein dynamics to study the changes in protein flexibility upon binding.

2.1. Labeling of DNA

Large quantities of labeled DNA fragments for NMR studies can be synthesized by chemical or enzymatic methods. The chemical synthesis of DNA oligomers involves the solid-phase phosphoramidite method using isotopically labeled monomer units [6]. Labeled ribonucleotides are prepared by isolating bacterial RNA from cells grown in labeled medium, hydrolyzing the RNA and separating the ribonucleotides [7]. They are then chemically converted to deoxynucleotides and derivatized into nucleoside 3'-phosphoramidites, which are used for preparing oligonucleotides on a DNA synthesizer. Using this method, a 14-base-pair DNA duplex, fully 13C,15N doubly labeled as well as partially labeled at those nucleotides that form the protein-DNA interface, has been prepared to study its interaction with the antennapedia homeodomain [8].

The general procedure for the production of uniformly 13C,15N-labeled DNA by enzymatic synthesis is described in Fig. 1. Zimmer and Crothers have shown that milligram quantities of material can be synthesized using this procedure [9]. Their method comprises the production of uniformly 13C,15N-labeled deoxynucleotides from enzymatic hydrolysis of the DNA of bacteria grown on 99% 13CH3OH and >98% 15NH4Cl as sole carbon and nitrogen sources.
The labeled deoxynucleotides are then converted enzymatically to the triphosphates and used in a DNA polymerization reaction that utilizes an oligonucleotide hairpin primer-template containing a ribonucleotide at the 3' terminus. Alkaline hydrolysis of the ribonucleotide linkage between the labeled DNA and the unlabeled primer-template, followed by purification, yields the labeled DNA. More recently, variations of this method have been proposed by two other groups [10,11]. Masse and coworkers [10] proposed three modifications. First, the mixed dNTPs are separated from one another so that the four dNTPs can be used in the ratio corresponding to the sequence of the deoxyoligonucleotide. Second, Taq polymerase is used instead of the Klenow fragment of DNA polymerase I in the polymerization step. Third, an additional step is used to remove non-templated addition at the 3' end. Louis and coworkers [11] used the same molecule for the primer and template in the bidirectional polymerase chain reaction, thus obtaining an exponential growth in the length of the double strand that contains two repeats of the desired DNA sequence.

Fig. 1. General procedure for the enzymatic synthesis of 13C,15N-labeled DNA. [Flowchart: culture of cells with 13C carbon source and 15N nitrogen source; cell lysis and phenol extraction (DNA and RNA separated from proteins); nucleic acid hydrolysis to 5'-monophosphate nucleotides; nucleotide separation (dNMPs, rNMPs); dNTPs; DNA oligonucleotide.]

An additional method has been presented by Louis and coworkers [11]. It comprises the growth of a suitable plasmid containing multiple copies of the
desired DNA sequence in E. coli with 15N and 13C nutrients. These methods have been applied to the synthesis of fully or partially 13C,15N- or 15N-labeled double-stranded oligonucleotides of 10-21 base pairs. A uniformly 13C,15N doubly labeled 32-base DNA oligonucleotide that folds to form an intramolecular quadruplex, as well as a 12-base oligonucleotide that dimerizes and folds to form a quadruplex, have also been produced for NMR studies.

Both methods require a high level of expertise. Site-specific labeling is more easily attained with the chemical method, which is therefore the method of choice for the synthesis of site-specifically labeled DNA.

2.2. Chemical shift changes

Interactions of proteins with DNA fragments containing specific binding sites are tight, i.e. the dissociation constants Kd are less than 10^-8 M, and detailed information can be obtained on the complexes because of the slow exchange regime between free and bound states (lifetimes greater than 1 s) on the chemical shift time scale. The rate of exchange is much less than the difference in chemical shift between the two states and, at a mole ratio less than the stoichiometric ratio, two sets of resonances are observed, corresponding to the free and bound states. Therefore, the resonances of the complex have to be assigned using NMR techniques employed for large molecules and/or edited/filtered techniques.

Fig. 2 shows the imino region of the 1H spectra obtained upon addition of different amounts of a solution of the R2R3 DNA binding domain of c-Myb to a solution of mim12 oligonucleotide [12]. On addition of the protein, new resonance lines corresponding to the bound mim12 dodecamer appear. Some of these lines are split into two signals, which indicates the simultaneous presence of two forms. The lifetimes of these two forms are longer than the inverse of the frequency difference between the free and bound state resonances.
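The slow-exchange condition described above can be illustrated with a short numerical sketch. The Python example below is illustrative only and is not taken from the original work: it assumes a simple two-state binding model, a diffusion-limited association rate constant of 1e8 M^-1 s^-1 (an assumed value), and it takes the dissociation rate alone as the exchange rate. The function names and the factor-of-ten margin separating the regimes are hypothetical choices.

```python
import math


def bound_lifetime(kd_molar, k_on=1e8):
    """Lifetime (s) of the bound state, 1/k_off, using k_off = Kd * k_on.

    k_on (M^-1 s^-1) is an ASSUMED diffusion-limited value, not a
    number given in the review.
    """
    return 1.0 / (kd_molar * k_on)


def exchange_regime(k_ex, delta_nu_hz, margin=10.0):
    """Classify exchange on the chemical shift time scale.

    k_ex: exchange rate (s^-1); delta_nu_hz: chemical shift difference
    between the two states (Hz). The margin is an arbitrary,
    illustrative threshold, not a rigorous boundary.
    """
    delta_omega = 2.0 * math.pi * delta_nu_hz  # rad/s
    if k_ex * margin < delta_omega:
        return "slow"          # separate free and bound resonances
    if k_ex > delta_omega * margin:
        return "fast"          # one population-averaged resonance
    return "intermediate"      # broadened or vanishing peaks


# Kd = 10^-8 M, the upper bound quoted for specific complexes
lifetime = bound_lifetime(1e-8)
# 60 Hz corresponds to 0.1 ppm for 1H at 600 MHz
print(lifetime, exchange_regime(1.0 / lifetime, 60.0))
```

Under these assumptions, a dissociation constant of 10^-8 M corresponds to a bound-state lifetime of about 1 s, which is far slower than a shift difference of a few tens of hertz, consistent with the observation of separate free and bound resonances.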
Fig. 2. Imino region of the 1H-NMR 600 MHz spectra obtained upon addition of different amounts of a solution of the R2R3 DNA binding domain of c-Myb to a solution of mim12 oligonucleotide at 20 °C.

Chemical shifts are very sensitive probes of the local environment of a nucleus, but unfortunately it is not possible to predict their values from the conformation of the complex or, conversely, to deduce the conformation from their values. Nevertheless, they are useful parameters for gaining insight into the parts of the molecules influenced by the interaction. Schmiedeskamp and coworkers [13] have shown by analysis of 1H and 13Cα chemical shifts that little change in the structure of the zinc-finger domain from the yeast transcription factor ADR1 occurs upon binding to a 14-mer DNA containing the UAS half site. A correlation between the protein-DNA interface mapped by chemical shift changes and that mapped by mutagenesis experiments was found.

However, the identification of the DNA binding site using DNA-induced chemical shift changes should be done with care. This approach is not feasible for the numerous protein-DNA complexes in which the protein undergoes conformational transitions and changes in dynamics upon binding that affect the chemical shifts. This has recently been demonstrated by Foster and coworkers [14]. These authors analyzed the correlation between the chemical shift changes upon
binding of the three amino-terminal zinc fingers of X. laevis TFIIIA (zf1-3) to a 15-mer DNA and the intermolecular contacts known from the high-resolution structure of the complex. They found that the chemical shift changes for protein 1H, 15N and 13C resonances upon DNA binding are not well correlated with the DNA contacts observed in the solution structure of the complex. In fact, the protein resonances are affected not only by DNA binding but also by changes in the dynamics and conformation of the protein upon binding. The DNA base protons were found to be good markers of the DNA binding sites because the conformation of the DNA is not significantly distorted upon binding.

In the case of fast exchange between free and bound states, the structure of the complex cannot be obtained easily. Titration experiments monitor the variation of chemical shifts upon addition of DNA, and binding constants (in the millimolar range) can be estimated from the analysis of the titration curves [15]. The chemical shifts of the bound protein resonances are directly obtained from these titration experiments. As in the case of slow exchange, the variation of chemical shifts can be used to map the binding surface.

Fig. 3. Amide proton exchange rate (s^-1) versus residue number for the wild-type and AV77 apo- and holorepressors, pH 7.6 at 45 °C. (Fig. 1 from Ref. [19]). Reprinted with the permission of O. Jardetzky and of Cambridge University Press (© 1996).

For intermediate exchange between the free and bound states or between different bound conformations,
broadening or disappearance of peaks occurs, preventing a detailed structural analysis.

2.3. Hydrogen exchange rates

As with chemical shifts, DNA-induced changes in hydrogen exchange rates can be used with care to map the DNA binding site by comparing amide proton exchange rates of the free protein with those of the protein-DNA complex [16,17].

Quantitative analysis of amide proton exchange rates provides insights into the stability and dynamics of the protein. Mau and coworkers [18] compared the amide proton exchange rates of three forms of the GAL4 transcriptional activator: the native Zn-containing protein, the Cd-substituted protein and a Zn-Gal4/DNA complex. They showed that the Cd-substituted GAL4 is destabilized relative to the native protein, as inferred from the slower exchange rates of the amide protons of the native protein compared with the Cd analogue. They observed a global retardation of amide proton exchange upon binding to DNA, indicating that internal fluctuations of the DNA-recognition module are significantly reduced by the presence of DNA.

Gryk and coworkers [19] ascribed the enhanced repressor activity at the trp operator in vivo of the Val77 mutant of the Trp repressor to an increase in the stability of the flexible DNA binding domain of the Val77 mutant, as deduced from the study of the amide proton exchange rates shown in Fig. 3.

The measurement of the imino proton exchange of the DNA provides insights into the dynamic behavior of the base-pairs, namely their opening and closing rates. Dhavan and coworkers have analyzed the imino proton exchange in the Integration Host Factor (IHF)-DNA complex [16]. This E. coli DNA binding protein is a minor groove binder and bends the DNA by more than 140° at each site. They observed a large overall reduction in exchange rates for the DNA in the complex.
In the complex, groups of adjacent base-pairs exchange at the same rate and appear to close more slowly than the rate of imino proton exchange with bulk water, since their exchange rate is independent of catalyst concentration. Thus fragments of the DNA as large as 6 base-pairs open in a cooperative manner and remain open much longer than found for free DNA. Binding to IHF enhanced the probability of opening the DNA helix. This may play a role in processes that involve IHF and require opening of the double helix.

2.4. Isotope editing and isotope filtering

The general approach used to study molecular complexes involves uniform labeling of one component with 15N and/or 13C while the other component is unlabeled. Isotope-edited or isotope-filtered experiments are then selected to obtain information on one component of the system. Isotope-edited experiments detect proton signals attached to 13C/15N nuclei, while isotope-filtered experiments detect proton signals attached to 12C/14N nuclei and remove 13C/15N-attached proton signals [20-26].

In the case of protein-DNA complexes, the protein is generally uniformly doubly 13C,15N-labeled and the DNA is unlabeled. Protein signals are assigned using 3D double and triple resonance experiments. For the DNA, 12C-filtered NOESY and HOHAHA experiments are implemented [22,25,26]. The intermolecular NOEs are measured by 3D 13C F1-filtered, F3-edited NOESY-HSQC experiments [23,24].

The assignment of DNA signals is often difficult due to signal overlap, especially for the deoxyribose protons. Thus, labeled DNA helps to assign all the DNA resonances and to obtain more detailed conformational features for the DNA, as well as to define more precisely, in some cases, the interface between the protein and the DNA. The first example making use of 13C,15N-labeled DNA was published by Masse and coworkers (Fig. 4) [27].
These authors studied the non-specific interaction between the High Mobility Group (HMG)-DNA binding domain of NHP6A and a 15 base pair DNA. Three samples of 13C,15N-labeled DNA were prepared: one with one strand labeled, one with the other strand labeled and one with both strands labeled. The majority of the base and deoxyribose DNA resonances in the complex were assigned by homonuclear techniques, but assignments of H4′, H5′ and H5″ are particularly difficult and were successfully made by using 3D 1H-13C NOESY-HMQC and HCCH-TOCSY experiments on the three labeled protein-DNA samples. Unambiguous assignments of intermolecular NOEs involving the phosphodiester backbone were accomplished with 3D double half-filtered 1H-13C HMQC experiments.
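
The selection logic of isotope-edited and isotope-filtered experiments can be illustrated with a toy bookkeeping sketch. It is not a pulse-sequence simulation: the `Proton` record, the example proton names and the helper functions are hypothetical and serve only to show which proton signals survive each kind of experiment when the protein is 13C,15N-labeled and the DNA is unlabeled.

```python
from dataclasses import dataclass

LABELED = {"13C", "15N"}    # isotopes introduced by labeling
UNLABELED = {"12C", "14N"}  # natural-abundance heteronuclei


@dataclass(frozen=True)
class Proton:
    name: str      # hypothetical proton label, for illustration only
    molecule: str  # "protein" or "DNA"
    attached: str  # isotope of the heavy atom bonded to this proton


def edited(protons):
    """Isotope-edited: keep protons attached to 13C/15N."""
    return [p for p in protons if p.attached in LABELED]


def filtered(protons):
    """Isotope-filtered: keep protons attached to 12C/14N."""
    return [p for p in protons if p.attached in UNLABELED]


def f1_filtered_f3_edited(protons, close_pairs):
    """Keep NOE pairs with a filtered proton in F1 and an edited proton in F3.

    With labeled protein and unlabeled DNA, only intermolecular
    DNA-to-protein contacts pass both selections.
    """
    by_name = {p.name: p for p in protons}
    keep_f1 = {p.name for p in filtered(protons)}
    keep_f3 = {p.name for p in edited(protons)}
    selected = []
    for a, b in close_pairs:
        if a in keep_f1 and b in keep_f3:
            selected.append((by_name[a].name, by_name[b].name))
        elif b in keep_f1 and a in keep_f3:
            selected.append((by_name[b].name, by_name[a].name))
    return selected


# Hypothetical sample: labeled protein, unlabeled DNA.
protons = [
    Proton("Ala HB", "protein", "13C"),
    Proton("Gly HN", "protein", "15N"),
    Proton("Thy H6", "DNA", "12C"),
    Proton("Gua H1", "DNA", "14N"),
]
# Hypothetical list of proton pairs that are close in space.
close_pairs = [("Ala HB", "Thy H6"), ("Ala HB", "Gly HN"), ("Thy H6", "Gua H1")]

intermolecular = f1_filtered_f3_edited(protons, close_pairs)
```

In this toy sample the protein-protein and DNA-DNA pairs are rejected, and only the DNA-to-protein pair remains, which is the reason the filtered/edited combination is used to isolate intermolecular NOEs.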
2.5. Deuteration

In the case of large protein-DNA complexes, the conventional backbone triple resonance experiments are unsuccessful in providing complete assignment of the protein resonances. Therefore, selective protonation and/or uniform complete or fractional deuteration, with or without 13C,15N-labeling of the protein, are used to simplify proton spectra (Fig. 4) and to overcome the problem of rapid transverse nuclear spin relaxation [28].

The structure of a 37 kDa trp repressor-operator DNA complex (homodimeric 107 residue E. coli trp repressor bound to a 20 base pair palindromic DNA operator) was determined by recording homonuclear 2D and 3D spectra for complexes with different deuterium-labeled trp repressor analogs as well as heteronuclear spectra for complexes with uniformly 15N,13C-labeled trp repressor [29].

The use of perdeuterated protein in H2O (i.e. >90% 2H incorporation at nonlabile positions and about 90% of labile positions protonated) led to the assignments of almost all backbone and Cβ resonances of the 37 kDa trp repressor-operator DNA complex [30] and of a 64 kDa repressor-operator complex (two tandem dimers bound to a 22 base pair symmetric DNA operator and the corepressor analog 5-methyltryptophan) [31,32].

Samples of perdeuterated protein containing selectively protonated or 15N,13C,1H-labeled residues are used to characterize specific contacts between the protein and the DNA. For example, in the study of the DNA binding domain of the transcription factor NFATC1 bound to a 12 base pair DNA, Zhou and coworkers [33] performed 2D 1H-1H homonuclear NOESY experiments on complexes containing perdeuterated protein with fully protonated Tyr and Phe residues to characterize the contacts between Tyr442 and DNA. These authors also mentioned the use of site-specific deuteration at C2 of Ade6 to confirm the close proximity of Arg555 and Ade6.

Fig. 4. Portion of 1H-13C HSQC spectra at 298 K in D2O, showing the correlations between aromatic protons and carbons of a 15 base pair DNA containing the binding site of NHP6. Upper spectrum: sample of 13C,15N 15-mer DNA with upper strand labeled only. Lower spectrum: sample of 13C,15N 15-mer DNA with lower strand labeled only (adapted from Fig. 8 of Ref. [27]). Reprinted with the permission of J. Feigon and of Oxford University Press (© 1999).

2.6. Transverse relaxation-optimized spectroscopy (TROSY)

Recently, Wüthrich and coworkers have proposed a new approach to reduce significantly transverse relaxation rates in multidimensional NMR experiments and thus eliminate one of the obstacles to the study of large molecules and complexes by NMR [34-36].

The relaxation of peptide backbone 15N nuclei is dominated by the dipolar interaction between the 15N nucleus and its directly attached proton and by the chemical shift anisotropy (CSA) interaction. As the 15N CSA tensor is nearly axially symmetric and has its axis making a small angle with the N-H bond vector, the 15N nucleus has a relaxation rate that depends on the spin state of the proton attached to it. TROSY uses this differential relaxation to select only the component that relaxes most slowly. Using this
approach, Wüthrich and coworkers observed a significant reduction in the linewidth for 15N and 1H in a 2D 1H,15N correlation experiment performed with a uniformly 15N-labeled protein in complex with a DNA fragment at 750 MHz and 4 °C (τc = 20 ± 2 ns). This TROSY principle has been implemented in the conventional triple resonance experiments HNCA, HNCO, HN(CO)CA, HN(CA)CO, HNCACB and HN(CO)CACB. A 2-3-fold enhancement in the signal-to-noise ratio has been observed when applied to 2H/13C/15N-labeled proteins, and significant gains of sensitivity were measured or predicted for protonated proteins. The highest sensitivity gains are obtained for the regular secondary structure elements in the protein core. Studies of protein-DNA complexes should benefit from the implementation of the TROSY principle.

Fig. 5. Backbone (b) and side-chain (s) relaxation parameters of the T1ρ (upper graph) and [1H-15N] NOE (lower graph) at 600 MHz for the free (black bars) and the DNA-bound (hatched bars) lac repressor headpiece. The backbone and side-chain parameters are indicated with "b" and "s", respectively. For Asn, 'side-chain' refers to Nδ; for Gln and Arg, it refers to Nε (Fig. 4 from Ref. [40]). Reprinted with the permission of R. Kaptein and of the American Chemical Society (© 1997).

2.7. Long-range distance constraints

Bax and coworkers have proposed the use of the magnetic field dependence of the dipolar 1H-15N and 1H-13C couplings [37] and of the 15N shift [38] to measure the orientation of NH, CH or CC bond vectors relative to the magnetic susceptibility tensor. Thus, these measurements will provide long-range constraints between distinct regions of the complex. Molecules with an anisotropic magnetic susceptibility will align along the static magnetic field to a degree which is proportional to the product of the anisotropy
of the molecular magnetic susceptibility and the square of the magnetic field strength. As a result, the dipolar couplings or the chemical shifts vary with the strength of the magnetic field and depend on the orientation of the bond vector or chemical shift tensors relative to the magnetic susceptibility tensor. These small effects were observed for DNA or protein-DNA complexes due to the contributions of the stacked aromatic groups of the DNA bases to the magnetic susceptibility tensor. The dipolar coupling restraints have been incorporated in the simulated annealing protocol for structure determination of the complex of the DNA binding domain of GATA-1 with a 20 base pair DNA [37]. When compared with the structure calculated without 1H-15N and 13Cα-1Hα dipolar couplings, the overall precision of the coordinates increased only slightly, but the percentage of residues in the most favorable region of the Ramachandran map and the number of bad contacts improved significantly. A large displacement in the short loop connecting strands β3 and β4 was found. The magnetic field dependent 15N shifts correlated well with the structure of the GATA1-DNA complex refined with 1H-15N and 13Cα-1Hα dipolar coupling constraints [38].

2.8. Dynamics

Measurements of 15N spin-lattice and spin-spin relaxation rates as well as steady state 1H-15N heteronuclear NOEs provide information about internal motions on the pico- to nanosecond time-scale and on conformational dynamics on the micro- to millisecond time-scale [39]. The three examples given below illustrate the role of dynamics in protein-DNA recognition. The dynamics studies on lac repressor headpiece (1-56) [40] and on the three aminoterminal zinc fingers of X. laevis TFIIIA [41] show that the process of recognition is dynamic and not static.

15N T1, T1ρ, and [1H-15N] NOE experiments were performed on uniformly 15N-labeled free and DNA-bound lac repressor headpiece (1-56) [40].
For the free lac repressor headpiece (1-56), the backbone of the three α-helices and of the turn of the HTH motif is rather rigid, whereas the backbone of the loop between helices II and III is more mobile. Upon binding to the DNA, several changes in the mobility occur. The most remarkable changes take place in the loop between helices II and III: His29 within this loop contacts the DNA, and a large decrease in backbone mobility within this loop is detected. The relaxation parameters of most 15N-containing side-chains (Gln18, Arg22, Asn25, Gln26, Asn50, and Arg51) have also been measured (Fig. 5). Some of the side-chains of DNA-contacting residues show a significant decrease in mobility upon DNA binding, while others are about equally mobile in both the free and the bound state. This indicates that interactions with DNA do not necessarily restrict the mobility of the side-chain upon binding and that some flexibility remains at the interface between the protein and the DNA. 15N T1ρ measurements indicate that the side-chains of residues Gln18, Arg22 and Asn25 undergo intermediate exchange (μs to ms time-scale), which may indicate that these atoms are changing partners in hydrogen bonds.

The dynamics of the three aminoterminal zinc fingers of X. laevis TFIIIA (zf1-3) bound to a 15-mer DNA has been studied by 15N NMR [41]. The flexibility of the backbone of the linker residues (except Lys41) is significantly reduced upon DNA binding. This reduction is associated with the formation of a defined conformation and close packing interactions between the side-chains within the linker and with the side-chains of the neighboring finger. Some flexibility has been found for the protein-DNA interface, as indicated by the broadening of resonances or weak connectivities observed for some lysine resonances (Lys26, Lys29, Lys87).
In fact, analysis of the surface electrostatic potential at the DNA binding site where these side-chains interact suggests that these fluctuations arise because these side-chains adopt different isoenergetic conformations with different patterns of hydrogen bonds to DNA bases.

The essential DNA binding domain of the yeast ADR1 undergoes a disorder-to-order transition when it binds to a 14 base-pair DNA duplex containing the UAS1 binding site [13], as evidenced by 15N relaxation measurements. The free DNA binding domain of ADR1 is composed of three distinct motional regions and behaves like two beads linked by a flexible string. Upon binding, most of this domain tumbles like a single domain with reduced picosecond time-scale motions compared to the free form.
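
How 15N relaxation parameters respond to a loss of fast internal motion can be sketched numerically. The example below assumes the widely used Lipari-Szabo model-free spectral density for an isotropically tumbling molecule; the text above does not state which analysis the cited studies used, so this choice, the function names, the bond length, the 15N CSA value and the example motional parameters are all illustrative assumptions rather than values from Refs. [13,40,41].

```python
MU0_OVER_4PI = 1.0e-7    # T m / A
HBAR = 1.054571817e-34   # J s
GAMMA_H = 2.6752e8       # rad s^-1 T^-1
GAMMA_N = -2.7126e7      # rad s^-1 T^-1 (negative for 15N)
R_NH = 1.02e-10          # m, assumed N-H bond length
CSA_N = -160e-6          # assumed 15N chemical shift anisotropy


def spectral_density(omega, s2, tau_m, tau_e):
    """Model-free J(omega): overall tumbling tau_m, internal motion tau_e,
    squared order parameter s2 (1 = rigid, 0 = unrestricted)."""
    tau = tau_m * tau_e / (tau_m + tau_e)
    slow = s2 * tau_m / (1.0 + (omega * tau_m) ** 2)
    fast = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * (slow + fast)


def n15_relaxation(b0, s2, tau_m, tau_e):
    """Return (R1, R2, NOE) for a backbone 15N-1H pair at field b0 (tesla).

    Chemical exchange contributions to R2 are not included.
    """
    w_h = GAMMA_H * b0
    w_n = GAMMA_N * b0   # Larmor frequencies are kept signed
    d2 = (MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_N / R_NH ** 3) ** 2
    c2 = (w_n * CSA_N) ** 2 / 3.0

    def j(w):
        return spectral_density(w, s2, tau_m, tau_e)

    r1 = d2 / 4.0 * (j(w_h - w_n) + 3.0 * j(w_n) + 6.0 * j(w_h + w_n)) + c2 * j(w_n)
    r2 = (d2 / 8.0 * (4.0 * j(0.0) + j(w_h - w_n) + 3.0 * j(w_n)
                      + 6.0 * j(w_h) + 6.0 * j(w_h + w_n))
          + c2 / 6.0 * (4.0 * j(0.0) + 3.0 * j(w_n)))
    noe = 1.0 + (GAMMA_H / GAMMA_N) * d2 / 4.0 * (6.0 * j(w_h + w_n) - j(w_h - w_n)) / r1
    return r1, r2, noe


# Illustrative comparison at 14.1 T (about 600 MHz for 1H), tau_m = 10 ns:
# a well-ordered site versus a flexible one (both parameter sets hypothetical).
ordered = n15_relaxation(14.1, s2=0.85, tau_m=10e-9, tau_e=20e-12)
flexible = n15_relaxation(14.1, s2=0.40, tau_m=10e-9, tau_e=200e-12)
```

Under these assumptions the flexible site has a lower heteronuclear NOE and a smaller R2 than the ordered site, which is the qualitative signature used above to infer reduced mobility upon DNA binding.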
2.9. Hydration

Water molecules are important contributors in the process of protein-DNA recognition as they may have structural and/or functional roles.

NMR can provide information about the location and lifetime of the contacts between water and the protein/DNA [3,42,43]. The residence times of hydration water can be estimated from the measurements of NOEs and ROEs between water protons and protein or DNA protons. These measurements distinguish residence times of less than 1 ns from longer ones. Typically, residence times shorter than 1 ns are observed on the surface of proteins and in the major groove of DNA, while residence times longer than 1 ns have been observed for water molecules in the interior of proteins, in the minor grooves of DNA and in protein-DNA interfaces.

The NMR study of the Antennapedia homeodomain-DNA complex reveals that water molecules are present at the protein-DNA interface: contacts between protein and water have been observed for amino acid residues 43, 44, 47, 48, 50, 51, 52 and 54 (Fig. 6) [44]. These water molecules exchange slowly with the bulk solvent (residence times between 1 ns and 20 ms) [45], similar to water molecules in the interior of proteins, and have multiple preferred locations. In addition, two residues at the protein-DNA interface, Asn51 (strictly conserved) and Gln50 (functionally important), contact several DNA bases with transient water-mediated hydrogen bonds. The model proposed for the interactions between the protein and the DNA consists of a fluctuating network of hydrogen bonds between the polar groups of the protein and the DNA and water molecules.

In contrast to other protein-DNA complexes, the complex between the DNA binding domain of chicken GATA-1 and a 16 base pair duplex is characterized by only two hydrogen bonds between the protein and the DNA [46]. The specific interactions involve hydrophobic contacts between the methyl groups of the protein and the DNA bases.
Clore and coworkers have found water molecules around all surface-exposed methyl groups as well as around methyl groups in the neighborhood of the sugar-phosphate backbone, but water molecules are excluded from the interface between the protein and the DNA bases in the major groove [47]. They also observed water molecules around the backbone amide protons of Ala30, Tyr34 and Tyr35, which are close to phosphate groups. This suggests that these water molecules participate in bridging hydrogen bonds between the sugar-phosphate backbone and the relevant amide groups.

3. Selected applications

Table 1 summarizes the protein sequence motifs and DNA sequences of the protein-DNA complexes discussed below. It also includes a summary of the direct interactions between the amino acid side-chains and the nucleic acid bases.

3.1. The helix-turn-helix motif

The helix-turn-helix (HTH) motif consists of two nearly perpendicular α-helices separated by a linker of variable length. The second helix of this motif, called the "recognition helix", inserts into the major groove of the DNA to make specific contacts. Variations between members of the HTH family include the orientation of the helix in the major groove, the position of the residues contacting the DNA and the length of the recognition helix. This motif, first identified in prokaryotic gene-regulatory proteins, can be found in a wide variety of DNA-binding proteins, including eukaryotic homeodomains and transcription factors.

3.1.1. Homeodomain

Homeodomain proteins are the products of homeobox genes. The homeodomain is a highly conserved DNA-binding domain of about 60 amino acid residues that is found in transcriptional regulators involved in the genetic control of development. These regulators provide embryonic cells with positional information (where they are relative to their neighbors) and segmental identity (what structure they should generate). They act at various stages of development and in all organisms, from yeast to human.
Mutations in the homeodomain could result in genetic diseases and developmental abnormalities. Therefore, in order to understand the role of individual amino acid residues in tertiary structure formation and