CHAPTER 26 Amino Acids, Peptides, Proteins, and Nucleic Acids: Nitrogen-Containing Polymers in Nature

26-1 Structure and Properties of Amino Acids

The stereocenter of common 2-amino acids has the S configuration (glycine has no stereocenter; L-cysteine is R).

Although more than 500 amino acids exist in nature, only 20 commonly occur in the proteins of all living species. Adult humans can synthesize all but eight of these, and two of the remaining 12 only in insufficient quantities. The amino acids that cannot be made in adequate amounts are called the essential amino acids because they must be supplied by the diet.

In every amino acid except glycine, the simplest one, C2 is a stereocenter. It has the S (L) configuration in all cases but cysteine, whose L form is designated R because the sulfur-bearing substituent outranks the carboxy group in the priority rules.

Amino acids are acidic and basic: zwitterions.

Amino acids contain both an amino group and a carboxylic acid group and are thus both acidic and basic. In the solid state, both groups are ionized, forming a zwitterion. Most amino acids are fairly insoluble in organic solvents and decompose, rather than melt, when heated.

The amino acid structure in aqueous solution depends upon the pH:

+H3N-CHR-COOH (predominates at pH < 1) ⇌ +H3N-CHR-COO- (predominates at pH ~ 6) ⇌ H2N-CHR-COO- (predominates at pH > 13)

For glycine, pKa1 refers to the equilibrium relating the first two structures above:

+H3N-CH2-COOH + H2O ⇌ +H3N-CH2-COO- + H3O+
The second pair of structures is related by the equilibrium (pKa2):

+H3N-CH2-COO- + H2O ⇌ H2N-CH2-COO- + H3O+

At the isoelectric point, the charges are neutralized.

A pH exists for each amino acid at which the structure is electrically neutral. This pH is called the isoelectric pH, or pI. For an amino acid containing no additional ionizable groups, the isoelectric pH is simply the average of its two pKa values:

pI = (pKa1 + pKa2)/2; for glycine, pI = (2.3 + 9.6)/2 ≈ 6.0

For the four amino acids having an acidic side chain, the pI is the average of the two lowest pKa values. For the three amino acids having a basic side chain, the pI is the average of the two highest pKa values.

Arginine and histidine are both basic. Their side chains contain two functional groups not encountered before, guanidino and imidazole, respectively. The imidazole group is relatively basic because the protonated species is resonance-stabilized. Because its pI (7.6) is close to physiological pH, histidine functions as a proton acceptor and donor at the active sites of many enzymes.

The amino acid cysteine contains a relatively acidic mercapto group (pKa = 8.2). Peptide strands containing cysteine residues are often cross-linked through oxidation of their mercapto groups, forming stable -S-S- (disulfide) linkages.

26-2 Synthesis of Amino Acids: a Combination of Amine and Carboxylic Acid Chemistry

Hell-Volhard-Zelinsky bromination followed by amination converts carboxylic acids into 2-amino acids.

Amino acids can be produced by Hell-Volhard-Zelinsky bromination followed by displacement of the halogen by ammonia; however, the yields are low.

The Gabriel synthesis can be adapted to produce amino acids.

A much better synthesis of amino acids adapts the Gabriel synthesis, using diethyl 2-bromopropanedioate in the first step of the reaction sequence.
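The averaging rules for the isoelectric point given in Section 26-1 can be sketched in a few lines of Python. This is only an illustration: the function name, the `side_chain` labels, and the pKa numbers in the example calls are assumptions (approximate textbook-style values), not data taken from these notes.

```python
def isoelectric_point(pkas, side_chain="neutral"):
    """Estimate pI by averaging two pKa values.

    neutral side chain: average of the two pKa values
    acidic side chain:  average of the two lowest pKa values
    basic side chain:   average of the two highest pKa values
    """
    ordered = sorted(pkas)
    if side_chain == "neutral":
        if len(ordered) != 2:
            raise ValueError("a neutral amino acid is expected to have two pKa values")
        pair = ordered
    elif side_chain == "acidic":
        if len(ordered) < 3:
            raise ValueError("an acidic side chain adds a third pKa value")
        pair = ordered[:2]
    elif side_chain == "basic":
        if len(ordered) < 3:
            raise ValueError("a basic side chain adds a third pKa value")
        pair = ordered[-2:]
    else:
        raise ValueError("side_chain must be 'neutral', 'acidic', or 'basic'")
    return sum(pair) / 2


# Illustrative pKa values (assumed, approximate):
print(round(isoelectric_point([2.3, 9.6]), 2))                 # glycine-like, neutral
print(round(isoelectric_point([1.9, 3.7, 9.6], "acidic"), 2))  # aspartic-acid-like
print(round(isoelectric_point([2.2, 9.0, 10.5], "basic"), 2))  # lysine-like
```

The only design choice is to sort the pKa values first, so that "two lowest" and "two highest" can be taken as slices regardless of the order in which the values are supplied.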
The initially formed 2-substituted propanedioate can be alkylated, allowing for the preparation of substituted amino acids.

Amino acids are prepared from aldehydes by the Strecker synthesis.

Hydrogen cyanide ordinarily adds reversibly to aldehydes to form cyanohydrins:

RCHO + HCN ⇌ RCH(OH)CN

The Strecker synthesis utilizes a variation of this reaction in the presence of ammonia, giving a 2-amino nitrile that is hydrolyzed to the amino acid:

RCHO + NH3 + HCN → RCH(NH2)CN + H2O; then H+, H2O → RCH(NH2)COOH

26-3 Synthesis of Enantiomerically Pure Amino Acids

Resolution of the racemic mixtures of amino acids produced by the preceding methods can be accomplished by reaction with an optically active amine, such as brucine, followed by fractional crystallization of the resulting diastereomeric salts. Unfortunately, in practice this method is tedious and can suffer from poor yields.

An alternative approach is to form the stereocenter at C2 enantioselectively, as in enantioselective hydrogenations of α,β-unsaturated amino acids. Nature employs this strategy using the enzyme glutamate dehydrogenase, with NADH as the reducing agent, and a transaminase.

(S)-Glutamic acid is the biosynthetic precursor of glutamine, proline, and arginine. (S)-Glutamic acid also functions to aminate other 2-oxo acids via a transaminase, thus making other amino acids available.

26-4 Peptides and Proteins: Amino Acid Oligomers and Polymers

Amino acids form peptide bonds.

2-Amino acids are the monomer units in polypeptides. The amide bond formed between the carboxylic acid function of one amino acid and the amino group of a second amino acid is called a peptide bond. The individual amino acid units within the polypeptide are referred to as residues. In some proteins, two or more polypeptide chains are linked by disulfide bridges.
The peptide bond is planar and fairly rigid at room temperature. The N-H hydrogen is almost always located trans to the carbonyl oxygen, and rotation about the C-N bond is slow because that bond has partial double-bond character. Both bonds adjacent to the amide function enjoy free rotation.

Polypeptides are able to assume many different conformations; however, one conformation, the native state of the peptide, usually has a much lower free energy than the others.

Polypeptides are characterized by their sequence of amino acid residues.

The amino end, or N-terminal amino acid, is always placed at the left when drawing a polypeptide chain. The C-terminal amino acid is assumed to be on the right, and the configuration at all of the C2 stereocenters is assumed to be S (L). The main chain is the chain incorporating the peptide bonds, and the side chains are the substituents R, R', R'', etc.

Naming of polypeptides simply starts at the amino-terminal end and lists the names of the amino acids in sequence. Three-letter abbreviations are often used.

Several small peptides are physiologically significant:
• Aspartame, a dipeptide, is an artificial sweetener (NutraSweet).
• Glutathione functions as a biological reducing agent and is unusual in that one of its peptide bonds is formed through the γ-carboxy group of glutamic acid.
• Gramicidin S is a cyclic peptide antibiotic in which two identical pentapeptides are joined head to tail. It contains the rare amino acid ornithine.
• Insulin is a hormone that circulates in the blood and helps regulate the concentration of blood glucose. It contains two polypeptide chains and three disulfide bonds.

Proteins fold into pleated sheets and helices: secondary and tertiary structures.

The sequence of amino acids in a peptide chain is called its primary structure. As the peptide chain folds into its most stable conformation, the arrangement of close-lying amino acid residues is called its secondary structure. Two important types of secondary structure are the pleated sheet, or β-structure, and the α-helix.
In the pleated sheet, two chains line up with the carbonyl groups of one chain opposite the amino groups of the other, so that hydrogen bonds can form between them. Additional chains may be bonded to either side to construct a "sheet" of chains connected by hydrogen bonds. This arrangement can also be formed by a single polypeptide chain folding back and forth on itself several times. β-Pleated sheets impart considerable rigidity to the system.

The α-helix has 3.6 amino acid residues per turn and is held together by hydrogen bonds between carbonyl groups and amino groups: the carbonyl group of one amino acid hydrogen-bonds with the amino group of the amino acid four residues ahead in the sequence. Two equivalent points in adjacent turns of the helix are 5.4 Å apart.

Too much charge of the same kind, or the presence of the amino acid proline, may disrupt secondary structure.

The final overall folding of the entire polypeptide chain is called the tertiary structure of the chain. A variety of forces are involved in stabilizing the tertiary structure:
• Disulfide bridges
• Hydrogen bonds
• London forces
• Electrostatic attraction and repulsion
• Micellar effects (hydrophobic effect)

Pronounced folding is found in the globular proteins (chemical transport, catalysis, etc.). In fibrous proteins (myosin, fibrin, α-keratin), several α-helices are coiled together to form a superhelix.

Enzymes and transport proteins fold up in such a way as to produce three-dimensional pockets or grooves on their surfaces, called active sites or binding sites. The size and shape of these sites provide a very specific fit for the intended substrate or ligand. The inner surface of an active site generally contains a specific arrangement of side chains of polar amino acids that attract functional groups on the substrate by hydrogen bonding or ionic interactions. Active sites align the functional groups on the enzyme and substrate in such a way as to promote the associated chemical reaction.

A typical example of an enzyme is chymotrypsin, which catalyzes the hydrolysis of specific peptide bonds (adjacent to phenylalanine, tyrosine, or tryptophan) at physiological temperature and pH.

Exposure of a protein to extremes of heat or pH usually causes denaturation, the breakdown of its tertiary structure.

In some proteins, several polypeptide chains, each with its own tertiary structure, assemble to form a larger structure called a quaternary structure.
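The two α-helix dimensions quoted above (3.6 residues per turn and 5.4 Å between equivalent points on adjacent turns) fix the remaining geometry of an ideal helix. The short sketch below derives the rise and rotation per residue from them. The function names are hypothetical, and the assumption of a perfectly regular helix is a simplification.

```python
# Figures quoted in the text for the alpha-helix.
RESIDUES_PER_TURN = 3.6
PITCH_ANGSTROM = 5.4  # distance between equivalent points on adjacent turns


def rise_per_residue():
    """Distance travelled along the helix axis per residue, in angstroms."""
    return PITCH_ANGSTROM / RESIDUES_PER_TURN


def rotation_per_residue():
    """Angle turned about the helix axis per residue, in degrees."""
    return 360.0 / RESIDUES_PER_TURN


def helix_length(n_residues):
    """Approximate axial length of an ideal helix with n_residues residues."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * rise_per_residue()


print(round(rise_per_residue(), 2), "angstrom per residue")
print(round(rotation_per_residue(), 1), "degrees per residue")
print(round(helix_length(18), 1), "angstrom for 18 residues (5 turns)")
```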
26-5 Determination of Primary Structure: Amino Acid Sequencing

First, purify the polypeptide.

Protein purification involves the successive application of a variety of physical-chemical techniques:
• Dialysis (size)
• Gel filtration (size, shape)
• Ion-exchange chromatography (charge)
• Electrophoresis (charge)
• Affinity chromatography (specific binding ability)

Second, determine which amino acids are present.

The peptide is first subjected to hydrolysis using 6 N HCl at 110 °C for 24 hours. The numbers and types of free amino acids present are then determined using an automated amino acid analyzer.

Sequence the peptide from the amino (N-terminal) end.

The sequence of amino acids in a peptide chain is next determined using the Edman degradation. In this process, one amino acid at a time is released from the N-terminus of the polypeptide chain in the form of a phenylthiohydantoin. Since each amino acid produces a different phenylthiohydantoin, the amino acid sequence can be readily determined.

The chopping up of longer chains is achieved with enzymes.

The Edman degradation can be used only for relatively short peptides (about 50 residues). For longer peptides it is necessary to break the chains into specific shorter fragments using a selective chemical or enzymatic process. These fragments can then be isolated and individually sequenced.

The order of the fragments within the original peptide must next be determined. A second fragmentation of another sample of the original peptide is carried out using a different chemical or enzymatic process. The two sets of peptide fragments are examined for overlapping peptides, which allow both sets of fragments to be correctly assembled.

Trypsin cleaves on the carboxy side of arginine and lysine. Given the three peptides produced by trypsin cleavage, it is impossible to tell whether the fragment ending in Arg or the fragment ending in Lys started the original peptide chain. The single Ala fragment can be identified as the C-terminal fragment, since it is the only fragment that does not end in Lys or Arg. Hydrolysis of the B-chain by a different proteolytic enzyme (or determination of the N-terminal amino acid of the intact B-chain) is necessary to completely order all fragments.
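The overlap argument above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the seven-residue peptide is hypothetical (it is not the B-chain discussed in the text), one-letter residue codes are used for brevity, and each enzyme is reduced to a single rule ("cut on the carboxy side of these residues"), which ignores the finer selectivity of real proteases.

```python
from itertools import permutations


def cleave(sequence, after):
    """Split a one-letter peptide sequence on the carboxy side of each residue in `after`."""
    fragments, current = [], ""
    for residue in sequence:
        current += residue
        if residue in after:
            fragments.append(current)
            current = ""
    if current:  # the C-terminal fragment need not end in a cleavage residue
        fragments.append(current)
    return fragments


def consistent_orders(fragments, overlap_fragments):
    """Return every ordering of `fragments` whose joined sequence contains
    all fragments obtained from the second, different digest."""
    orders = set()
    for order in permutations(fragments):
        candidate = "".join(order)
        if all(piece in candidate for piece in overlap_fragments):
            orders.add(candidate)
    return sorted(orders)


peptide = "LGRFYKA"                       # hypothetical peptide
tryptic = cleave(peptide, "KR")           # trypsin-like rule: after Lys (K) or Arg (R)
chymotryptic = cleave(peptide, "FYW")     # chymotrypsin-like rule: after Phe, Tyr, Trp

print(tryptic)
print(chymotryptic)
print(consistent_orders(sorted(tryptic), chymotryptic))
```

With the tryptic fragments alone, the Arg-ending and Lys-ending pieces could come in either order; the fragment from the second digest that spans the Arg/Phe junction is what removes the ambiguity. Trying all permutations is only practical for a handful of fragments, which is adequate for a classroom-sized example.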
Protein sequencing is made possible by recombinant DNA technology.

Application of chemical sequencing methods to long (>1,000 residues) polypeptides is expensive, laborious, and time-consuming. When a sample of DNA or messenger RNA that codes for a protein is available, recombinant DNA technology can be used to determine the protein sequence easily and quickly through the DNA sequence and knowledge of the genetic code.

26-6 Synthesis of Polypeptides: a Challenge in the Application of Protecting Groups

Selective peptide synthesis requires protecting groups.

Chemical synthesis of a polypeptide chain from individual amino acids requires:
• Activation of the amino acids to force the peptide bond to form
• Protection of reactive functional groups within the amino acids that may interfere with the desired peptide bond formation

The amino end of an amino acid can be blocked in one of several ways.

Peptide bonds are formed by using carboxy activation.

The most commonly used carboxy-activating reagent is dicyclohexylcarbodiimide (DCC). To produce a tripeptide, deprotection of only one end is required, followed by renewed coupling.

26-7 Merrifield Solid-Phase Peptide Synthesis

The Merrifield solid-phase peptide synthesis uses a solid support of polystyrene to anchor a peptide chain. A dipeptide synthesis using the Merrifield method would proceed as follows:
The solid-phase synthesis has the great advantage that all the reaction intermediates are immobilized on the polymer and can be isolated by simple filtration and washing.

The Merrifield synthesis is performed automatically by a machine that requires only a few hours for each cycle. The total synthesis of insulin (51 amino acids) required more than 5,000 separate operations but took only several days using the automated Merrifield procedure.

26-8 Polypeptides in Nature: Oxygen Transport by the Proteins Myoglobin and Hemoglobin

Two proteins carry oxygen in vertebrates: myoglobin in the muscles and hemoglobin in the blood.

Both proteins contain a special nonpolypeptide unit, called a heme group, attached to the protein. Heme is a cyclic organic molecule (a porphyrin) made up of four linked, substituted pyrrole units surrounding an iron atom.

The fifth coordination position of the heme iron is occupied by a nitrogen atom from a histidine imidazole group. The sixth position is occupied by an oxygen molecule (oxyhemoglobin) or a water molecule (deoxyhemoglobin). A second histidine group is positioned close to the oxygen-binding site of the heme, which helps prevent the binding of carbon monoxide.

Hemoglobin consists of four polypeptide chains: two α-chains of 141 residues each and two β-chains of 146 residues each. Each chain has its own heme group and a tertiary structure similar to that of myoglobin. The arrangement of, and interaction between, the four chains gives hemoglobin its quaternary structure.

26-9 Biosynthesis of Proteins: Nucleic Acids

Four heterocycles define the structure of nucleic acids.

DNA (deoxyribonucleic acid) and RNA (ribonucleic acid) are polymers of phosphoric acid, a sugar (ribose in RNA and deoxyribose in DNA), and four nitrogen bases (A, C, T, and G in DNA; A, C, U, and G in RNA).
The combination of sugar and base is called a nucleoside. A sugar, a base, and one or more phosphates together are called a nucleotide.

The four nucleotides within a particular strand of DNA, read three at a time (4 × 4 × 4 = 64 possible triplets), specify the sequence of the 20 amino acids in a particular polypeptide chain.

Nucleic acids form a double helix.

DNA is a double helix consisting of two polynucleotide strands that are hydrogen-bonded to each other through their bases (A with T, G with C) and run in opposite directions. If a piece of DNA in one strand has the sequence -AGCTACGATC-, the piece of the other strand facing it is -TCGATGCTAG-.

The individual base-pairing energies are 27.7 kcal mol⁻¹ for GC and 15.1 kcal mol⁻¹ for AT.

DNA replicates by unwinding and assembling new complementary strands.

The double-helical nature of DNA suggested to Watson and Crick a mechanism for the replication of DNA during cell division. Each of the two strands in a parent DNA molecule serves as a template for the assembly of a new complementary strand in a new DNA molecule. Replication proceeds with an error frequency of less than 1 in 10 billion base pairs.

26-10 Protein Synthesis Through RNA

The expression of the protein sequence information contained in DNA occurs in two steps, transcription and translation.

Transcription is the process of copying a short sequence of DNA, coding for one or a small number of proteins, into a separate complementary molecule of mRNA.

Translation is the process of reading the bases of the mRNA in groups of three and converting this information into a polypeptide sequence by means of small adapter molecules called transfer RNAs (tRNAs) and large catalytic assemblies called ribosomes.
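The base-pairing and triplet-counting ideas from this section can be checked with a short sketch. It assumes the standard pairing rules (A with T, G with C); the function names and the nine-base mRNA string are hypothetical and chosen only for illustration.

```python
from itertools import product

DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def facing_strand(strand):
    """Bases paired with each position of `strand`, written in the same
    left-to-right order (the two strands themselves run in opposite directions)."""
    return "".join(DNA_PAIRS[base] for base in strand)


def all_triplets(bases="ACGU"):
    """Every possible group of three bases that can be built from four bases."""
    return ["".join(triplet) for triplet in product(bases, repeat=3)]


def read_in_threes(mrna):
    """Split an mRNA string into consecutive groups of three bases,
    ignoring any incomplete group at the end."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


print(facing_strand("AGCTACGATC"))
print(len(all_triplets()))
print(read_in_threes("AUGGCCUAA"))   # hypothetical mRNA fragment
```

Because 64 triplets are available for only 20 amino acids, several different triplets can specify the same amino acid; the sketch counts the triplets but deliberately does not attempt to assign them.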
Each tRNA contains three bases that are complementary to the group of three bases on the mRNA that specifies a particular amino acid. At another site on the tRNA, that particular amino acid is attached by an enzyme known as an aminoacyl-tRNA synthetase. The function of the ribosome is simply to match specific tRNAs to the groups of three bases on the mRNA and to polymerize the amino acids thus assembled.

26-11 DNA Sequencing and Synthesis: Cornerstones of Gene Technology

Rapid DNA sequencing has deciphered the human genome.

In a method similar to that employed in protein sequencing, the long DNA molecule is first cleaved at specific points into more manageable fragments using enzymes known as restriction nucleases. There are more than 200 such enzymes. The sequence of each individual shortened fragment of DNA can then be determined by a chemical (Maxam-Gilbert) or enzymatic (Sanger) procedure.

In the Maxam-Gilbert method, a sample of the polynucleotide is first labeled at its 5' end with radioactive 32P to enable its detection after the next step. Next, four individual DNA samples are each subjected to chemical degradation. Each degradation is specific for one of the four bases present and cleaves the polynucleotide chain at that position. The concentration of the reagent is adjusted so that each molecule in the sample is cleaved only once. This results in all possible fragments starting at the radioactive label and ending at a particular instance of the base being analyzed.

Electrophoresis is used to separate the various fragments produced by the four degradations. The movement of the fragments in the electric field depends on their charge and size (in this case, their length), which allows the overall sequence to be determined.

In the Sanger method, the piece of DNA to be analyzed (the template strand, starting from the 3' end) is replicated many times by the enzyme DNA polymerase, using a mixture of all four deoxynucleoside triphosphates as substrates. The process is started by adding a short piece of complementary DNA known as the primer strand, which is then extended by the DNA polymerase.

Again, four different experiments are performed. In each experiment, one of the four target bases is selected and a small amount of the corresponding fluorescent-dye-labeled dideoxyribonucleoside triphosphate is added to the mixture. The concentration of the dideoxy compound is selected so that approximately one molecule is incorporated per newly synthesized DNA strand. The incorporation of a dideoxy molecule terminates the synthesis of the new chain. This results in a labeled set of all possible sequences ending with the target base, just as in the Maxam-Gilbert method.
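Both procedures end with four sets of fragments, one per base, whose lengths mark where that base occurs. The sketch below imitates that readout under simplifying assumptions: fragment lengths are counted from the first position after the label or primer, every possible termination is observed exactly once, and the ten-base test sequence is used only as an example. The function names are hypothetical.

```python
def terminated_lengths(strand, base):
    """Lengths of the fragments that end at each occurrence of `base`
    (the outcome of one of the four base-specific experiments)."""
    return [position + 1 for position, b in enumerate(strand) if b == base]


def read_ladder(lanes):
    """Rebuild the sequence by reading all fragments from shortest to longest.

    `lanes` maps each base to the list of fragment lengths found for it.
    """
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            if length in base_at_length:
                raise ValueError("two bases claim the same fragment length")
            base_at_length[length] = base
    return "".join(base_at_length[n] for n in sorted(base_at_length))


sequence = "AGCTACGATC"   # example sequence for the simulation
lanes = {base: terminated_lengths(sequence, base) for base in "ACGT"}
print(lanes)
print(read_ladder(lanes))
```

Separating the fragments by length, as electrophoresis does, corresponds to the `sorted` call: once the lengths are in order, the lane in which each length appears names the base at that position.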