Lecture 21
Eukaryotic Genes and Genomes III
Cis-acting sequences

In the last lecture we considered a classic case of how genetic analysis could be used to dissect a regulatory mechanism. This analysis was contingent upon having "clean" phenotypes associated with the isolated mutants; e.g., mutations in the Gal80 gene produce a phenotype of constitutive Gal1 expression. However, it is sometimes very difficult to identify regulatory proteins by isolating mutants, because regulators that influence the expression of a wide variety of genes might be essential (i.e., mutations in these could be lethal), or their mutant phenotypes may be extremely complex and difficult to interpret. One solution to this has been to work backwards from the cis-acting promoter sequences of particular genes to identify the proteins that bind to them.

Let's take the Gal1 gene as an example. We have seen that in the presence of galactose the Gal1 gene is transcriptionally upregulated (along with other Gal genes). What I haven't told you is that if glucose is present in addition to galactose, the induction of the Gal genes simply does not occur! This is known as glucose repression. It makes physiological sense because glucose is a more efficient energy source for yeast, and is therefore the preferred carbon source over galactose. Why bother metabolizing galactose as long as glucose is present? In fact, glucose represses a very large number of genes whose products metabolize a wide range of carbon sources (sucrose, maltose, galactose, etc.) that are less energy efficient than glucose, as well as repressing a whole host of other genes.

[Figure: Induction of Gal genes in yeast, and glucose repression when both galactose and glucose are present]

It seems reasonable to expect that there is a transcriptional repressor that responds to glucose levels; this repressor would be ineffective when glucose is low or absent, and effective when glucose is present. It also seems reasonable that one could isolate trans-acting mutants that fail to repress galactose-induced Gal gene expression in the presence of glucose. However, it turns out that the very fact that glucose represses such a large number of different genes made it difficult to identify such mutants.
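The qualitative rule described so far (galactose induces Gal1, but not when glucose is also present) can be written as a small truth table. This is only an illustrative sketch: the function name `gal1_induced` is made up for this example, and real expression levels are graded rather than simply on or off.

```python
from itertools import product


def gal1_induced(galactose: bool, glucose: bool) -> bool:
    """Qualitative rule from the text: galactose induces Gal1 unless glucose is present."""
    return galactose and not glucose


# Print the truth table for all four combinations of the two sugars.
for galactose, glucose in product([False, True], repeat=2):
    state = "ON" if gal1_induced(galactose, glucose) else "off"
    print(f"galactose={galactose!s:5} glucose={glucose!s:5} -> Gal1 {state}")
```

Only the galactose-without-glucose row comes out ON, which is the pattern a glucose-responsive repressor would have to produce.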
Instead of looking for mutants that fail to execute glucose repression at the Gal1 gene, studies of the Gal1 promoter region itself provided the key to dissecting the mechanism of glucose repression. Specifically, the Gal1 promoter region was fused to the E. coli LacZ gene, on a plasmid that can replicate autonomously in S. cerevisiae. It was first important to establish that regulation of LacZ (β-galactosidase) from the plasmid mirrored the regulation of Gal1 (galactokinase) from its chromosomal locus; i.e., that β-galactosidase was induced by galactose in the absence of glucose, but not in its presence. Having established that, it was possible to go on and interrogate subdomains of the Gal1 promoter region for their role in induction of Gal1 by galactose, as well as repression of Gal1 by glucose.

The minimal length of DNA needed for proper regulation, stretching upstream into the promoter region from the Gal1 transcription start site (designated as adjacent to -1), was 400 bp. Once this functional promoter region was delineated, systematic deletions of 50 bp or so could be made all across the 400 bp region; this is easy to do with some recombinant DNA tricks that are not important to know about here. Suffice to say that this "deletion analysis" revealed two regions critical for transcriptional control, as well as the location of the TATA sequence that is required for loading of the basal transcription machinery.

[Figure: Gal1 promoter region and Gal1 transcription start site, with deletion constructs 1 to 8. 400 base pairs upstream of the Gal1 transcription start site is enough to confer proper Gal1-like regulation upon LacZ.]
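The reasoning behind a deletion analysis can be sketched in a few lines of Python. The sketch assumes, as the results in this lecture go on to show, that one region is an activating site (UAS), one a repressing site (URS), and that the TATA sequence is needed for any expression. Only the UAS coordinates (-385 to -310) come from the lecture; the URS and TATA coordinates and all of the example deletions are hypothetical placeholders chosen for illustration, and the names `Element`, `intact_elements` and `reporter_on` are invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    start: int  # upstream (more negative) coordinate
    end: int    # downstream coordinate


# UAS coordinates are from the lecture; URS and TATA positions are HYPOTHETICAL.
ELEMENTS = [
    Element("UAS", -385, -310),
    Element("URS", -250, -200),   # hypothetical position
    Element("TATA", -60, -20),    # hypothetical position
]


def intact_elements(del_start: int, del_end: int) -> set:
    """Names of elements that do not overlap the deletion [del_start, del_end]."""
    return {e.name for e in ELEMENTS if e.end < del_start or e.start > del_end}


def reporter_on(del_start: int, del_end: int, galactose: bool, glucose: bool) -> bool:
    """Qualitative LacZ prediction for one deletion construct under one condition."""
    intact = intact_elements(del_start, del_end)
    if "TATA" not in intact:
        return False              # no basal machinery, no expression at all
    if not (galactose and "UAS" in intact):
        return False              # no activation without galactose and an intact UAS
    if glucose and "URS" in intact:
        return False              # glucose repression acts through the URS
    return True


# Hypothetical deletions, each removing about 50 bp.
for start, end in [(-400, -350), (-300, -260), (-250, -200), (-60, -20)]:
    row = [reporter_on(start, end, gal, glu)
           for gal, glu in [(False, False), (True, False), (True, True)]]
    print(f"deletion {start}..{end}: -gal={row[0]} +gal={row[1]} +gal+glu={row[2]}")
```

Reading the printed rows in reverse is exactly the inference made from real data: a construct that is never induced has lost an activating site or the TATA sequence, while one that is induced but no longer repressed by glucose has lost a repressing site.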
The expression of β-galactosidase from each of these promoter deletion constructs under minus-galactose, plus-galactose, and plus galactose & glucose conditions is shown. From these data we can deduce the location of cis-acting regulatory sequences for the Gal1 gene.

• Deletions 7 and 8 do not express the reporter under any conditions because the deletions have removed some of the TATA sequence that is required for assembly of the basal transcription machinery.

• Deletions 1 and 2 eliminate the ability of galactose to increase expression from the Gal1 promoter, and since expression is not induced there is nothing for glucose to repress. It turns out that the 75 bp sequence between -310 and -385 is the DNA binding site for Gal4. This kind of region is generally called a UAS (upstream activation sequence), and in this case UASGAL. We will come back to thinking about Gal4 binding to the UAS recognition sequence later.

• Deletions 3, 5 and 6 have no effect on the ability of galactose to induce expression because the UAS remains intact. Note that shortening the distance between the UAS and the TATA region is not detrimental to induction. Indeed, increasing the distance by inserting extra DNA between the UAS and the TATA sequence also has little effect on inducibility. This has led to the idea that UAS sequences can work at long distances (1,000 to 10,000 bp) away from the TATA sequence and the transcription start sites. (In mammalian cells, regions containing binding sites for transcriptional activators are called enhancers; we will come to these in a later lecture.)

• Deletion 4 turns out to reveal information about glucose repression. For this construct, while galactose induces expression, glucose is unable to repress that expression. The deleted region defines the position of a sequence element needed for glucose repression. A sequence element that behaves this way (i.e., is required for repression) is generally called a URS (upstream repressor sequence), and in this case URSGAL.

[Figure: Regulation of Mig1 by the Snf1 kinase. No/low glucose: Snf1 kinase active, phosphorylates Mig1, prevents nuclear localization. High glucose: Snf1 kinase inactive, Mig1 goes to the nucleus and binds in a complex to the URS.]

After determining that there was a URS element controlling glucose repression at the Gal1 gene promoter, it was possible to go on to find the Mig1 protein that binds the URSGAL sequence (which turns out to lie in the promoter regions of many genes besides Gal genes). The Snf1 complex is a kinase that under low glucose conditions actively phosphorylates the Mig1 repressor, preventing it from entering the
nucleus. This situation (low glucose) is permissive for galactose induction of Gal1 gene expression via the UAS. In high glucose the Snf1 kinase is inactivated, so Mig1 is not phosphorylated, and the unphosphorylated Mig1 enters the nucleus to bind its URS sequence, where it recruits two other proteins that together achieve repression of Gal1 expression.

Modular properties of Transcription Activators

The Gal4 transcriptional activator turns out to be one of the best-studied proteins that carry out this kind of function. Once again, a LacZ reporter was used in an imaginative way to establish that the Gal4 protein has two functional domains that are separated by a flexible region in the protein. This time, the Gal1 promoter region remains intact upstream of the LacZ reporter, but deletions are made across the Gal4 protein; this is the inverse of keeping Gal4 intact and making deletions along the promoter, as described above.

Essentially, if the N-terminal domain of the Gal4 protein is deleted, the protein cannot bind to the UASGAL sequence, and so is unable to activate transcription of the reporter gene. But, in addition to DNA binding, Gal4 must have a region near the C-terminal end that is responsible for recruiting and activating the RNA polymerase, thus allowing expression of the reporter gene. The most remarkable thing of all was that a large region in the center of Gal4 can be deleted; as long as the DNA binding domain is present at the N-terminus, and the activating domain is present at the C-terminus, Gal4 can activate transcription from the UASGAL sequence.
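The outcome of the Gal4 protein deletion analysis reduces to a simple rule: the reporter is activated only if both the DNA binding domain and the activation domain are present, whatever happens to the middle of the protein. The following minimal sketch encodes that rule. The class `Gal4Variant`, the function `activates_reporter` and the variant names are hypothetical labels for illustration, not the actual constructs from the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gal4Variant:
    name: str
    has_db: bool              # N-terminal DNA binding domain present
    has_ad: bool              # C-terminal activation domain present
    has_middle: bool = True   # central region; dispensable according to the text


def activates_reporter(v: Gal4Variant) -> bool:
    """LacZ is activated only when Gal4 can both bind UASGAL and recruit the polymerase."""
    return v.has_db and v.has_ad


# Hypothetical variants covering the cases discussed in the text.
variants = [
    Gal4Variant("full length", has_db=True, has_ad=True),
    Gal4Variant("N-terminal deletion", has_db=False, has_ad=True),
    Gal4Variant("C-terminal deletion", has_db=True, has_ad=False),
    Gal4Variant("central deletion", has_db=True, has_ad=True, has_middle=False),
]

for v in variants:
    print(f"{v.name:20} -> LacZ {'ON' if activates_reporter(v) else 'off'}")
```

Note that `has_middle` never enters the rule, mirroring the observation that a large central region of Gal4 can be removed without loss of activation.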
[Figure: Gal4 protein deletion analysis. A LacZ reporter construct (UASGAL, TATA, lacZ) is tested with a series of Gal4 deletions, and LacZ activity is scored for each. Labeled regions: N-terminal DNA binding domain (DB) and C-terminal activation domain (AD).]

[Figure: Gal4 missense mutations tend to map to the DB or the AD regions. Gal4-: recessive, uninducible. Gal481: dominant, constitutive. Gal80 is shown together with the DB and AD regions.]

This remarkable separation of function between these two domains of Gal4 was dramatically demonstrated by a series of experiments called domain swapping. Essentially, using recombinant DNA techniques, the Gal4
transcription activation domain (AD) was fused to the DNA binding (DB) domain of an E. coli protein called LexA; LexA is a repressor that binds to a known DNA sequence, the LexA operator (LexA OP). Also, the Gal4 DB domain was fused to the transcription activation domain (AD) of a viral protein known to be a strong activator, VP16. These chimeric proteins were introduced into yeast cells with the appropriate LacZ reporter gene constructs, and the results of these domain swapping experiments were dramatic.

[Figure: Two LacZ reporter constructs and two chimeric proteins (LexA DNA-binding domain fused to the Gal4 activation domain; Gal4 DNA-binding domain fused to the VP16 activation domain)]

Two derivatives of a Gal4- yeast strain were created, one containing the LacZ reporter construct downstream of the Gal1 UAS, and the other containing the LacZ reporter construct downstream of the LexA OP. The two different chimeric proteins were expressed in each strain and their ability to induce LacZ activity was monitored. In addition, the following constructs were also introduced into the two strains: the wild type Gal4 protein, and a third chimeric protein with the activation domain of the Gal481 mutant protein fused to the LexA DB domain. The results from these experiments clearly show that the AD and the DB domains function independently of one another.

This series of experiments, while interesting and certainly revealing about how the Gal genes are regulated, has turned out to have a profound effect on all of biological research because it contributed to the development of a widely used technology called the yeast two-hybrid assay. This assay makes it possible to determine whether two proteins interact with each other as a complex with a long-lived interaction, and sometimes even when two proteins only interact transiently.

To determine whether protein X interacts with either protein Y or protein Z one can do the following: fuse protein X to the Gal4 DB; this chimeric protein is known as the bait, and it will attach to the UASGAL that lies upstream of a
reporter gene, usually a selectable marker or LacZ, or both. This bait lies in wait for an interaction with another protein. The Gal4 AD is fused to either protein Y or protein Z. Should either one of these proteins be able to interact with protein X, then the Gal4 AD region will become tethered to the UASGAL region and will recruit and activate the RNA polymerase.

Note that proteins X, Y and Z do not have to be yeast proteins; the only requirement is that the DNA coding sequence for the protein is available (which is now true for all of the genes from a wide variety of organisms); these sequences are then cloned such that they produce the appropriate Gal4 chimeric proteins.

In the previous two or three lectures we have looked at one particular regulatory network in S. cerevisiae, and have employed a wide range of tools to understand this network. In the next lecture I will be telling you how these and other tools have evolved into technologies that allow us to look globally at gene regulation in eukaryotic cells.

[Figure: Yeast two-hybrid assay for protein-protein interactions; Gal4 chimeric proteins can interrogate protein-protein interactions. (A) No interaction: Gal4-BD fused to X sits at the GAL4-binding site upstream of the reporter gene, but Y fused to Gal4-AD is not recruited. (B) Positive interaction: Z fused to Gal4-AD binds X, giving increased transcription of the reporter gene. Figure by MIT OCW.]
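The logic of the two-hybrid assay can be sketched as a small function: the reporter is expressed only when a DB-fused bait and an AD-fused prey interact, because only then is the activation domain tethered at the UASGAL. In this sketch the names `Fusion`, `INTERACTIONS` and `reporter_expressed` are invented for illustration, and the interaction set (X binds Z but not Y) simply follows the two panels of the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    partner: str   # the protein of interest, e.g. "X"
    domain: str    # "DB" (Gal4 DNA binding domain) or "AD" (Gal4 activation domain)


# Assumed interaction set for illustration: X binds Z, X does not bind Y.
INTERACTIONS = {frozenset({"X", "Z"})}


def reporter_expressed(bait: Fusion, prey: Fusion, interactions=INTERACTIONS) -> bool:
    """True when the prey's AD is tethered to the UAS through a bait-prey interaction."""
    if bait.domain != "DB" or prey.domain != "AD":
        return False  # one fusion must bind the UAS and the other must carry the AD
    return frozenset({bait.partner, prey.partner}) in interactions


bait = Fusion("X", "DB")
for prey_name in ["Y", "Z"]:
    on = reporter_expressed(bait, Fusion(prey_name, "AD"))
    print(f"bait X-DB + prey {prey_name}-AD -> reporter {'ON' if on else 'off'}")
```

In a real experiment the interaction set is the unknown: the observed reporter activity is what tells you whether a given pair belongs in it.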