The Protein Interactome
A critical framework underlying systems biology
1. Overview - the many levels of systems biology
2. Experimental methods for measuring protein-protein interactions, and their limitations
3. Data sources for information about proteins and their interactions
4. Computational methods for assessing and predicting protein-protein interactions
7.91 Amy Keating
Spectrum of Systems Biology
detailed models - describe rates, concentrations, structure
low-resolution models - describe information flow, logic, mechanism
circuitry logic/control - positive and negative regulation
connectivity/topology - who talks to whom? interaction scaffold
parts list - protein and DNA sequences (& structures)
Recommended reading: Ideker & Lauffenburger, TRENDS in Biotechnology (2003) 21, 255-262
Spectrum of Systems Biology
detailed models - differential equations; allow simulation & comparison with data
low-resolution models - Boolean & Markov models
circuitry logic/control - Bayesian networks
connectivity/topology - graph theory
parts list - databases
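The connectivity/topology level treats interactions as a graph: proteins are nodes and each reported interaction is an undirected edge. The following is a minimal sketch of that representation, not any particular database's format. The protein names are hypothetical placeholders and the helper names (`build_graph`, `degree`, `connected_component`) are illustrative assumptions.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected interaction graph as a dict of neighbor sets."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degree(graph, protein):
    """Number of distinct interaction partners of a protein."""
    return len(graph.get(protein, ()))


def connected_component(graph, start):
    """All proteins reachable from `start` through chains of interactions."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


# Hypothetical interaction list: placeholder names, not real data.
example_pairs = [("A", "B"), ("B", "C"), ("D", "E")]
example_graph = build_graph(example_pairs)

if __name__ == "__main__":
    print(degree(example_graph, "B"))
    print(sorted(connected_component(example_graph, "A")))
```

Storing neighbors in sets makes duplicate reports of the same pair collapse into one edge, which is convenient when the same interaction is observed in more than one screen.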
Spectrum of Systems Biology
detailed models - rates of individual reactions, protein concentrations in the cell, extent of phosphorylation, diffusion rates
low-resolution models - which elements are most crucial? combinatorial dependencies.
circuitry logic/control - Expression profiling, post-translational modifications in response to different stimuli. Identify pathways and clusters; does an interaction activate or repress; are multiple components required?
connectivity/topology - protein-protein, protein-DNA and protein-small molecule interactions
parts list - genome sequencing projects, gene finding algorithms, EST libraries
Spectrum of Systems Biology
detailed models - YAFFE
low-resolution models - not covering this topic much this year
circuitry logic/control - BURGE
connectivity/topology - KEATING (today)
parts list - BURGE
Protein-protein and protein-DNA interactions at the genomic level
Saccharomyces cerevisiae as a model organism.
A very simple eukaryote - “yeast as a model for human”
Genome 12,053 kb, sequenced in 1996.
~5800 protein-coding genes.
Easy to do genetics in yeast.
Many regulatory and metabolic pathways are at least partly conserved between yeast and higher eukaryotes.
Many human disease genes have yeast orthologs.
Saccharomyces cerevisiae image from SGD™, provided by Peter Hollenhorst and Catherine Fox. Used with permission.
Small-scale interaction experiments
Protein-protein interactions
  pull-down (GST, Ni affinity, co-immunoprecipitation)
  cross-linking
  more biophysical & quantitative: fluorescence, CD, calorimetry, surface plasmon resonance
Protein-DNA interactions
  mostly by gel shift assay
Many, many thousands of such experiments have been done and reported in the literature, but how do you get the information out? This is hard, and an important problem in modern biology.
PreBIND is a machine learning application that can automatically extract from the literature information about whether two proteins interact.
http://www.blueprint.org/products/prebind/prebind.html
Small-scale experiments are generally the most reliable, though still rife with false negatives and false positives.
Yeast 2-hybrid assay
[Figure: vector with activation domain--ORF fusion; vector with DNA-binding domain--B fusion; mate yeast; plate on -His media]
Images: http://depts.washington.edu/sfields/yp_interactions/YPLM.html Courtesy of Stanley Fields. Used with permission
Yeast 2-hybrid assay
Pros
  easy/fast
  no purification required
  in vivo conditions
  can be adapted for high-throughput screens
  can detect transient interactions
Cons
  prone to false negatives
    protein doesn’t fold
    protein doesn’t localize to nucleus
    interference from endogenous protein
    fusion protein doesn’t interact like native protein
    fusion may be toxic to cell
  prone to false positives
    auto-activation
    indirect interactions
  not quantitative
  no control over post-translational modification
  only tests binary interactions
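The failure modes listed above can be pictured with a toy Boolean model of the reporter readout. This is an illustrative sketch only, not a description of any real screening software: the `Fusion` class, its fields, and `reporter_on` are hypothetical names, and the model covers just three of the listed causes (misfolding, failure to reach the nucleus, and auto-activation by the DNA-binding domain fusion).

```python
from dataclasses import dataclass


@dataclass
class Fusion:
    """Toy description of a fusion protein in a two-hybrid test (hypothetical)."""
    name: str
    folds: bool = True
    reaches_nucleus: bool = True
    auto_activates: bool = False  # only meaningful for the DNA-binding domain fusion


def is_functional(fusion):
    """A fusion can take part in the assay only if it folds and reaches the nucleus."""
    return fusion.folds and fusion.reaches_nucleus


def reporter_on(dbd_fusion, ad_fusion, truly_interact):
    """Return True if the reporter would be scored as ON in this toy model."""
    # False positive route: the DBD fusion switches the reporter on by itself.
    if dbd_fusion.auto_activates and is_functional(dbd_fusion):
        return True
    # Otherwise the reporter needs a real interaction between two functional fusions.
    return truly_interact and is_functional(dbd_fusion) and is_functional(ad_fusion)


if __name__ == "__main__":
    x = Fusion("X")
    y = Fusion("Y")
    print(reporter_on(x, y, truly_interact=True))
    print(reporter_on(Fusion("X", auto_activates=True), y, truly_interact=False))
    print(reporter_on(Fusion("X", reaches_nucleus=False), y, truly_interact=True))
```

In this model a real interaction is reported only when both fusions are functional, so a misfolded or mislocalized fusion yields a false negative, while an auto-activating DNA-binding domain fusion yields a positive regardless of its partner.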
Yeast 2-hybrid assay for an entire genome
Uetz et al. Nature (2000) 403, 623-627
Two strategies:
1. “array” approach: ~6,000 activation domain hybrid transformants mated to 192 DNA binding domain fusion transformants
   only 20% of interactions (281) reproducible (many auto-activate)
   3.3 positives per interaction-competent protein
2. “high-throughput screen” approach: 5,345 ORFs cloned separately into DNA-binding and activation domain plasmids (2 reporter genes); DBD fusions pooled and mated to AD fusions; 12 clones per pool sequenced
   gave 692 unique interactions (472 seen more than once)
   1.8 positives per interaction-competent protein
Ito et al. PNAS (2001) 98, 4569-4574
For both DBD and AD, make 62 pools of ~96 proteins. Mate all pools against all.
Gave 4,549 interactions; 841 observed ≥ 3 times (= core data).
The potential number of interactions is huge, and the number of real interactions is probably very large (>10,000); these studies only characterize a tiny fraction (low coverage)
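The scale of the search space can be checked with a little arithmetic. The sketch below assumes a round figure of 6,000 proteins (the approximate number quoted for the array screen) and counts unordered pairs, excluding self-pairs. The variable and function names are illustrative. The fraction it computes is the share of all possible pairs that were reported, which is not the same as coverage of the real interactions and ignores false positives.

```python
from math import comb

# Assumption: ~6,000 yeast proteins, the round figure quoted for the array screen.
N_PROTEINS = 6000

# Number of unordered protein pairs (self-pairs excluded).
possible_pairs = comb(N_PROTEINS, 2)

# Matings implied by the screen designs described above.
array_matings = 6000 * 192  # Uetz et al. array: AD transformants x DBD transformants
pool_matings = 62 * 62      # Ito et al.: all DBD pools against all AD pools

# Interaction counts as reported on the slide.
reported = {
    "Uetz et al., array (reproducible)": 281,
    "Uetz et al., high-throughput screen": 692,
    "Ito et al., all": 4549,
    "Ito et al., core": 841,
}

# Share of the Ito et al. interactions that belong to the core set.
core_fraction = reported["Ito et al., core"] / reported["Ito et al., all"]


def fraction_of_pair_space(n_interactions):
    """Reported interactions as a fraction of all possible protein pairs."""
    return n_interactions / possible_pairs


if __name__ == "__main__":
    print(f"possible pairs: {possible_pairs:,}")
    print(f"array matings: {array_matings:,}; pool matings: {pool_matings:,}")
    print(f"core fraction of Ito et al. data: {core_fraction:.1%}")
    for label, count in reported.items():
        print(f"{label}: {fraction_of_pair_space(count):.5%} of pair space")
```

With these assumptions there are about 18 million possible pairs, so even the largest reported set touches only a few hundredths of a percent of the pair space.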