Experiment 15 Using the computer in biochemical research

I. Introduction and theory

The modern computer has revolutionized the way we live. Not surprisingly, the computer has also changed the way we do biochemical research. Your first encounter with a computer in this laboratory will probably be while using an instrument that has a computer to control its operation, to collect data, and to analyze data. All major pieces of scientific equipment, including UV-VIS spectrometers, high-performance liquid chromatographs, gas chromatographs, nuclear magnetic resonance spectrometers, and DNA sequencers, are now controlled by computers. But your use of the computer will not end in the lab. You will use a computer to prepare each laboratory report, including graphical analysis of experimental data. If the computer is connected to the Internet, you will greatly broaden its use to include (1) searching the biochemical literature for pertinent books and journal articles and (2) accessing biological databases that provide nucleic acid and protein sequences and protein structures.

Personal computing in biochemistry

It is now possible for most students to purchase a basic computer system at low cost. Some recommendations for specific hardware and software will be given here, but one must be aware that new products and important upgrades are continually being developed. For word processing (writing lab reports), basic software programs including Microsoft Word and WordPerfect are most widely used. Software specialized for scientific writing is available but probably not necessary at this level. For many experiments that you complete, you will need to present data in a spreadsheet or graph. Graphing programs and spreadsheets with graphing capability include Lotus, Excel, SigmaPlot, Quattro Pro, KaleidaGraph, and Cricket Graph. Some graphs that you prepare from experimental data will be nonlinear. The most common example is a Michaelis-Menten graph from enzyme kinetics studies (reaction rate vs. substrate concentration). Because programs differ in how they handle nonlinear data, it is probably best not to simply connect the data points with a line. Rather, use a curve-fitting routine to obtain the appropriate curve. Alternatively, one could analyze the data using a straight-line method such as the Lineweaver-Burk plot.
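The two approaches to a Michaelis-Menten graph mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are hypothetical and noise-free (generated from Vmax = 10 and Km = 2), the function names are invented for this example, and the curve-fitting routine is a simple golden-section search over Km that assumes the error has a single minimum inside the bracket supplied. It is not the method used by any particular graphing program.

```python
import math


def michaelis_menten(s, vmax, km):
    """Reaction rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(s_values, v_values):
    """Straight-line method: least-squares line through (1/[S], 1/v).

    The line is 1/v = (Km/Vmax)(1/[S]) + 1/Vmax, so the intercept
    gives Vmax and the slope gives Km.
    """
    x = [1.0 / s for s in s_values]
    y = [1.0 / v for v in v_values]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


def _best_vmax(s_values, v_values, km):
    """For a fixed Km the model is linear in Vmax, so solve it directly."""
    f = [s / (km + s) for s in s_values]
    return sum(fi * vi for fi, vi in zip(f, v_values)) / sum(fi * fi for fi in f)


def _sum_squared_error(s_values, v_values, km):
    vmax = _best_vmax(s_values, v_values, km)
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(s_values, v_values))


def curve_fit_michaelis_menten(s_values, v_values, km_low, km_high, steps=60):
    """Curve-fitting method: golden-section search for the Km that
    minimizes the squared error on the untransformed data.

    Assumption: the error has one minimum between km_low and km_high.
    """
    ratio = (math.sqrt(5) - 1) / 2
    a, b = km_low, km_high
    for _ in range(steps):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if _sum_squared_error(s_values, v_values, c) < _sum_squared_error(s_values, v_values, d):
            b = d
        else:
            a = c
    km = (a + b) / 2
    return _best_vmax(s_values, v_values, km), km


# Hypothetical, noise-free data generated from Vmax = 10 and Km = 2
substrate = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
rates = [michaelis_menten(s, 10.0, 2.0) for s in substrate]

print(lineweaver_burk_fit(substrate, rates))
print(curve_fit_michaelis_menten(substrate, rates, 0.01, 100.0))
```

With perfect data both methods recover the same constants. With real, noisy measurements the double-reciprocal transformation gives extra weight to the points at low substrate concentration, which is one reason a fit to the untransformed data is often preferred.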
The computer and the Internet

If you are using the computer as described above, you are saving time and preparing good-looking lab reports. However, if your computer is not connected to the Internet, then you are not tapping into the vast wealth of biochemical tools and information available. The Internet can be defined, in simple terms, as a worldwide matrix that allows all computers and networks to communicate with each other. If the computer you are using is college owned, then it is probably already linked to the Internet. For your own home computer, you may need to subscribe to an Internet service and obtain a modem to transmit computer signals through a telephone line. Once you are connected to the Internet, many programs are available as freeware, software provided without charge by its creator.

After you are connected to the Internet, what are the basic facilities available for use? First, you will be able to communicate by e-mail (electronic mail). Messages containing text, files, and graphics may be sent to anyone who has a computer with an Internet link and an e-mail address. Addresses have three basic components: the user name, an @ sign, and the user’s location or domain. Common domains that you will encounter usually have one of the following suffixes: edu (educational institution in the United States), ac (academic institution in the United Kingdom), gov (government), com (commercial organization), and org (other organization). You will need an e-mail program to collect, send, and organize messages. The most popular ones are Eudora and Pegasus. (Practice your e-mail skills by sending a message, perhaps a question, to your laboratory instructor.)

Connected to the Internet, you will also be able to join list server discussion groups, created to share ideas in a common area of interest, or news groups such as USENET. One of the most widely used facilities on the Internet is the ability to place and retrieve network data by file transfer protocol (ftp).
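The three-part structure of an e-mail address described above can be illustrated with a short Python sketch. The addresses are hypothetical, the function name is invented for this example, and the suffix table contains only the suffixes listed in the text. Treating "ac" as the label just before "uk" is an assumption about how United Kingdom academic addresses are usually written.

```python
# Suffix meanings taken from the list in the text
TOP_LEVEL_SUFFIXES = {
    "edu": "educational institution in the United States",
    "gov": "government",
    "com": "commercial organization",
    "org": "other organization",
}


def parse_address(address):
    """Split an e-mail address into user name and domain, then classify the domain."""
    user, at_sign, domain = address.partition("@")
    if not at_sign or not user or not domain or "@" in domain:
        raise ValueError("expected exactly one @ sign between user name and domain")

    labels = domain.lower().split(".")
    if labels[-2:] == ["ac", "uk"]:
        # Assumption: UK academic addresses end in ".ac.uk"
        kind = "academic institution in the United Kingdom"
    else:
        kind = TOP_LEVEL_SUFFIXES.get(labels[-1], "suffix not listed in the text")

    return {"user": user, "domain": domain, "kind": kind}


# Hypothetical addresses, for illustration only
for example in ("student@chem.example.edu", "lecturer@biochem.example.ac.uk"):
    print(parse_address(example))
```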
The World Wide Web

The newest and most rapidly growing component of the Internet is the World Wide Web (WWW, also called “the Web”). This facility, which was launched in 1992, permits the transfer of data as pages in multimedia form consisting of text, graphs, audio, and video. The pages are linked together by hypertext pointers so that data stored on computers in different locations may be retrieved via the network by your computer. Web documents are written in a special coded language called HyperText Markup Language (HTML). To access all of the
resources on the Web, you will need a browser, an interface program that reads hypertext and displays Web pages on your computer. The most commonly used Web browsers are Internet Explorer and Netscape Navigator. To access the Web, first start the browser. Displayed on the screen will be the home page, or starting point for entry into the Web. On this page will be a dialogue box into which you can type text. The dialogue box may ask for “Address”, “Netsite”, “Location”, or “URL” (Uniform Resource Locator). To request a specific Web page from another computer site, type in the Web page address, which is usually in the form http://www.-. The home page, with instructions on the use of the Web site, will then be displayed on the screen. One important feature you will note is that some words on the page are highlighted. If you click the mouse on one of these words (called hyperlinks), your computer will connect to another, related Web page that provides information on the hyperlink. This feature greatly enhances the use of the Web because related Web sites are linked together and may be quickly accessed by a click of the mouse.

Web addresses that are useful for biochemical research are presented in Tables E1.1 and E1.2. Many of the current Web sites you will need are listed here; however, what about new Web sites that have been established since publication of this book? Millions of new Web sites are created every year. To access these new sites, you need the help of a search engine, a searchable directory that organizes Web pages by subject classification. Major search engines include AltaVista, Excite, HotBot, Lycos, Netscape Search, and Yahoo! As you “surf the Web”, you may find sites you wish to save and review at a later date. You may use the “bookmark” (Netscape) or “favorite” (Explorer) function to save them for the future.
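To make the ideas of HTML, hyperlinks, and Web addresses more concrete, the following Python sketch reads a tiny hand-written HTML page, collects the hyperlinks in it, and splits each address into its parts. The page text, the class name, and the function name are invented for this illustration; the two addresses are taken from Table E1.2. Only modules from the Python standard library are used, and no network connection is made.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the target address of every hyperlink (<a href="...">) in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def describe_url(url):
    """Split a Web address into protocol, host computer, and path on that host."""
    parts = urlsplit(url)
    return {"scheme": parts.scheme, "host": parts.netloc, "path": parts.path}


# A hypothetical page; the highlighted words a reader would click are inside <a> tags
PAGE = """<html><body>
<p>Protein structures are stored in the
<a href="http://www.rcsb.org/pdb/">Protein Data Bank</a>, and DNA sequences at the
<a href="http://www.ebi.ac.uk/">European Bioinformatics Institute</a>.</p>
</body></html>"""

collector = LinkCollector()
collector.feed(PAGE)
for link in collector.links:
    print(describe_url(link))
```

A browser does essentially the same bookkeeping: it displays the text between the tags as a highlighted word and requests the stored address when the word is clicked.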
Application of the Web

It is not necessary to have a complete understanding of the Internet in order to tap into its vast resources. The fundamental concepts provided here will allow you to take advantage of two essential activities: (1) biochemical literature searching and (2) using Web directories and biological databases.

The biochemical literature

Experimental biochemists do not spend all their working time in the laboratory. An important component of a biochemistry research project is a search of the biochemical literature. The library should be considered a tool for experimental biochemistry in the same way as any scientific instrument.
Table E1.1 Web Database Directories
Name | URL
Pedro’s Biomolecular Research Tools | http://www.public.iastate.edu/~pedro/research-tools.html
Biology Workbench | http://biology.ncsa.uiuc.edu
CMS Molecular Biology Resources | http://www.sdsc.edu/ResTools/cmshp.html
BioTech | http://biotech.icmb.utexas.edu
Protocol Online | http://www.protocol-online.net
Chem Connection | http://chemconnect.com/news/journals.html
American Chemical Society | http://pubs.acs.org/

Table E1.2 Biochemical Databases and Tools
Name | Description | URL
Protein Data Bank (PDB) | Protein structures determined by X-ray and NMR | http://www.rcsb.org/pdb/
European Bioinformatics Institute (EBI) | DNA sequences | http://www.ebi.ac.uk/
National Center for Biotechnology Information (NCBI) | Variety of databases and resources | http://www.nlm.nih.gov/
Swiss-Protein | Protein sequences and analysis | http://www.expasy.ch/tools/
Biocatalysis/Biodegradation Databases of the University of Minnesota | Microbial metabolism of many chemicals | http://www.labmed.umn.edu/umbbd/index.html
REBASE - The Restriction Enzyme Database | Restriction enzyme directory and action | http://rebase.neb.com/
Georgia Institute of Technology | Tutorials on PDB and RasMol | http://www.chemistry.gatech.edu/faculty/Williams/bCourseinformation/4582/labs/rasmolpdb.html
The Institute for Genomic Research | Collection of genomic databases | http://www.tigr.org/
RasMol (RasMac) | Molecular graphics for proteins | http://www.umass.edu/microbio/rasmol/
Predict Protein | Protein sequence and structure prediction | http://www.embl-heidelberg.de/predictprotein/
Gene Quiz | Protein function analysis based on sequence | http://www.sander.ebi.ac.uk/gqsrv/submit

The use of the biochemical literature by the student in biochemistry laboratory is not as extensive as that of a full-time researcher, but you must be aware of what is available in the library and how to use it. The library is used in all stages of research. Before an investigator can begin experimentation, a research idea must be generated. This idea develops only after extensive reading and study of the literature. A research project usually begins in the form of a question to be answered or a problem to be solved. For ease of solution, a major project is subdivided into questions that may be answered by experimentation. Before laboratory work can begin, the researcher must have a knowledge of the past and current literature dealing with the research area. This can be reduced to two questions: What is the current state of knowledge in the area? And what are the significant unknowns? These questions can be answered only by developing a familiarity with the biochemical literature. The researcher will find that this knowledge of the literature is also invaluable for the design of experiments. The development of experiments requires knowledge of techniques and laboratory procedures. Excellent methods books and journals are available that provide experimental details. Finally, while performing experiments, the researcher often needs physical and chemical constants and miscellaneous information. Various handbooks and encyclopedias are excellent for this purpose.
The beginning student in biochemistry laboratory will not be expected to proceed through all of these stages in the design of an experiment. However, a familiarity with the literature will increase your understanding of the experiment and may aid in the development of more effective methods. When you do begin a research program, you will be able to use the library to the fullest advantage.

The biochemical literature is massive and expanding rapidly. It is almost a full-time job just to maintain a current awareness of a specialized research area. There are few disciplinary boundaries in the study of biochemistry. The biochemical literature overlaps into the biological sciences, the physical sciences, and the basic medical sciences. The intent of the following discussion is to bring some order to the many textbooks, reference books,
research journals, computer information retrieval services, and handbooks that are available.

Reference books and review publications

For more specialized and detailed biochemical information that is not offered by textbooks, reference books must be used. Reference works range from general surveys to specialized series. The best works are multivolume sets that continue publication of volumes on a periodic basis. Each volume usually covers a specialized area with articles written by recognized authorities in the field. It should be noted that reference articles of interest to biochemists are often found in publications that are not strictly biochemical. The best known and most widely used review publication is Annual Review of Biochemistry. Each volume in this series, which was introduced in 1932, contains several detailed and extensive articles written by experts in the field. For shorter reviews emphasizing current topics, Trends in the Biochemical Sciences (TIBS) is widely read.

Research journals

The core of the biochemical literature consists of research journals. It is essential for a practicing biochemist to maintain a knowledge of biochemical advances in his or her field of research and related areas. Scores of research journals are published with the intent of keeping scientists up to date. With the expansion of scientific information has come the need for efficient storage and use of research journals. Many publishers are now providing journals in forms such as microcards, microfilm, microfiche, and, more recently, CD-ROM disks and on-line editions. Some research journals have achieved an especially excellent reputation, and articles therein are considered to be of the highest quality.
A recent ranking of the biochemical journals, based on the number of citations received, produced the following order for the top six: Journal of Biological Chemistry, Biochimica et Biophysica Acta, Biochemistry, Proceedings of the National Academy of Sciences of the United States of America, Biochemical Journal, and Biochemical and Biophysical Research Communications. The core journals used by an individual depend on the area of specialty and are best determined from experience.

Methodology references

The active researcher has a continuing need for new methods and techniques. Several publications specialize in providing details of research methods, and many research methods are now available on the Web. Some of the useful biochemical methodology publications are:
Analytical Biochemistry, a monthly journal.
Analytical Chemistry, a monthly journal.
Biochemical Preparations, an annual volume.
Current Protocols in Molecular Biology, F. M. Ausubel et al., Editors. A manual of techniques in two volumes that are updated quarterly.
Laboratory Techniques in Biochemistry and Molecular Biology, T. S. Work and R. G. Burdon, Editors (formerly T. S. Work and E. Work). Each volume in the series is concentrated in an area of biochemistry and written by recognized authorities.
Methods of Enzymatic Analysis, H. Bergmeyer, Editor. Contains methods for enzyme purification and assay, in several volumes.
Methods in Enzymology, various editors. The most valuable methods series available. Each volume contains numerous articles describing biochemical techniques. The series is well indexed and easy to use. Over 200 volumes.
A Practical Guide to Molecular Cloning, 2nd ed., B. Perbal. Useful for setting up research projects in molecular cloning.

Computer-based searches and other aids to the literature

As you study and work in biochemistry, you will often need to complete a thorough literature search on some specialized area or topic. It is not practical to survey the hundreds of books, journals, and reports that may contain information related to the topic. Two publications that provide brief summaries of published articles, reviews, and patents are Chemical Abstracts and Biological Abstracts. Research articles of interest to biochemists may appear in many types of research journals. Research libraries do not have the funds necessary to subscribe to every journal, nor do scientists have the time to survey every current journal copy for articles of interest. Two publications that help scientists to keep up with published articles are Chemical Titles (published every 2 weeks by the American Chemical Society) and the weekly Current Contents, available in hard copy and on computer disks (published by the Institute for Scientific Information).
The Life Sciences edition of Current Contents is the most useful for biochemists. The computer revolution has reached into the chemical and biochemical literature, and most college and university libraries now subscribe to computer bibliographic search services. One such service is STN International, the scientific and technical information network. This on-line system allows direct access to some of the world’s largest scientific databases. The STN databases of most value to life scientists include BIOSIS Previews/RN (produced by BioSciences Information Service; covers original research reports, reviews, and U.S. patents in biology and biomedicine), CA (produced by Chemical Abstracts Service; covers research reports in all areas of chemistry), MEDLINE, and MEDLARS (produced by the U.S.
National Library of Medicine and Index Medicus, respectively; cover all areas of biomedicine). These networks provide on-line service, and their databases can be accessed from personal computers in the office, laboratory, or library. Some of the computer bibliography services are free on the Internet, but others have user fees. For example, MEDLINE (PubMed), produced by the National Library of Medicine and available at http://www.ncbi.nlm.nih.gov/, may be used free of charge.

Web directories, tools, and databases
Biochemical research generates huge amounts of data of interest to all scientists. For example, thousands of genes and proteins have been sequenced during the past several years and thousands more will be sequenced in the future. This number is being greatly expanded by the Human Genome Project, which has as its goal the sequencing of the entire human genome. In addition, determining the structures of proteins by X-ray diffraction and by NMR has become routine. Sequence and structural data are now being stored in computer networks for retrieval by biochemists throughout the world. Here, we will discuss the many biological databases and provide examples of their use. Our approach will be to focus on databases that are readily available, free of charge, on the Web. However, it is important to recognize that many commercial hardware and software systems for analyzing biological databases are also available, although they are often very expensive and complicated to use.
A wide variety of databases are currently available, including bibliographic, nucleic acid sequence, protein sequence and structure, metabolic pathway, transcription factor, enzyme, and many other databases. One of the best ways to find the resources suited to your needs is to use a directory that collects lists of information, tools, and other services. Several very good ones are available (Table E1.1). Some of these sites are hyperlinked to the database sites.
This experiment will introduce you to some of the more general and useful sites. Specifically, they will cover protein primary, secondary, and tertiary structure, sequence homology, sequence alignment, and structure prediction. The Web addresses for these resources are listed in Table E1.2. Because of the huge amount of data available, it is often necessary to use programs to help you analyze the data. Table E1.3 lists several software programs that are available and usually hyperlinked to the database sites. Those that we will introduce in this experiment are FASTA (protein amino acid sequences), BLAST (comparing protein sequence data), RasMol or RasMac (coordinates for protein structure manipulation), Chime (protein structure coordinates), SWISS-MODEL (protein modeling), VAST (protein structure similarities), and Molecules R Us (protein structure coordinates).
Overview of the experiment
In this experiment, students will be introduced to several uses of the computer and the Internet. Students are instructed in the use of bibliographic searches, sequence databases, and structural analytical tools available, free of charge, on the Web.

Table E1.3 Useful programs for exploring structures/sequences
Program            Function
BLAST              Searches for similar protein and nucleic acid sequences
Chime              Protein structures on moving 3D coordinates
Entrez (NCBI)      Sequence retrieval system for cross-referencing databases
FASTA              Searches for similar protein sequences
GenBank (NCBI)     Databases of gene sequences
Molecules R Us     Provides coordinates for protein 3D structure and manipulation
RasMol (RasMac)    Provides coordinates for protein 3D structure and manipulation
SRS (EMBL)         Sequence retrieval system for cross-referencing databases

Ⅱ. Materials and supplies
Computer: Apple Macintosh or PC with printer; connected to the Internet.
Software: Web browser such as Netscape Navigator or Internet Explorer; e-mail program such as Eudora.

Ⅲ. Experimental procedure
1. Searching the biochemical literature on MEDLINE
To illustrate the use of this search service, point your Web browser to the appropriate URL (http://www.nlm.nih.gov/). This will connect you to the National Center for Biotechnology Information. Click the mouse on the hyperlink “PubMed”. Select MEDLINE
in the upper dialogue box. Many features are available on the display, but the most basic is the search capability. For bibliographic searching you may enter a search term, author name, or journal name in the dialogue box under MEDLINE. For example, you may want to type in “bovine alpha-lactalbumin”. Clicking on “Search” will then provide over 500 citations (or articles). Each entry in the list consists of author(s), title, and reference, and the entries appear in reverse chronological order. By clicking on the author’s name (in hypertext), you can retrieve the abstract of the article. Another useful and time-saving feature is the hypertext “(see Related Articles)”. Clicking on this will provide a list of papers related to the specific citation. The 500 papers or so that you obtained in your original search are too many to screen; you may change the search parameters to reduce the number. Clicking on the “?” in the upper right-hand corner of the screen provides help for focusing the search process.

2. Using Web tools and biological databases
Point your Web browser to the Protein Data Bank (PDB) at the Research Collaboratory for Structural Bioinformatics (http://www.rcsb.org/pdb/). Become acquainted with the PDB by viewing the home page and perhaps clicking on some hyperlinks. Scroll until you find the term “Searchlite” under Search on the right side of the screen. Clicking on Searchlite will display a dialogue box for keywords. Type in “human alpha-lactalbumin” and click on Search. Your query will find at least seven structures, which are listed. Click on the white square to the left and “EXPLORE” to the right of Structure 1A4V. This will display “Structure Explorer” with “Summary Information” about the structure of the protein. Clicking on the “?” will provide help if necessary. Review the functions available on the left side of the screen. Click on “View Structure” to observe “Interactive 3D Display” and “Still Images”.
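As an aside for readers who would rather script these lookups than click through them, the MEDLINE search from step 1 and the PDB entry 1A4V from step 2 can each be written as a URL. The sketch below only builds the URLs and does not contact any server. The two endpoints (the NCBI E-utilities `esearch` service and the RCSB file download path) are assumptions that are not given in this manual, and the helper names `pubmed_search_url` and `pdb_file_url` are hypothetical. Narrowing a search to one field, as shown with `Title`, is one way to reduce the 500 or so citations returned by the unrestricted query.

```python
from urllib.parse import urlencode

# Assumed endpoints (not taken from this manual): NCBI E-utilities and RCSB file server.
ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
RCSB_DOWNLOAD_BASE = "https://files.rcsb.org/download"


def pubmed_search_url(term, field=None, retmax=20):
    """Build a PubMed search URL; `field` optionally restricts the term (e.g. "Title")."""
    query = f"{term}[{field}]" if field else term
    params = {"db": "pubmed", "term": query, "retmax": retmax}
    return f"{ESEARCH_BASE}?{urlencode(params)}"


def pdb_file_url(pdb_id):
    """Build the download URL for the coordinate file of a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"{RCSB_DOWNLOAD_BASE}/{pdb_id}.pdb"


if __name__ == "__main__":
    # Unrestricted search, as typed into the dialogue box in step 1.
    print(pubmed_search_url("bovine alpha-lactalbumin"))
    # Same term restricted to article titles, returning at most 5 hits.
    print(pubmed_search_url("bovine alpha-lactalbumin", field="Title", retmax=5))
    # Coordinate file for the human alpha-lactalbumin structure used in step 2.
    print(pdb_file_url("1a4v"))
```

Pasting any of the printed URLs into a browser should return the raw search result or coordinate file, provided the assumed services are still reachable at those addresses.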
First, study the still images of human alpha-lactalbumin in ribbon or cylinder form. You may click on 250x250 or 500x500 to enlarge. Note the presence of α-helices and β-sheets in the structure. After studying the still images, click on “Chime” under Interactive 3D Display. Now, you will observe the ribbon structure rotating on an axis. Use “Chime Help” at the bottom of the screen to learn the mouse controls for the rotating structure. Now return to the Summary Information list to try other functions. Click on “Sequence Details” to observe the amino acid sequence and definition of secondary structures. You may do an ftp download of this file by clicking on “Download in FASTA format”. FASTA format is a listing of amino acid sequences using the standard single-letter abbreviation for each amino acid. Clicking on “Geometry” will display tables of bond angles and lengths. Similar sequence studies may be done by clicking on the function “Structural Neighbors”. Several tools are available to search for similar structures. Try the VAST tool. Clicking on “VAST” will provide two options, Sequence Neighbors and Structure Neighbors. Clicking on “Sequence Neighbors: single chain” will display a list of many proteins with sequences similar to that of human alpha-lactalbumin. Note that most are alpha-lactalbumins from other species, but if you