www.cab.zju.edu.cn/cab/xueyuanxiashubumen/nx/bioinplant.htm 《生物信息学札记》 (Notes on Bioinformatics), 樊龙江 (Fan Longjiang)

A numerical measure, falling between -1 and 1, of the degree of the linear relationship between two variables. A positive value indicates a direct relationship, a negative value indicates an inverse relationship, and the distance of the value from zero indicates the strength of the relationship. A value near zero indicates no linear relationship between the variables.

Covariation (in sequences) (共变)
Coincident change at two or more sequence positions in related sequences that may influence the secondary structures of RNA or protein molecules.

Coverage (or depth) (覆盖率/厚度)
The average number of times a nucleotide is represented by a high-quality base in a collection of random raw sequence. Operationally, a "high-quality base" is defined as one with an accuracy of at least 99% (corresponding to a PHRED score of at least 20).

Database (数据库)
A computerized storehouse of data that provides a standardized way for locating, adding, removing, and changing data. See also Object-oriented database, Relational database.

Dendrogram
A form of tree that lists the compared objects (e.g., sequences, or genes in a microarray analysis) in a vertical order and joins related ones by levels of branches extending to one side of the list.
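The Coverage entry above ties "high-quality base" to a PHRED score of at least 20. A minimal Python sketch of that relationship follows, assuming the standard PHRED definition Q = -10 log10(P), where P is the probability that a base call is wrong. The read representation (a start position plus a list of per-base quality scores) and the function names are hypothetical, chosen only for illustration.

```python
def phred_to_error_prob(q):
    """Error probability for a PHRED score Q, using P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def mean_coverage(reads, region_length, min_quality=20):
    """Average number of high-quality bases per position of a region.

    reads: list of (start, qualities) pairs (hypothetical format), where
    start is the 0-based position of the first base of the read in the
    region and qualities holds one PHRED score per base.
    """
    high_quality_bases = 0
    for start, qualities in reads:
        for offset, q in enumerate(qualities):
            position = start + offset
            # Count only bases that fall inside the region and meet the cutoff.
            if 0 <= position < region_length and q >= min_quality:
                high_quality_bases += 1
    return high_quality_bases / region_length


# Two toy reads over a 5-base region; bases with Q < 20 are ignored.
reads = [(0, [30, 30, 10, 30]), (2, [25, 25, 5])]
print(phred_to_error_prob(20))   # accuracy of 99% means an error rate of about 0.01
print(mean_coverage(reads, 5))
```

With the cutoff at 20, only bases whose estimated accuracy is at least 99% contribute to the coverage figure, which is the operational definition given in the entry.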
Depth (厚度)
See Coverage.

Dirichlet mixtures
Mixtures of Dirichlet distributions; the Dirichlet distribution is the conjugate prior of a multinomial distribution. One use is for predicting the expected pattern of amino acid variation found in the match state of a hidden Markov model (representing one column of a multiple sequence alignment of proteins), based on prior distributions found in conserved protein domains (blocks).

Distance in sequence analysis (序列距离)
The number of observed changes in an optimal alignment of two sequences, usually not counting gaps.

DNA sequencing (DNA 测序)
The experimental process of determining the nucleotide sequence of a region of DNA. This is done by labelling each nucleotide (A, C, G or T) with either a radioactive or a fluorescent marker that identifies it. There are several methods of applying this technology, each with its own advantages and disadvantages. For more information, refer to a current textbook. High-throughput laboratories frequently use automated sequencers, which are capable of rapidly reading large numbers of templates. Sometimes the sequences may be generated more quickly than they can be characterised.

Domain (功能域)
A discrete portion of a protein assumed to fold independently of the rest of the protein and possessing its own function.
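The entry "Distance in sequence analysis" above can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned (the alignment step itself is not shown), that gaps are written as "-", and that any column containing a gap is skipped. The function name observed_distance is hypothetical.

```python
def observed_distance(aligned_a, aligned_b, gap="-"):
    """Count mismatched columns in a pairwise alignment, ignoring gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(
        1
        for a, b in zip(aligned_a, aligned_b)
        if a != b and a != gap and b != gap
    )


# One substitution (G vs C) and one gap column, which is not counted.
print(observed_distance("ACGT-A", "ACCTGA"))  # prints 1
```

This counts observed changes only; correcting for multiple substitutions at the same site would require an evolutionary model, which is outside the scope of this sketch.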