version date: 1 December 2006
EXERCISE I.12

HYDROPHOBICITY IN DRUG DESIGN

Pietro Cozzini¹ and Francesca Spyrakis²

¹ Laboratory of Molecular Modelling, Department of General and Inorganic Chemistry, Chemical-Physics and Analytical Chemistry, University of Parma, 43100 Parma, Italy; ² Department of Biochemistry and Molecular Biology, University of Parma, 43100 Parma, Italy

Hydrophobicity represents the tendency of a substance to repel water and to avoid complete dissolution in water. The term “hydrophobic” means “water fearing”, from the Greek words hydro, water, and phobo, fear. Because hydrophobicity is one of the most important physicochemical parameters associated with chemical compounds, several studies have been carried out to understand, evaluate, and predict it [1–8]. Hydrophobicity governs numerous biological processes, such as the transport, distribution, and metabolism of biological molecules; molecular recognition; and protein folding. Therefore, knowledge of a parameter that describes the behavior of solutes in polar and nonpolar phases is essential to predict the transport and activity of drugs, pesticides, and xenobiotics.

The hydrophobic effect can be defined as “the tendency of nonpolar groups to cluster, shielding themselves from contact with an aqueous environment”. The hydrophobic effect in proteins can also be described as the tendency of polar species to congregate in such a manner as to maximize electrostatic interactions. Proteins, in fact, organize themselves to expose polar side chains toward the solvent and to retain hydrophobic amino acids in a central hydrophobic core. The hydrophobic effect constitutes one of the main determinants of the structure and folding of globular proteins: the hydrophilic regions tend to surround hydrophobic areas, which gather into the central hydrophobic core, generating a protein characterized by a specific and function-related three-dimensional structure.

This driving force guides not only protein folding processes, but also any kind of biological interaction. Biological molecules interact mainly via electrostatic forces, including hydrogen bonds or hydrogen-bonding networks, often formed through water molecules. During a protein–ligand association, water molecules that cannot properly locate themselves at the complex interface are displaced and pushed into the bulk solvent, increasing entropy. Thus, it is possible to define the hydrophobic effect as a free energy phenomenon, constituted by both
enthalpic and entropic phenomena [9].

The hydrophobic character of the different amino acids has been studied in depth, and several biochemical researchers have pursued amino acid hydrophobicity scales with different methods and approaches [10,11]. A complete understanding of the forces that guide amino acid interactions within proteins could lead to the prediction of protein structure and of the processes that drive a protein to fold into its native form.

The octanol/water partition coefficient (log Po/w) constitutes a quantitative, and easily accessible, hydrophobicity measurement. P is defined as the ratio of the equilibrium concentrations of a substance dissolved in a two-phase system formed by two immiscible solvents:

$$P_{o/w} = \frac{c_{\mathrm{octanol}}}{c_{\mathrm{water}}}$$

As a result, the partition coefficient P is the quotient of two concentrations and is normally reported as its logarithm to base 10 (log P), because P ranges from 10^–4 to 10^8. Log P values are widely used in bioaccumulation studies, in drug absorption and toxicity predictions and, recently, even in the modeling of biological interactions [12,13]. Several efforts have been made to develop rapid and reliable log P estimation methodologies, capable of predicting partition coefficient values for compounds that have not been tested experimentally.

The standard procedure adopted for experimental log P estimation is the shake-flask method, used to determine the hydrophobicity of compounds with log P values ranging from –2 to 4. Log P > 0 characterizes hydrophobic substances soluble in the lipid phase, while log P < 0 characterizes substances soluble in the water phase.
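The definition of P and the sign rule for log P can be illustrated with a short Python sketch. The function names and the example concentrations are hypothetical and chosen only for illustration; they are not measured values.

```python
import math


def log_p(c_octanol: float, c_water: float) -> float:
    """Return log10 of the octanol/water partition coefficient P = c_octanol / c_water."""
    if c_octanol <= 0 or c_water <= 0:
        raise ValueError("equilibrium concentrations must be positive")
    return math.log10(c_octanol / c_water)


def preferred_phase(logp: float) -> str:
    """Apply the sign rule: log P > 0 means lipid phase, log P < 0 means water phase."""
    if logp > 0:
        return "lipid (octanol) phase"
    if logp < 0:
        return "water phase"
    return "no preference"


# Hypothetical equilibrium concentrations in mol/L (illustrative only).
example = log_p(c_octanol=2.0e-2, c_water=2.0e-4)
print(round(example, 2), preferred_phase(example))
```

Both concentrations must be expressed in the same units, since only their ratio enters the logarithm.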
Fig. 1 (Panel 1) Hydrophobicity measured as the water/octanol partition coefficient: $\log P_A = \log\left([A]_{\mathrm{octanol}}/[A]_{\mathrm{water}}\right)$; log P > 0 ⇒ lipid phase; log P < 0 ⇒ water phase.

As an experimental alternative, high-performance liquid chromatography (HPLC) is used for more hydrophobic compounds, with log P values ranging from 0 to 6.

Log P can be measured experimentally or predicted from structural data. Experimental measurements are often time-consuming and difficult to make; thus, the need to estimate hydrophobic parameters properly and rapidly is more and more pressing. This need was also triggered by the advent of molecular modeling and by the screening of large molecular libraries for virtual screening and drug design. Alongside the progress in computational applications and molecular modeling, several methods capable of predicting log P values for thousands of compounds have been developed. They can now be classified into five major classes [14]: substituent methods; fragment methods; methods based on atomic contributions and/or surface areas; methods based on molecular properties; and, finally, methods based on solvatochromic parameters.

The first “by substituent” approach was proposed by Fujita and coworkers in 1964 [15]. Their technique is based on the following equation:

$$\pi = \log P_X - \log P_H$$

where P_X represents the partition coefficient of a derivative between 1-octanol and water and P_H that of the parent compound. Because π is typically derived from equilibrium processes, it can be considered directly as a free energy constant. As a consequence, log P represents an additive-constitutive, free energy-related property, numerically equivalent to the sum of the log P of the parent compound plus a π term, which represents the log P difference between a given substituent and the hydrogen atom that has been replaced [16]. As an example, the log P determination for
the methyl group is reported:

$$\log P_{\mathrm{CH_3}} = \log P_{\text{methyl derivative}} - \log P_{\text{parent compound}}$$

The subsequent “by fragments” method was supported by Rekker and Mannhold, who stated that log P can be calculated as the sum of the fragment values plus certain correction factors. They determined the averaged contributions of simple fragments, using a large database of experimentally measured log P values [17,18]. Rekker did not indicate which fragment could be considered a valid fragment. The log P of a molecule can be calculated using the formula

$$\log P = \sum_n a_n f_n + \sum_m b_m F_m$$

where a is the number of occurrences of fragment f of type n, while b is the number of occurrences of correction factor F of type m.

The well-known CLOGP method clearly represents an improvement of the Rekker approach and, in fact, can be expressed by the same equation. The CLOGP program breaks molecules into fragments and sums the constant fragment values and the structure-dependent correction values, taken from Hansch and Leo’s database, to predict the log P of organic molecules. The program divides the target molecule into fragments following a set of simple rules that users cannot alter. CLOGP represents the first stand-alone program developed by Pomona MedChem, following Rekker’s general formulation. The program is now available on the Web (http://www.daylight.com/daycgi/clogp).

Unlike the methods based on chemical group fragments, the methods based on atomic contributions and/or surface area use atomic fragments and surface area data to predict hydrophobicity. The contribution of each atom to the hydrophobicity of a molecule can be evaluated by multiplying the corresponding atomic parameter by the degree of exposure to the surrounding solvent. The degree of exposure is typically represented by the solvent-accessible surface area (SASA).
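Returning to the fragment formula, the two sums can be sketched in a few lines of Python. This is only an illustration of the bookkeeping: the function name, the fragment labels, and all numerical values below are hypothetical placeholders, not published Rekker or CLOGP constants, and the sketch does not attempt the fragmentation step itself.

```python
from typing import Mapping, Optional


def fragment_log_p(
    fragment_counts: Mapping[str, int],
    fragment_values: Mapping[str, float],
    correction_counts: Optional[Mapping[str, int]] = None,
    correction_values: Optional[Mapping[str, float]] = None,
) -> float:
    """Evaluate log P = sum_n a_n * f_n + sum_m b_m * F_m."""
    # First sum: occurrences a_n times fragment constant f_n.
    total = sum(a_n * fragment_values[name] for name, a_n in fragment_counts.items())
    # Second sum: occurrences b_m times correction factor F_m (optional).
    if correction_counts:
        values = correction_values or {}
        total += sum(b_m * values[name] for name, b_m in correction_counts.items())
    return total


# Placeholder constants for illustration only (not literature values).
f_values = {"CH3": 0.70, "CH2": 0.53}
F_values = {"proximity": 0.29}

without_correction = fragment_log_p({"CH3": 2, "CH2": 2}, f_values)
with_correction = fragment_log_p({"CH3": 2, "CH2": 2}, f_values, {"proximity": 1}, F_values)
print(round(without_correction, 2), round(with_correction, 2))
```

Keeping the constants in plain dictionaries makes it easy to swap in a real parameter table, if one is available, without changing the summation.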
The first promoters of this method were Broto and his colleagues, who developed a set of 222 descriptors, made of combinations of up to four atoms with specific bonding pathways up to four in length, reaching a precision of about 0.4 log units [19]. Later, the concept of SASA was used by Iwase [20] and Dunn [21] in principal component analysis to improve their log P estimations. Dunn computed the isotropic surface area by calculating the number of water molecules able to hydrate the polar portions of the solute molecules. As an example, one water molecule was allowed for groups such as nitro, aniline, ketones, and tertiary amines, two for other amines, three for carboxyls, and five for amide groups. The use of SASA parameters has been extended to several log P calculation algorithms, such as the program HINT, created by Abraham and Kellogg in 1991, which is discussed later and used for a practical session.

Various researchers did not agree with previously reported fragmental methods, claiming that a
molecule is rarely a simple sum of its parts and that the prediction of any molecular property from empirical or calculated fragments has no scientific basis [22]. Bodor’s method computes log P as a function of different calculated molecular properties, such as conformations, ionization, hydration, ion-pair formation, keto-enol tautomerism, intramolecular and intermolecular H-bond formation, folding, and so forth.

The fifth log P determination method, based on solvatochromic comparisons, was proposed by Kamlet and coworkers [23] and constitutes, once more, a molecular properties methodology. Log P can be calculated through the following equation:

$$\log P_{\mathrm{oct}} = aV + b\pi^* + c\beta_H + d\alpha_H + e$$

V is a solute volume term, π* is a solute polarity/polarizability term, β_H is an independent measure of solute hydrogen-bond acceptor strength, α_H is the corresponding hydrogen-bond donor strength, and e is the intercept. π*, β_H, and α_H are solvatochromic parameters obtained by averaging multiple normalized solvent effects on a variety of properties, involving many different types of indicators.

Several research groups have tried to extend log P calculations to amino acids, in order to better understand and investigate events such as protein folding and biological interactions. However, experimental methods, such as chromatography or site-directed mutagenesis, give ambiguous and differing results [11]. Generally, each amino acid is characterized by a wide range of hydrophobicity values; thus, deciding which value corresponds to a true measure becomes very difficult and time-consuming.

In order to obtain a rapid and proper estimation of the hydrophobicity of biological molecules, in 1987 Abraham and Leo extended the fragment method of calculating partition coefficients to the common amino acids [10]. Fundamental hydrophobic fragments, obtained from partitioning experiments performed on thousands of compounds, were subsequently reduced to atomic values with inherent bond, ring, chain, branching, and proximity factors. The derived hydrophobic atomic constants and the corresponding SASAs constitute the key parameters of the software HINT (Hydropathic INTeractions), which can calculate them directly for small molecules such as ligands, or obtain them from a residue-based dictionary. The program was thus created with the purpose of rapidly and properly estimating biological interactions, such as protein–protein, protein–DNA, and protein–ligand interactions, and folding phenomena.

Why should we use log P to study and predict recognition and interactions between biological molecules? At least three reasonable answers can be given: (i) log P is essentially a reproducible experimental measurement; (ii) partition experiments are inexpensive and relatively quick to perform; and (iii) log P is directly related to the free energy of binding. In fact, because hydrophobicity is defined in terms of solubility, log Po/w, and consequently also the hydrophobic atomic constants,
implicitly include hydrophobic and solvation/desolvation effects, directly related to the entropic contribution involved in molecular associations.

The formation of a complex between a protein and a ligand in aqueous solution can be represented by the following equilibrium:

$$P_{aq} + L_{aq} \underset{k_{-1}}{\overset{k_{+1}}{\rightleftharpoons}} P'L'_{aq'}$$

where P is the protein, L the ligand, P'L' the new complex, and k_{+1} and k_{–1} are, respectively, the association and dissociation rate constants.

$$K_a = K_d^{-1} = \frac{[PL]}{[P][L]}$$

Both K_a (association) and K_d (dissociation) are related to the activities of the reacting species but, if extremely dilute solutions are considered, activities can be replaced by concentrations. Starting from these constants, it is possible to calculate the free energy associated with the binding event, using the following relation:

$$\Delta G^\circ = -RT \ln K_a = RT \ln K_d$$

T is the absolute temperature, R the gas constant, and ∆G° the binding free energy variation measured under standard conditions (298 K, 1 atm, and 1 M concentration for both reagents and products).

Po/w is also an equilibrium constant, for solute transfer between octanol and water:

$$\log P_{o/w} = -\frac{\Delta G^\circ}{2.303\,RT}$$

where R and T are constants. It follows that

$$\log P_{o/w} = k\,\Delta G^\circ$$

where k = –1/(2.303 RT) ≈ –0.733 mol kcal^–1 at 298 K. Because

$$\sum_i a_i = \log P_{o/w}$$

the relationship between the hydrophobic atomic constants a_i and ∆G° is evident, and the constants thus include both enthalpic and entropic contributions [9].

HINT can be defined as a natural and intuitive force field, able to estimate, using experimentally determined log P values, not only the enthalpic but also the entropic effects included in noncovalent interactions, such as hydrogen bonding, Coulombic forces, acid-base and hydrophobic contacts. Hydrophobic and polar contacts, both identified as hydropathic interactions, are strictly related to solvent partitioning phenomena. In fact, the solubilization of a ligand in a mixed solvent system,
like water and octanol, involves the same processes and atom–atom interactions as biomolecular interactions within or between proteins and ligands [24].

The program was designed to consider and investigate hydrophobicity and hydropathic interactions in several biological areas. HINT is able to (i) calculate the hydrophobic atomic constant for each atom in a small molecule or even in a macromolecule and quantitatively score molecular interactions, (ii) create hydrophobic maps or fields for small molecules in protein environments, (iii) map the hydrophobic and polar nature of the surrounding receptor from the structure of small interacting molecules, providing a hydrophobic interaction template for the definition of secondary and tertiary protein structure, and (iv) suggest modes of inter-helix interactions in transmembrane ion channels [25]. All these features and capabilities make HINT a suitable tool, not only for the study of single and simple interactions, but also for the virtual screening of organic libraries and for structure-based drug design.

Interactions between atom–atom pairs are calculated using the following equation:

$$b_{ij} = a_i S_i\, a_j S_j\, T_{ij} R_{ij} + r_{ij}$$

where b_ij represents the interaction score between atoms i and j, a is the hydrophobic atomic constant, S is the SASA, T_ij is a logic function assuming the value –1 or +1, depending on the character of the interacting polar atoms, while R_ij and r_ij are functions of the distance between atoms i and j. The whole interaction between two molecules, such as protein and ligand, or protein and DNA, can be represented as

$$\sum_i \sum_j b_{ij} = \sum_i \sum_j \left( a_i S_i\, a_j S_j\, T_{ij} R_{ij} + r_{ij} \right)$$

b_ij > 0 identifies favorable interactions, while b_ij < 0 identifies unfavorable ones.
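The double sum over atom pairs can be sketched as follows. This is a minimal illustration of the scoring formula, not the HINT implementation: the `Atom` class, the function names, and the numerical values are hypothetical, and the two distance functions are placeholders (a simple exponential decay for R_ij and zero for r_ij), because the text only states that both depend on the interatomic distance. T_ij is fixed to +1 by default.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Atom:
    """Hypothetical container for the per-atom quantities in the equation."""
    a: float                         # hydrophobic atomic constant a_i
    s: float                         # solvent-accessible surface area S_i
    xyz: Tuple[float, float, float]  # Cartesian coordinates


def pair_score(p: Atom, q: Atom, t_ij: int,
               big_r: Callable[[float], float],
               small_r: Callable[[float], float]) -> float:
    """b_ij = a_i S_i a_j S_j T_ij R_ij + r_ij for one atom pair."""
    distance = math.dist(p.xyz, q.xyz)
    return p.a * p.s * q.a * q.s * t_ij * big_r(distance) + small_r(distance)


def total_score(protein: Sequence[Atom], ligand: Sequence[Atom],
                big_r: Callable[[float], float],
                small_r: Callable[[float], float],
                t: Callable[[Atom, Atom], int] = lambda p, q: 1) -> float:
    """Sum b_ij over every protein atom i and ligand atom j."""
    return sum(pair_score(p, q, t(p, q), big_r, small_r)
               for p in protein for q in ligand)


def decay(distance: float) -> float:
    """Placeholder for R_ij (assumption, not the HINT definition)."""
    return math.exp(-distance)


def no_extra_term(distance: float) -> float:
    """Placeholder for r_ij (assumption: term switched off)."""
    return 0.0


# Illustrative atoms only; the constants are not taken from any parameter set.
protein = [Atom(a=0.5, s=10.0, xyz=(0.0, 0.0, 0.0))]
hydrophobic_ligand = [Atom(a=0.4, s=20.0, xyz=(0.0, 0.0, 2.0))]
polar_ligand = [Atom(a=-0.4, s=20.0, xyz=(0.0, 0.0, 2.0))]

favorable = total_score(protein, hydrophobic_ligand, decay, no_extra_term)
unfavorable = total_score(protein, polar_ligand, decay, no_extra_term)
print(favorable > 0, unfavorable < 0)
```

With these placeholders, a hydrophobic–hydrophobic contact gives a positive score and a hydrophobic–polar contact a negative one, which matches the sign convention stated for b_ij.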
Table 1

Compound              HINT     CLOG-P
anthracene            4.45     4.49
1,3-butadiene         1.76     1.90
n-butylamine          0.97     0.92
cyclopentane          2.94     2.80
hexachlorobenzene     5.79     6.42
N-nitrosomorpholine   –0.41    –0.64
aldosterone           0.55     –0.14
cortisone             0.49     0.20
testosterone          3.35     3.35

HINT PRACTICAL APPLICATIONS

1. PROTEIN–LIGAND INTERACTIONS

Within a homogeneous biological set, HINT can readily be used to score and predict the free energy associated with protein–ligand complex formation. Starting from good crystallographic data and reliably determined experimental Ki or IC50 values (Table 2), it is possible to obtain linear relationships between experimental ∆G° and computationally calculated HINT score values. Table 2 reports the protein–ligand HINT scores calculated for two homogeneous sets, formed, respectively, by eight bovine trypsin–ligand complexes and by nine tryptophan synthase–ligand complexes, for which experimental inhibition constants are reported in the literature.

Table 2

PDB code   Protein               ∆G°binding (kcal/mol)   HINT score
1TNJ       bovine trypsin        –2.66                   677
1TNK       bovine trypsin        –2.02                   720
1TNI       bovine trypsin        –2.30                   834
1TNL       bovine trypsin        –2.54                   1360
1TNG       bovine trypsin        –3.98                   923
1TNH       bovine trypsin        –4.57                   972
3PTB       bovine trypsin        –6.43                   1634
1PPH       bovine trypsin        –8.04                   2663
1CX9       tryptophan synthase   –9.58                   2595
1C29       tryptophan synthase   –9.00                   2793
1C9D       tryptophan synthase   –8.97                   3094
1CW2       tryptophan synthase   –8.76                   3094
1C8V       tryptophan synthase   –8.92                   2571
2TRS       tryptophan synthase   –7.20                   2646
1QOP       tryptophan synthase   –7.20                   2721
1A50       tryptophan synthase   –8.56                   2914
2TSY       tryptophan synthase   –4.65                   905
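The linear relationship between ∆G° and HINT score can be checked directly from the Table 2 values. The following is a minimal sketch of an ordinary least-squares fit written in plain Python. The function and variable names are illustrative, and the text does not say which fitting procedure or software the authors used, so ordinary least squares is an assumption.

```python
import math

# (HINT score, experimental dG in kcal/mol) pairs copied from Table 2
TRYPSIN = [(677, -2.66), (720, -2.02), (834, -2.30), (1360, -2.54),
           (923, -3.98), (972, -4.57), (1634, -6.43), (2663, -8.04)]
TRP_SYNTHASE = [(2595, -9.58), (2793, -9.00), (3094, -8.97), (3094, -8.76),
                (2571, -8.92), (2646, -7.20), (2721, -7.20), (2914, -8.56),
                (905, -4.65)]


def linear_fit(points):
    """Ordinary least squares for y = slope * x + intercept.

    Returns (slope, intercept, r, se), where r is the Pearson correlation
    coefficient and se is the standard error of the estimate (n - 2
    degrees of freedom).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    se = math.sqrt((syy - slope * sxy) / (n - 2))
    return slope, intercept, r, se


trypsin_fit = linear_fit(TRYPSIN)
trp_synthase_fit = linear_fit(TRP_SYNTHASE)

for name, fit in (("bovine trypsin", trypsin_fit),
                  ("tryptophan synthase", trp_synthase_fit)):
    slope, intercept, r, se = fit
    print(f"{name}: dG = {slope:.4f} * HS + {intercept:.4f}, "
          f"|R| = {abs(r):.2f}, SE = {se:.2f} kcal/mol")
```

The correlation coefficient comes out negative because more favorable (more negative) ∆G° values go with higher HINT scores, so its magnitude is the value to compare with a reported R.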
Fig. 2 Plots of experimental ∆G° (kcal/mol) vs. HINT score for bovine trypsin (cyan triangles) and tryptophan synthase (green triangles).

The regression analyses of the bovine trypsin and tryptophan synthase data series are shown in Fig. 2 and are represented, respectively, by the following equations, where $\mathrm{HS}_{\mathrm{P-L}}$ is the protein–ligand HINT score:

$$\Delta G^\circ = -0.0028\,\mathrm{HS}_{\mathrm{P-L}} - 0.5880$$

$$\Delta G^\circ = -0.0019\,\mathrm{HS}_{\mathrm{P-L}} - 3.1210$$

with R = 0.87 and standard error SE = 1.16 kcal/mol for the trypsin–ligand complexes, and R = 0.83 and SE = 0.90 kcal/mol for the tryptophan synthase–ligand complexes. It is thus possible to predict the binding free energy of new hypothetical trypsin or tryptophan synthase ligands, for which the experimental inhibition constant has not yet been determined, simply by calculating the HINT score of the new potential complex, as shown in Fig. 3.
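Predicting ∆G° for a new ligand then amounts to evaluating the appropriate regression line at its HINT score. The sketch below is illustrative only: the dictionary, the function name, and the example score of 2000 are hypothetical. The pairing of coefficients with proteins follows what a least-squares fit of the Table 2 data gives (slope –0.0028 for trypsin, –0.0019 for tryptophan synthase). Quoting the regression standard error as a rough uncertainty is a simplification, not a full prediction interval.

```python
# Regression coefficients (slope, intercept) and standard error in kcal/mol.
# Pairing with each protein follows a least-squares fit of the Table 2 data.
REGRESSIONS = {
    "bovine trypsin": (-0.0028, -0.5880, 1.16),
    "tryptophan synthase": (-0.0019, -3.1210, 0.90),
}


def predict_binding_free_energy(protein: str, hint_score: float):
    """Return (predicted dG, standard error) in kcal/mol for one complex."""
    slope, intercept, se = REGRESSIONS[protein]
    return slope * hint_score + intercept, se


# Hypothetical new trypsin ligand with a computed HINT score of 2000
dg, se = predict_binding_free_energy("bovine trypsin", 2000)
print(f"predicted dG = {dg:.2f} +/- {se:.2f} kcal/mol")
```

Such a prediction is meaningful only within the homogeneous series used for the fit and within the range of HINT scores covered by the data.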
Fig. 3 Prediction of the binding free energy of a new potential bovine trypsin ligand from the protein–ligand HINT score (∆G° in kcal/mol plotted vs. HINT score, with the predicted binding free energy of the new ligand marked).

It is more difficult to find a good relationship between experimental and computational data for a heterogeneous set of protein–ligand complexes, characterized by different active-site polarity, ligands of diverse chemical nature, and inhibition constants spanning 10 or more orders of magnitude [13]. In the following analysis, 93 different crystallographic protein–ligand complexes were examined and scored in order to define a general relationship between ∆G° and HINT score. Experimental and calculated data, together with the protein identity and the crystallographic resolution, are reported in Table 3, while the general relationship is shown in Fig. 4.

Table 3

PDB code   Protein           Crystal resolution (Å)   ∆G°binding (kcal/mol)   HINT score
1ETS       bovine thrombin   2.30                     –11.17                  3623
1ETT       bovine thrombin   2.50                     –8.00                   2131
1ETR       bovine thrombin   2.20                     –10.49                  2848
1UVT       bovine thrombin   2.50                     –10.38                  1834
1A2C       human thrombin    2.10                     –8.97                   3019
1A4W       human thrombin    1.80                     –8.05                   3110
1BHX       human thrombin    2.30                     –9.30                   2283
1D6W       human thrombin    2.00                     –8.10                   4005
1FPC       human thrombin    2.30                     –9.52                   2299
1C4U       human thrombin    2.10                     –14.09                  3882
1C4V       human thrombin    2.10                     –14.67                  4390
1C5N       human thrombin    1.50                     –6.39                   2334
1C50       human thrombin    1.90                     –4.75                   2498
1D4P       human thrombin    2.07                     –8.57                   3363
1KTT       human thrombin    2.10                     –8.33                   3586
1OYT       human thrombin    1.67                     –9.85                   3660