normalized values for stereocenter count (nStMW) for NP and ND drugs were 2- to 6-fold higher than those for S* and S drugs (Table 2). These data are consistent with previous cheminformatic studies indicating that natural products have a greater degree of stereochemical diversity relative to synthetic drug-like compounds.34,35 The values of Fsp3 are higher for NP and ND drugs relative to S* and S drugs.
This is particularly important because increased molecular complexity, as measured by Fsp3, has been associated with the ability of molecules to interrogate larger regions of chemical space.37 Interestingly, although S* and S drugs have similar average molecular weights, S* drugs have higher values for both nStMW and Fsp3. Thus, natural product-based S* drugs exhibit greater molecular complexity than completely synthetic S drugs. Overall, ring count (Rings), ring system count (RngSys), and rings per ring system (RRSys) are similar across compound classes. Mean values for the size of the largest ring (RngLg) suggest that, on average, NP drugs contain larger rings than S drugs (Table 2). However, the median value for largest ring size is equivalent (6 atoms) for all compound classes (Table S1), indicating that outliers may skew the mean value for NP drugs. The average and median number of aromatic rings is higher for S and S* drugs relative to NP and ND drugs. These data are consistent with previous analyses indicating that natural products have lower aromatic character than synthetic, drug-like compounds.34 Finally, the partition coefficient ALOGPs and distribution coefficient LogD both predict NP and S drugs to have the lowest and highest hydrophobicity, respectively, with ND and S* drugs having intermediate values. The increased lipophilicity of S drugs may result in part from higher aromatic content. Calculated aqueous solubility ALOGpS is similar across drug classes.
Principal component analysis comparison of compound classes: To visualize the distribution of NCEs in chemical space, we performed principal component analysis (PCA) on the set of structural and physicochemical descriptors described above. PCA is a statistical method for variable reduction that allows multidimensional data to be visualized using two- and three-dimensional plots with minimal loss of information from the original dataset.
As several of the descriptors in this analysis are correlated, PCA uses a linear transformation to rotate the matrix of variables onto a set of orthonormal axes that define the dimensions of greatest variance for the dataset.42–44 The newly formed axes are called principal components and represent linear combinations of the original variables (descriptors). Importantly, the matrix rotation preserves Euclidean distances and maximizes the fraction of total variance from the original dataset on each successive principal component. Through this transformation, the first principal component (PC1) retains the greatest fraction of variance from the original dataset, the second principal component (PC2) contains the next largest fraction, and so on. In this way, an n-dimensional dataset can be visualized using an m-dimensional plot of principal components (where m << n) with minimal loss of information.42–44
In the current analysis, drugs were evaluated for 20 structural and physicochemical properties (Table 1), with PCA resulting in rotation of the complete 20-dimensional dataset onto a set of principal components.23 Taken together, the first two principal components (PC1, PC2) in this analysis retain 64% of the information in the full 20-dimensional dataset (Table S3), whereas >90% of the information in the full dataset is represented in the first six principal components (PC1–PC6; Table S3). The PCA plot (PC1 vs PC2) from a single analysis encompassing all compounds is presented in Figure 3, although NP, ND, S* and S drugs are shown on separate plots for clarity. To keep the orientation of these PCA plots consistent with our previous analyses,23–28 PC2 scores for each compound were inverted; this is feasible because the signs and units of each principal component are arbitrary.
The PCA plots indicate that NP (Fig. 3a) and ND drugs (Fig. 3b) are fairly evenly distributed across chemical space as defined by PC1 and PC2.
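As an illustration only, the rotation described above can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' workflow: the data are synthetic, the column standardization is an assumption (the text here does not state how descriptors were scaled), and the names `standardize`, `pca`, and `scores_plot` are hypothetical.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit variance (assumed preprocessing)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca(Z):
    """PCA of centered data Z via singular value decomposition.

    Returns the scores (rotated data), the principal axes (rows of Vt),
    and the fraction of total variance carried by each component.
    """
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt.T                  # rotate the data onto the principal axes
    explained = s**2 / np.sum(s**2)    # ordered from largest to smallest
    return scores, Vt, explained

# Synthetic stand-in for a descriptor table: 200 "compounds" x 6 correlated
# "descriptors". These numbers are invented and are not data from the paper.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.3 * rng.normal(size=(200, 6))

Z = standardize(X)
scores, components, explained = pca(Z)
print("fraction of variance on PC1 + PC2:", explained[:2].sum())

# The full rotation preserves Euclidean distances between compounds.
d_before = np.linalg.norm(Z[0] - Z[1])
d_after = np.linalg.norm(scores[0] - scores[1])

# The sign of a component is arbitrary, so PC2 can be inverted for plotting
# without changing the variance it carries.
scores_plot = scores[:, :2].copy()
scores_plot[:, 1] *= -1
```

Summing the first entries of `explained` gives the kind of figure quoted in the text (the fraction of information retained by the first two or first six components).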
The wide spread of NP and ND drugs on the PCA plots illustrates the high degree of physicochemical and structural diversity found in these molecules. Both S* (Fig. 3c) and S drugs (Fig. 3d) occupy tighter clusters in chemical space relative to NP and ND drugs. These data indicate that the structural and physicochemical features of synthetic drugs are more narrowly focused than natural products, and consequently these compounds exhibit less chemical diversity.
Component loadings in the PCA can be used to understand the influence of the original 20 parameters on the distribution of molecules in the PCA plots. A loading plot (Fig. 4) illustrates how the original variables are rotated onto the plane defined by PC1 and PC2. The loading plot reveals that molecular weight (MW) and other size-based parameters such as heteroatom counts (N, O), hydrogen bond donor/acceptor count (HBD, HBA), rotatable bond count (RotB), and stereocenter count (nStereo) have a strong negative (leftward) influence along PC1. The high correlation of molecular weight with these parameters is illustrated on the loading plot by the small angles between the vectors representing each descriptor. This indicates that the large spread of NP and ND drugs along PC1, relative to S* and S drugs, is largely due to variance in molecular size (Fig. 3). These data agree with previous analyses showing that natural products have, on average, higher molecular weights relative to synthetic drug-like compounds.34,35
Although the distribution of S* and S drugs is constrained along PC1, the spread of these compounds is more pronounced on PC2 (Fig. 3c and d). Positioning of compounds along PC2 is governed largely by ALOGPs and ALOGpS, which influence compounds in a positive (upward) and negative (downward) direction, respectively (Fig. 4). In addition, RngAr, Rings and RngSys influence the positioning of compounds positively (upward) along PC2, and negatively (leftward) along PC1.
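The link between correlated descriptors and small angles on a loading plot can be sketched as follows. This is a hedged illustration: the data are synthetic, the descriptor names are borrowed only as column labels, the latent factors `size` and `lipo` are invented, and scaling the loadings by each component's standard deviation is one common convention, assumed here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
size = rng.normal(size=n)   # hypothetical latent "molecular size" factor
lipo = rng.normal(size=n)   # hypothetical latent "lipophilicity" factor

names = ["MW", "HBA", "RotB", "ALOGPs"]
X = np.column_stack([
    size + 0.2 * rng.normal(size=n),   # "MW" tracks the size factor
    size + 0.3 * rng.normal(size=n),   # "HBA" tracks the size factor
    size + 0.3 * rng.normal(size=n),   # "RotB" tracks the size factor
    lipo + 0.2 * rng.normal(size=n),   # "ALOGPs" is independent of size here
])

# Standardize, then obtain the principal axes from the SVD.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
_, s, Vt = np.linalg.svd(Z, full_matrices=False)

# One 2-D loading vector per descriptor on the PC1/PC2 plane, scaled by the
# standard deviation of each component.
loadings = Vt[:2].T * (s[:2] / np.sqrt(n - 1))

def angle_deg(u, v):
    """Angle between two loading vectors, in degrees."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

mw, hba, alogps = loadings[0], loadings[1], loadings[3]
print("MW vs HBA angle:", angle_deg(mw, hba))
print("MW vs ALOGPs angle:", angle_deg(mw, alogps))
```

In this toy setup the size-related descriptors point in nearly the same direction, while the lipophilicity descriptor is close to perpendicular to them. Angles in a two-component projection approximate correlations well only when those two components carry most of the variance.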
Descriptors for molecular complexity Fsp3 and nStMW, as well as relPSA, influence the positioning of compounds negatively (downward) along PC2 and negatively (leftward) along PC1 (Fig. 4).
Compared to the natural product-based NP, ND, and S* drugs, a larger portion of completely synthetic S drugs cluster in the upper right region of the PCA plot (Fig. 3). The component loadings indicate that this results from the increased hydrophobic character of S drugs, as measured by ALOGPs and LogD. In contrast, a greater proportion of NP and ND drugs extend into the lower left region of the PCA plot (Fig. 3), resulting from lower hydrophobicity (ALOGPs, LogD) and greater molecular complexity (nStMW and Fsp3). Interestingly, natural product-based S* drugs cluster lower on PC2 than completely synthetic S drugs (Fig. 3c and d), due to the decreased hydrophobicity (ALOGPs) and increased stereochemical diversity (nStMW and Fsp3) of the former.
Time-resolved analysis of structural and physicochemical descriptors and PCA plots: To investigate relative changes in the properties of drugs over the last 30 years, average values of the 20 structural and physicochemical parameters for NP, ND, S*, and S drugs were parsed in five-year periods from 1981 to 2010 (Table S4). Although distinct trends are less clear in these data, molecular weight displays a noticeable increase for all NCEs from 1981 to 2010. A dramatic increase in molecular weight for NP drugs in 2001–2005 is in part due to the approval of several large peptide-based drugs, which skew the mean value. The influence of high molecular weight outliers is less pronounced on median values, though the pattern of increasing molecular weight is still observed (Table S5).
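The five-year binning, and the different sensitivity of the mean and median to a large outlier, can be sketched with the Python standard library. The records below are hypothetical values chosen for illustration and are not data from Tables S4 or S5; `period_label` and `summarize` are hypothetical helper names.

```python
from collections import defaultdict
from statistics import mean, median

def period_label(year, start=1981, width=5):
    """Map an approval year to its five-year period, e.g. 2003 -> '2001-2005'."""
    lo = start + ((year - start) // width) * width
    return f"{lo}-{lo + width - 1}"

def summarize(records):
    """Group (approval_year, molecular_weight) pairs by period.

    Returns {period: (mean_mw, median_mw)}.
    """
    bins = defaultdict(list)
    for year, mw in records:
        bins[period_label(year)].append(mw)
    return {period: (mean(v), median(v)) for period, v in sorted(bins.items())}

# Hypothetical records: one very large "peptide-like" entry in 2001-2005.
records = [
    (1983, 310.0),
    (1984, 350.0),
    (2002, 420.0),
    (2003, 450.0),
    (2004, 3400.0),
]

summary = summarize(records)
for period, (avg, med) in summary.items():
    print(period, "mean:", round(avg, 1), "median:", med)
```

In this toy example the single heavy entry pulls the 2001-2005 mean far above the median, which is the effect described for the NP drugs.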
These results are consistent with previous cheminformatic studies indicating that the molecular weight of drugs has increased since the early 1980s.45,46 Parameters that correlate with molecular weight, such as heteroatom counts (N, O), hydrogen bond donor/acceptor count (HBD, HBA), rotatable bond
C. F. Stratton et al. / Bioorg. Med. Chem. Lett. 25 (2015) 4802–4807 4805