HHS Public Access
Author manuscript
Nature. Author manuscript; available in PMC 2018 February 28.
Published in final edited form as: Nature. 2017 September 07; 549(7670): 48–53. doi:10.1038/nature23874.

Commensal bacteria produce GPCR ligands that mimic human signaling molecules

Louis J. Cohen1,2, Daria Esterhazy3, Seong-Hwan Kim1, Christophe Lemetre1, Rhiannon R. Aguilar1, Emma A. Gordon1, Amanda J. Pickard6, Justin R. Cross6, Ana B. Emiliano4, Sun M. Han1, John Chu1, Xavier Vila-Farres1, Jeremy Kaplitt1, Aneta Rogoz3, Paula Y. Calle1, Craig Hunter5, J. Kipchirchir Bitok1, and Sean F. Brady1

1Laboratory of Genetically Encoded Small Molecules, Rockefeller University
2Division of Gastroenterology, Department of Medicine, Icahn School of Medicine at Mount Sinai
3Laboratory of Mucosal Immunology, Rockefeller University
4Laboratory of Molecular Genetics, Rockefeller University
5Comparative Biosciences Center, Rockefeller University
6Donald B. and Catherine C. Marron Cancer Metabolism Center, Memorial Sloan Kettering Cancer Center

Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms

Correspondence and requests for materials should be addressed to S.F.B. (sbrady@rockefeller.edu). Contact: Laboratory of Genetically Encoded Small Molecules, The Rockefeller University, 1230 York Avenue, New York, New York, 10065.

Author Contributions: L.J.C. and S.F.B. designed research; L.J.C. assisted with all experiments; S-H.K. assisted with molecule characterization; E.A.G., P.Y.C., J.K.B., and R.R.A. assisted with gene cloning; D.E., A.B.E., S.M.H., C.H. and A.R. assisted with mouse experiments; J.C., X.V-F., J.K. assisted with molecule synthesis; A.J.P. and J.R.C. assisted with metabolite analysis in human/mouse samples; L.J.C. and C.L. analyzed data; L.J.C. and S.F.B. wrote the paper.

Supplementary Information: Extended data figures and tables are provided to accompany the main text and methods. Supplementary information contains the methods, figures and tables related to the structural determination of compounds.

Competing Financial Interest Statement: The authors of this study have no competing financial interests to declare.

Summary Statement

Commensal bacteria are believed to play important roles in human health. The mechanisms by which they affect mammalian physiology are poorly understood; however, bacterial metabolites are likely to be key components of host interactions. Here, we use bioinformatics and synthetic biology to mine the human microbiota for N-acyl amides that interact with G-protein-coupled receptors (GPCRs). We found that N-acyl amide synthase genes are enriched in gastrointestinal bacteria and the lipids they encode interact with GPCRs that regulate gastrointestinal tract physiology. Mouse and cell-based models demonstrate that commensal GPR119 agonists regulate metabolic hormones and glucose homeostasis as efficiently as human ligands, although future studies are needed to define their potential physiologic role in humans. This work suggests that chemical mimicry of eukaryotic signaling molecules may be common among commensal bacteria
and that manipulation of microbiota genes encoding metabolites that elicit host cellular responses represents a new small molecule therapeutic modality (microbiome-biosynthetic-gene-therapy).

Keywords: GPCR; microbiome; metagenome; signaling; N-acyl amide

Introduction

Although the human microbiome is believed to play an important role in human physiology, the mechanisms by which bacteria affect mammalian physiology remain poorly defined.1 Bacteria rely heavily on small molecules to interact with their environment.2 While it is likely that the human microbiota similarly relies on small molecules to interact with its human host, the identity and functions of microbiota-encoded effector molecules are largely unknown. The study of small molecules produced by the human microbiota and the identification of the host receptors they interact with should help to define the relationship between bacteria and human physiology and provide a resource for the discovery of small molecule therapeutics.

We recently reported on the discovery of commendamide, a human microbiota encoded, G protein-coupled receptor (GPCR) active, long-chain N-acyl amide that suggests a structural convergence between human signaling molecules and microbiota encoded metabolites.3 N-acyl amides, like the endocannabinoids, are able to regulate diverse cellular functions due, in part, to their ability to interact with GPCRs. GPCRs are the largest family of membrane receptors in eukaryotes and are likely to be key mediators of host-microbial interactions in the human microbiome. The importance of GPCRs to human physiology is reflected by the fact that they are the most common targets of therapeutically approved small molecule drugs. The GPCRs with which human N-acyl amides interact are implicated in diseases including diabetes, obesity, cancer, and inflammatory bowel disease among others.4,5 With numerous possible combinations of amine head groups and acyl tails, long-chain N-acyl amides represent a potentially large and functionally diverse class of microbiota-encoded GPCR-active signaling molecules.

Here, we combined bioinformatic analysis of human microbiome sequencing data with targeted gene synthesis, heterologous expression and high-throughput GPCR activity screening to identify GPCR-active N-acyl amides encoded by the human microbiota. The bacterial effectors we identified provide mechanistic insights into potential functions of the human microbiome and suggest these GPCR-active small molecules and their associated microbial biosynthetic genes have the potential to regulate human physiology.

Isolation of commensal N-acyl amides

To identify N-acyl synthase (NAS) genes within human microbial genomes, the Human Microbiome Project (HMP) sequence data was searched with BLASTN using 689 NAS genes associated with the N-acyl synthase protein family PFAM13444.3 The 143 unique human microbial N-acyl synthase genes (hm-NASs) we identified fall into four major clades
(clades A–D, Fig. 1a) that are divided into a number of distinct sub-clades (Fig. 1a). Forty-four phylogenetically diverse hm-NAS genes were selected for synthesis and heterologous expression. This set included all hm-NAS genes from clades sparsely populated with hm-NAS sequences and representative examples from clades heavily populated with hm-NAS sequences (Fig. 1a). Liquid chromatography-mass spectrometry (LCMS) analysis of ethyl acetate extracts derived from E. coli cultures transformed with each construct revealed clone-specific peaks in 31 cultures. hm-NAS gene functions could be clustered into 6 groups based on the retention time and mass of the heterologously produced metabolites (Extended Data Fig. 1 and Supplementary Table 1). Molecule isolation and structural elucidation studies were carried out on one representative culture from each group (Supplementary Information). This analysis identified six N-acyl amide families that differ by amine head group and fatty acid tail (Fig. 1b, families 1–6): 1) N-acyl glycine, 2) N-acyloxyacyl lysine, 3) N-acyloxyacyl glutamine, 4) N-acyl lysine/ornithine, 5) N-acyl alanine, 6) N-acyl serinol. Each family was isolated as a collection of metabolites with different acyl substituents. The most common analog within each family is shown in Figure 1b. Long-chain N-acyl ornithines, lysines and glutamines have been reported as natural products produced by soil bacteria and some human pathogens.6,7,8

Functional differences in NAS enzymes follow the pattern of the NAS phylogenetic tree, with hm-NAS genes from the same clade or sub-clade largely encoding the same metabolite family (Fig. 1a). With the exception of one NAS that is predicted to use lysine and ornithine as substrates, hm-NASs appear to be selective for a single amine-containing substrate. The most common acyl chains incorporated by hm-NASs are from 14–18 carbons in length. These can be modified by β-hydroxylation or a single unsaturation. Three hm-NAS enzymes contain two domains. The second domain is either an aminotransferase that is predicted to catalyze the formation of serinol from glycerol (Fig. 1b, family 6, Extended Data Fig. 2) or an additional acyltransferase that is predicted to catalyze the transfer of a second acyl group (Fig. 1b, families 2, 3).

To explore NAS gene synteny we looked for gene occurrence patterns around NAS genes in the human microbiome. The only repeating pattern that we saw was that some NAS genes appear adjacent to genes predicted to encode acyltransferases. This is reminiscent of the two-domain NASs that we found produce di-acyl lipids (families 2 and 3). There were rare instances where NASs potentially occur in gene clusters, but none of these were used in this study.

To look for native N-acyl amide production by commensal bacteria, organic extracts from cultures of species containing the hm-NAS genes we examined were screened by LCMS. Based on retention time and mass we detected the production of the expected N-acyl amides by commensal species predicted to produce N-acyl glycines, N-acyloxyacyl lysines, N-acyl lysine/ornithines and N-acyl serinols. The only case where we did not detect the expected N-acyl amide was for N-acyloxyacyl glutamines (Extended Data Fig. 1).
hm-NAS genes are enriched in GI bacteria

A BLASTN search of NAS genes against human microbial reference genomes and metagenomic sequence data from the HMP revealed that NAS genes are enriched in gastrointestinal (GI) bacteria relative to bacteria found at other body sites (Fisher's exact test p < 0.05, gastrointestinal versus non-gastrointestinal sites, Supplementary Table 2, Figure 1). Within gastrointestinal sites that were frequently sampled in the context of the HMP (e.g., stool, buccal mucosa, supragingival plaque, tongue) hm-NAS gene families show distinct distribution patterns (Fig. 1c, two-way ANOVA p < 2e-16). Despite tremendous person-to-person variation in microbiota species composition, most N-acyl amide synthase gene families we studied can be found in over 90% of patient samples. N-acyloxyacyl glutamine (12%) and N-acyl alanine (not detected) synthase genes are the only exceptions. Taken together, these data suggest that NAS genes are highly prevalent in the human microbiome and unique sites within the gastrointestinal tract are likely exposed to different sets of N-acyl amide structures.

When we searched existing metatranscriptome sequence data from stool and supragingival plaque microbiomes to look for evidence of hm-NAS gene expression in the gastrointestinal tract, we observed site-specific hm-NAS gene expression that matches the predicted body site localization patterns for hm-NAS genes in metagenomic data. Across patient samples hm-NAS genes are transcribed to varying degrees relative to the average level of transcription for each gene in the bacterial genome (Fig. 2a). In the stool metatranscriptome dataset both RNA and DNA sequencing datasets were available, allowing for a more direct sample-to-sample comparison of hm-NAS gene expression levels. When metatranscriptome data were normalized using the number of hm-NAS gene specific DNA sequence reads detected in each sample, we observed what appears to be differential expression of hm-NAS genes in different patient samples (Fig. 2b). Datasets whereby bacterial genes, transcripts and metabolites can be tracked in a single sample will be necessary to explore how hm-NAS gene transcription variation impacts metabolite production.

hm-N-acyl-amides interact with GI GPCRs

The major N-acyl amide isolated from each family was assayed for agonist and antagonist activity against 240 human GPCRs (Fig. 3 and Extended Data Fig. 3). The strongest agonist interactions were: activation of GPR119 by N-palmitoyl serinol (EC50 9 µM), activation of sphingosine-1-phosphate receptor 4 (S1PR4) by N-3-hydroxypalmitoyl ornithine (EC50 32 µM) and activation of G2A by N-myristoyl alanine (EC50 3 µM). Interactions between bacterial N-acyl amides and GPCRs were also specific (Fig. 3a and b). In each survey experiment, no other GPCRs reproducibly showed greater than 35% activation relative to the endogenous ligands. The strongest antagonist activities were observed for N-acyloxyacyl glutamine against two prostaglandin receptors, PTGIR and PTGER4 (Fig. 3c, PTGIR IC50 15 µM, PTGER4 IC50 43 µM). PTGIR was specifically antagonized by N-acyloxyacyl glutamine, while PTGER4 was antagonized by N-acyloxyacyl glutamine as well as other N-acyl amides [Fig. 3c(i) and 3c(ii)]. Alternative GPCR screening methods could identify interactions in addition to those uncovered here.
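As a reading aid for the EC50 values quoted above, the following minimal sketch shows how an EC50 relates to fractional receptor activation under a simple Hill model. The text does not state which curve-fitting model was used, so the Hill form, the Hill coefficient of 1, and the function names `hill_response` and `fold_difference` are illustrative assumptions rather than the authors' method. Only the EC50 figures are taken from the text.

```python
def hill_response(conc_um: float, ec50_um: float,
                  emax: float = 1.0, hill: float = 1.0) -> float:
    """Response predicted by a Hill model (assumed here, not stated in the paper).

    conc_um : ligand concentration in micromolar
    ec50_um : concentration giving half-maximal response, in micromolar
    emax    : maximal response (1.0 = full activation, arbitrary units)
    hill    : Hill coefficient (1.0 is an assumption for illustration)
    """
    if conc_um < 0 or ec50_um <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    ratio = (conc_um / ec50_um) ** hill
    return emax * ratio / (1.0 + ratio)


def fold_difference(ec50_potent_um: float, ec50_weak_um: float) -> float:
    """How many times higher the weaker ligand's EC50 is than the more potent one's."""
    return ec50_weak_um / ec50_potent_um


if __name__ == "__main__":
    # EC50 values quoted in the text (micromolar).
    ec50 = {
        "N-palmitoyl serinol / GPR119": 9.0,
        "N-3-hydroxypalmitoyl ornithine / S1PR4": 32.0,
        "N-myristoyl alanine / G2A": 3.0,
    }
    for name, value in ec50.items():
        # By definition, the response at the EC50 is half of emax.
        print(name, hill_response(value, value))
```

By construction, a ligand applied at its EC50 gives half of its own maximal response, so a lower EC50 indicates higher potency but says nothing about the maximal activation a ligand can reach.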
Based on data from the Human Protein Atlas (HPA), GPCRs targeted by human microbial N-acyl amides are localized to the gastrointestinal tract and its associated immune cells. In mouse models, this collection of gastrointestinal tract localized GPCRs has been reported to affect diverse mucosal functions including metabolism (GPR119), immune cell differentiation (S1PR4, PTGIR, PTGER4), immune cell trafficking (S1PR4, G2A) and tissue repair (PTGIR).9–14 It is not possible at this time to look for co-localization of GPCR and hm-NAS gene expression in specific gastrointestinal niches, as neither the HMP nor the HPA is sufficiently comprehensive in its survey of human body sites. Nonetheless, 16S and metagenomic deep sequencing studies link bacteria containing hm-NAS genes or hm-NAS genes themselves to specific locations in the gastrointestinal tract where GPCRs of interest are expressed (Extended Data Fig. 4).

Bacterial and human ligands share structure and function

Human microbiota-encoded N-acyl amides bear structural similarity to endogenous GPCR-active ligands (Fig. 4). The clearest overlap in structure and function between bacterial and human GPCR-active ligands is for the endocannabinoid receptor GPR119 (Fig. 4 and 5). Endogenous GPR119 ligands include oleoylethanolamide (OEA) and the dietary lipid derivative 2-oleoyl glycerol (2-OG).15,16 In our heterologous expression experiment we isolated both the palmitoyl and oleoyl analogs of N-acyl serinol. The latter only differs from 2-OG by the presence of an amide instead of an ester and from OEA by the presence of an additional ethanol substituent. N-oleoyl serinol is a similarly potent GPR119 agonist compared to the endogenous ligand OEA (EC50 12 µM vs. 7 µM) but elicits almost a 2-fold greater maximum GPR119 activation (Fig. 5a). N-palmitoyl derivatives of all 20 natural amino acids were synthesized and none activated GPR119 by more than 37% relative to OEA (Fig. 5b). The generation of a potent and specific long-chain N-acyl-based GPR119 ligand therefore necessitates a more complex biosynthesis than the simple N-acylation of an amino acid as is commonly seen for characterized NAS enzymes. In this case, the biosynthesis of N-acyl serinols is achieved through the coupling of an NAS domain with an aminotransferase that is predicted to generate serinol from glycerol (Extended Data Fig. 2).

The endogenous agonist for S1PR4, sphingosine-1-phosphate (S1P), and the N-3-hydroxypalmitoyl ornithine/lysine family of bacterial agonists share similar head group charges. S1P is a significantly more potent agonist (EC50 0.09 µM vs. EC50 32 µM); however, the bacterial agonists are more specific for S1PR4. The bacterial N-3-hydroxypalmitoyl ornithine did not activate S1PR1, 2, or 3 in our GPCR screen, whereas S1P activates all four members of the S1P receptor family tested. No direct comparison could be made between the microbiota-derived and endogenous ligands for PTGIR or PTGER4, as there are no known endogenous antagonists for these receptors.

Many human GPCRs remain orphan receptors lacking known endogenous ligands. Ligands for at least some of these receptors will undoubtedly be found among the small molecules produced by the human microbiota. G2A is an orphan receptor and therefore does not have a well-defined endogenous agonist, although it has been reported to respond to lysophosphatidylcholine.17,18 We found that the bacterial metabolites N-3-hydroxypalmitoyl glycine (commendamide) and N-palmitoyl alanine both activate G2A.
Mammals produce N-palmitoyl glycine, which differs from commendamide by the absence of the β-hydroxyl and, based on our synthetic N-acyl studies, activates G2A.19 GPR119 is the most extensively studied of the GPCRs activated by bacterial ligands we identified (Fig. 4). Mechanisms that link endogenous GPR119 agonists (OEA, 2-OG) to changes in host phenotype are well defined as a result of the exploration of GPR119 as a therapeutic target for diabetes and obesity.20–23 GPR119 agonists are thought to primarily affect glucose homeostasis, but also gastric emptying and appetite, through both GPR119-dependent hormone release from enteroendocrine cells (GLP-1, GIP, PYY) and pancreatic β-cells (insulin) as well as GPR119-independent mechanisms including PPARα modulation.9,16,24–30 Murine enteroendocrine GLUTag cells have been used as a model system for measuring the ability of potential GPR119 agonists to induce GLP-1 release. When administered to GLUTag cells at equimolar concentrations, microbiota-encoded N-oleoyl serinol or the endogenous ligands OEA and 2-OG induced GLP-1 secretion to the same magnitude (Fig. 5c). To provide an orthogonal measurement of GPR119 activation by N-acyl serinols, HEK293 cells were stably transfected with a GPR119 expression construct. Both OEA and N-oleoyl serinol increased cellular cAMP concentrations in a GPR119-dependent fashion (Extended Data Fig. 5).

hm-NAS expression alters blood glucose in mice

The functional overlap between endogenous and bacterial metabolites suggested that bacteria expressing microbiota-encoded GPR119 ligands might elicit host phenotypes that mimic those induced by eukaryotic ligands.
Endogenous and synthetic GPR119 ligands have been associated with changes in glucose homeostasis that are relevant to the etiology and treatment of diabetes and obesity, including a study where mice were orally administered bacteria engineered to produce a eukaryotic enzyme that increases endogenous GPR119 ligand (OEA) precursors.9,16,24–27,31 The metabolic effect of the endogenous GPR119 ligands is believed to occur at the intestinal mucosa, as the delivery of OEA intravenously fails to lower blood glucose in mice during an oral glucose tolerance test (OGTT).26 Consequently, we sought to determine whether mice colonized with bacteria engineered to produce human microbiota N-acyl serinols would exhibit predictable host phenotypes. Gnotobiotic mice were colonized with E. coli engineered to express the N-acyl serinol synthase gene in an IPTG-dependent manner. Control mice were colonized with E. coli containing an empty vector. Based on the number of colony forming units detected in fecal pellets, both cohorts of mice were colonized to the same extent (Extended Data Fig. 7). After one week of exposure to IPTG, both cohorts were fasted overnight and subjected to an OGTT. At 30 minutes post challenge we observed a statistically significant decrease in blood glucose levels for the group colonized with E. coli expressing the N-acyl serinol synthase gene (Fig. 5d). MS analysis of metabolites present in cecal samples revealed the presence of N-acyl serinols in the treatment cohort but not in the control cohort (Extended Data Fig. 6). Two weeks after withdrawing IPTG from the drinking water, we no longer observed a difference in blood glucose between the two cohorts in an OGTT (Fig. 5e).32 To further explore the metabolic phenotype induced by N-acyl serinols we repeated the OGTT experiment in an antibiotic treated mouse model. In this study we compared mice
colonized with E. coli expressing an active N-acyl serinol synthase to mice colonized with E. coli expressing an NAS point mutant (Extended Data Fig. 8, p.Glu91Ala) that no longer produced N-acyl serinols. In this model the glucose lowering effect of colonization with N-acyl serinol producing E. coli remained significant (Fig. 5f). In the antibiotic treated mice we measured GLP-1 and insulin concentrations after glucose gavage. Both hormones were significantly increased in the treatment group compared to the control group (Fig. 5g, h). In all mouse models the observed correlation between hm-NAS gene induction and increased glucose tolerance is similar in magnitude to several studies with small molecule GPR119 agonists including glyburide, an FDA approved therapeutic for diabetes.24,25

Discussion

Our characterization of human microbial N-acyl amides, together with other investigations of the human microbiota, suggests that host-microbial interactions may rely heavily on simple metabolites built from the same common lipids, sugars, and peptides that define many human signaling systems (e.g., neurotransmitters, bioactive lipids, glycans). This is not surprising, as the genomes of the bacterial taxa common to the human gastrointestinal tract (e.g., Bacteroidetes, Firmicutes and Proteobacteria) often lack gene clusters that encode the production of complex secondary metabolites (e.g., polyketides, nonribosomal peptides, terpenes). It appears that biosynthesis of endogenous mammalian signaling molecules, as well as those produced by the human microbiota, may rely on the modest manipulation of primary metabolites. As a result, the structural conservation between metabolites used in host-microbial interactions and endogenous mammalian signaling metabolites may be a common phenomenon in the human microbiome.
Evolutionarily, the convergence of bacterial and human signaling systems through structurally related GPCR ligands is not unreasonable, as GPCRs are thought to have developed in eukaryotes to allow for structurally simple signaling molecules to regulate increasingly complex cellular interactions.33–35 The structural similarities between microbiota-encoded N-acyl amides and endogenous GPCR-active lipids may be indicative of a broader structural and functional overlap among bacterial and human bioactive lipids including other GPCR-active N-acyl amides, eicosanoids (prostaglandins, leukotrienes) and sphingolipids. Sphingolipid based signaling molecules may also be common in the human microbiome, as prevalent bacterial species are known to synthesize membrane sphingolipids.36 The GPCRs with which bacterial N-acyl amides were found to interact are all part of the same “lipid-like” GPCR gene family. The potential importance of this GPCR family to the regulation of host-microbial interactions is suggested by their localization to areas of the gastrointestinal tract enriched in bacteria that are predicted to synthesize GPCR ligands (Extended Data Fig. 4). Lipid-like GPCRs have been shown to play roles in disease models that are correlated with changes in microbial ecology including colitis (S1PR4, PTGIR, PTGER4), obesity (GPR119), diabetes (GPR119), autoimmunity (G2A) and atherosclerosis (G2A, PTGIR).9,10,13,14 The fact that the expression of an NAS gene in a gastrointestinal colonizing bacterium is sufficient to alter host physiology suggests that the interaction between lipid-like GPCRs and their N-acyl amide ligands could be relevant to human physiology and warrants further study. By LCMS analysis we observed most of the
microbiota encoded N-acyl amides reported here in human stool samples (Extended Data Fig. 9). Further studies will be needed to better define the distribution and concentration of these metabolites throughout the gastrointestinal tract especially at the mucosa where the physiologic activity of these metabolites likely occurs. Interestingly, Gemella spp. predicted to encode N-acyl serinols are tightly associated with the small intestinal mucosa supporting this site as a potentially important location for N-acyl amide mediated interactions.37 As the mouse model system used here relies on induced expression of NAS genes it will also be important to understand how these genes are natively regulated. Current strategies for treating diseases associated with the microbiome such as inflammatory bowel disease or diabetes are not believed to address the dysfunction of the host-microbial interactions that are likely part of the disease pathogenesis. Bacteria engineered to deliver bioactive small molecules produced by the human microbiota have the potential to help address diseases of the microbiome by modulating the native distribution and abundance of these metabolites. Regulation of GPCRs by microbiota-derived N-acyl amides is a particularly attractive therapeutic strategy for the treatment of human diseases as GPCRs have been extensively validated as therapeutic targets. As our mechanistic understanding of how human microbiota-encoded small molecules effect changes in host physiology grows, the potential for using “microbiome-biosynthetic-gene-therapy” to treat human disease by complementing small molecule deficiencies in native host-microbial interactions with microbiota derived biosynthetic genes should increase accordingly. 
The use of functional metagenomics to identify microbiota encoded effectors, combined with bioinformatics and synthetic biology to expand effector molecule families, provides a generalizable platform to help define the role microbiota-encoded small molecules play in host-microbial interactions.

Methods

Bioinformatics analysis of human N-acyl synthase genes

Protein sequences for members of the PFAM family 13444 Acetyltransferase (GNAT) domain (http://pfam.xfam.org/family/PF13444) (n=689) were downloaded and corresponding gene sequences identified based on European Bioinformatics Institute (EBI) number. A multiple sequence alignment was performed using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/), generating a phylogenetic tree in Newick format with the “--guidetree-out” option. The 689 PFAM sequences were queried against the Human Microbiome Project (HMP) clustered gene index datasets and reference genome datasets with BLASTN (http://hmpdacc.org/HMGC/). The PFAM13444 sequences that aligned to a HMP gene [expectation (E) value 70% identity] were identified and comprise the human N-acyl synthase (hm-NAS) gene dataset (143 hm-NAS genes). Reference genomes for 111/143 hm-NAS genes were identified (Supplementary Table 2). To determine the abundance of hm-NAS genes within microbiomes at specific human body sites, hm-NAS genes were queried against HMP whole metagenome shotgun sequencing data on a per body site basis (http://hmpdacc.org/HMASM/). Each hm-NAS gene was BLASTN searched against the non-redundant gene sets from the following body sites: buccal mucosa, anterior nares, mid vagina, posterior fornix, vaginal introitus, retroauricular crease (combined left and right), stool, supragingival plaque and tongue dorsum. These body
sites were chosen because they contained sequence data from the largest number of unique patients.38 hm-NAS genes and highly similar genes in the HMP non redundant gene set (E-value < e−40) were aligned to shotgun sequencing reads from each patient sample taken from different sites in the human microbiome. Aligned reads were normalized to hm-NAS gene length and sequencing depth of each dataset. The normalized counts of the reads aligned to each hm-NAS gene, or its highly similar gene from the HMP non redundant gene set, were scaled [0–1], color coded per body site, and added as concentric rings around the phylogenetic tree (Fig. 1a). To determine the variability and distribution of hm-NAS genes that correspond to specific N-acyl amide families 1–6 (Fig. 1) in the human microbiome, normalized read counts for hm-NAS genes from each N-acyl amide family were plotted separately per body site as Reads per Kilobase of Gene Per Million Reads (RPKM) (Fig. 1c). The tree in Figure 1 was plotted using graphlan (https://huttenhower.sph.harvard.edu/graphlan).

Analysis of metatranscriptome datasets

Two RNAseq datasets were identified with multiple patient samples taken from separate sites in the human microbiome.39,40 One RNAseq dataset was part of the HMP (http://hmpdacc.org/RSEQ/) and generated from supragingival samples taken from twin pairs with and without dental caries. The second RNAseq dataset was generated from stool samples and compared different RNA extraction methods. We used only samples labeled “whole”, which functioned as controls for the original study.39 Alignment of all hm-NAS genes to each dataset only identified hm-NAS genes from N-acyl amide families 1 and 2 in the RNAseq datasets (1 in stool, 2 in supragingival plaque). To explore whether hm-NAS gene expression might vary in patient samples we performed two different analyses.
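The read normalization described above for Fig. 1 (length and depth normalization expressed as RPKM, followed by scaling to [0–1]) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names are hypothetical, and min-max scaling is an assumption, since the text says only that normalized counts were "scaled [0–1]".

```python
def rpkm(read_count, gene_length_bp, total_reads):
    """Reads per kilobase of gene per million reads in the dataset."""
    if gene_length_bp <= 0 or total_reads <= 0:
        raise ValueError("gene length and total read count must be positive")
    kilobases = gene_length_bp / 1_000
    millions_of_reads = total_reads / 1_000_000
    return read_count / kilobases / millions_of_reads


def scale_unit_interval(values):
    """Min-max scale values to [0, 1] (assumed scaling scheme)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values identical: no spread to scale, so return zeros.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


if __name__ == "__main__":
    # Hypothetical counts for one gene across three samples.
    samples = [
        {"reads": 500, "gene_bp": 1_000, "depth": 10_000_000},
        {"reads": 0, "gene_bp": 1_000, "depth": 8_000_000},
        {"reads": 120, "gene_bp": 1_000, "depth": 12_000_000},
    ]
    normalized = [rpkm(s["reads"], s["gene_bp"], s["depth"]) for s in samples]
    print(normalized)
    print(scale_unit_interval(normalized))
```

Dividing by gene length and sequencing depth before scaling keeps long genes and deeply sequenced samples from dominating the comparison across body sites.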
In the first analysis we identified reference genomes containing hm-NAS genes identical to those we used in heterologous expression experiments for molecule families 1 and 2 (Bacteroides dorei for compound 1, Capnocytophaga ochracea for compound 2). RNAseq reads were aligned to all of the genes from each reference genome. For each genome the average per gene read density normalized for gene length was compared to the read density seen for the hm-NAS gene. The percentile of the normalized expression of each hm-NAS gene was then plotted (0 for not expressed, 1 for the most expressed) and compared between patient samples for each RNAseq dataset (Fig. 2a). In the second analysis the direct correlation between DNA and RNA abundance was determined for the stool metatranscriptome dataset for which DNA reads were also available.39 RNAseq and shotgun-sequenced DNA reads were aligned to the 15 hm-NAS genes from N-acyl amide family 1 that encoded for N-acyl glycines (Supplementary Table 1). The reads were normalized (RPKM) and each hm-NAS gene from each patient sample was plotted as a single point with DNA and RNA read counts on the X and Y axes, respectively (Fig. 2b).

Heterologous expression of PFAM13444 genes in Escherichia coli

The 44 hm-NAS genes we examined by heterologous expression were codon optimized, appended with NcoI and NdeI sites at the N and C terminus respectively and synthesized by Gen9. Genes obtained from Gen9 were digested with NdeI and NcoI and ligated into the corresponding restriction sites in pET28c (Novagen). For heterologous expression purposes the resulting constructs were transformed into E. coli EC100 containing the T7 polymerase
gene integrated into its genome (E. coli EC100:DE3). E. coli EC100:DE3 hm-NAS containing strains were inoculated into 10 ml of Luria-Bertani (LB) broth supplemented with kanamycin (50 µg/ml) and grown overnight (37 °C with shaking at 200 rpm). One ml of overnight culture was used to inoculate 50 ml of LB supplemented with kanamycin (50 µg/ml) and isopropyl β-D-1-thiogalactopyranoside (IPTG) (25 µM). Cultures were incubated at 30 °C for 4 days with shaking (200 rpm). Each culture broth was extracted with an equal volume of ethyl acetate and the resulting crude extracts were dried in vacuo. Crude extracts were resuspended in 50 µL of methanol and analyzed by reversed-phase HPLC-MS (XBridge™ C18, 4.6 mm × 150 mm) using a binary solvent system (A/B solvent of water/acetonitrile with 0.1% formic acid: 10% B isocratic for 5 minutes, gradient 10% to 100% B over 25 minutes). Clone-specific metabolites encoded by each hm-NAS gene were identified by comparing experimental extracts with extracts prepared from cultures of E. coli EC100:DE3 transformed with an empty pET28c vector.

N-acyl amide isolation and structure determination

For each group of clones that, based on LCMS analysis, were predicted to produce a different N-acyl amide family, we chose one representative clone for use in molecule isolation studies. Each representative clone was grown in 1.5 L of LB in a 2.7 L Fernbach flask (30 °C, 200 rpm). After 4 days, cultures were extracted 2 times with an equal volume of ethyl acetate. Dried ethyl acetate extracts were partitioned by reversed-phase flash chromatography (Teledyne Isco, C18 RediSep Rf Gold™ 15 g) using the following mobile phase conditions: water:acetonitrile with 0.1% formic acid, 10% acetonitrile isocratic for 5 minutes, gradient to 100% acetonitrile over 20 minutes (30 mL/minute).
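The two solvent programs described above (analytical HPLC-MS and flash chromatography) can be sketched as a simple piecewise function of time. This is an illustrative sketch only, not part of the original methods: the function name `percent_b` and its parameters are hypothetical, the ramp is assumed to be linear, and the composition is assumed to stay at the final value once the ramp ends.

```python
def percent_b(t_min, hold_min=5.0, ramp_min=25.0, start_b=10.0, end_b=100.0):
    """Return the % acetonitrile (solvent B) at time t_min (minutes).

    Assumes an isocratic hold at start_b, then a linear ramp to end_b.
    Defaults follow the analytical HPLC-MS program in the text:
    10% B for 5 minutes, then 10% to 100% B over 25 minutes.
    """
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= hold_min:
        return start_b                      # isocratic hold
    if t_min >= hold_min + ramp_min:
        return end_b                        # assumed: held at final composition
    fraction = (t_min - hold_min) / ramp_min
    return start_b + fraction * (end_b - start_b)


if __name__ == "__main__":
    # Analytical program: 5 min hold, 25 min ramp
    for t in (0, 5, 17.5, 30):
        print(f"HPLC-MS  t={t:>5} min  B={percent_b(t):.1f}%")
    # Flash program: 5 min hold, 20 min ramp to 100% acetonitrile
    for t in (0, 15, 25):
        print(f"flash    t={t:>5} min  B={percent_b(t, ramp_min=20.0):.1f}%")
```

Under these assumptions, the midpoint of each ramp (17.5 minutes for the analytical run, 15 minutes for the flash run) corresponds to 55% acetonitrile.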
Fractions containing clone-specific metabolites were pooled and semi-preparative reversed-phase HPLC was used to separate individual N-acyl amide molecules (Supplementary Information). The structures of compounds 2–6 were determined using a combination of HRMS, 1H, 13C, and 2D NMR data (Supplementary Information). Compound 1 was described in our previous study.3

hm-NAS gene containing bacterial species culture broth analysis

Capnocytophaga ochracea F0287 (compound 2), Klebsiella pneumoniae WGLW1–5 (compound 3), Neisseria flavescens SK114 (compounds 4a and 4b), and Gemella haemolysans M341 (compound 6) were obtained from the Biodefense and Emerging Infections Research Resources Repository (BEI Resources) HMP catalogue. Compound 1 was previously identified in culture broth extracts from cultures of Bacteroides vulgatus.3 Each chosen bacterium contains an hm-NAS gene related to that which was heterologously expressed to produce compound 2, 3, 4a, 4b or 6. Strains were inoculated under sterile conditions into 2 L of LYBHI medium [brain–heart infusion medium supplemented with 0.5% yeast extract (Difco), 5 mg/L hemin (Sigma), 1 mg/ml cellobiose (Sigma), 1 mg/ml maltose (Sigma), 0.5 mg/ml cysteine (Sigma)] and grown anaerobically (C. ochracea) or aerobically (N. flavescens, G. haemolysans, K. pneumoniae) for 7 days. Culture broths were extracted with an equal volume of ethyl acetate. To look for the presence of N-acyl amides, these extracts were examined by HPLC-MS as was done in the original heterologous expression experiments. With the exception of family 3, the N-acyl metabolite that was