E. Dugat-Bony, et al. Food Microbiology 85 (2020) 103278

pooled thirteen sequenced samples. Therefore, reads from identical phages present in several samples were binned together during the assembly process. Assembly was performed using SPAdes v3.13.0 with the meta option and increasing k-mer values (-k 21,33,55,77,99,127) (Bankevich et al., 2012). Contigs shorter than 2500 bp, unlikely to encode complete viruses, were discarded.
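The length filter applied to the assembled contigs can be illustrated with a minimal Python sketch. It is not the authors' script: the helper names `read_fasta` and `filter_contigs` are hypothetical, and the only value taken from the text is the 2500 bp threshold.

```python
from typing import Dict, Iterable, Iterator, Tuple

MIN_CONTIG_LENGTH = 2500  # bp; contigs below this length are discarded (from the text)


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_contigs(lines: Iterable[str],
                   min_length: int = MIN_CONTIG_LENGTH) -> Dict[str, str]:
    """Keep only contigs whose total length is at least min_length bp."""
    return {name: seq for name, seq in read_fasta(lines) if len(seq) >= min_length}
```

In practice `lines` would be an open file handle on the assembler's contig FASTA output; any iterable of strings works, which keeps the sketch easy to test.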
The resulting 910 filtered contigs were first analyzed with VirSorter v1.0.3 (Roux et al., 2015) using the Viromes database (all bacterial and archaeal virus genomes in RefSeq as of January 2014, together with non-redundant predicted genes from viral metagenomes), which returned 92 putative viral contigs (phages or prophages). A list of “most abundant and pertinent” contigs was constructed using three criteria:

1) Detected as viral with VirSorter, after filtering out those placed in categories 3 or 6 (“not so sure”) with coverage below 10 (78 contigs remaining). Coverage was used as a proxy for abundance instead of percentage of reads because this criterion is independent of contig size. Contigs with a coverage below 10 corresponded approximately to contigs in the first quartile when ordered by coverage, i.e. the 25% least abundant contigs.
2) Detected as a circular contig, even if not detected as viral (adds 34 contigs).
3) Coverage above 100, even if not detected as viral or circular (final list of 124 contigs). Contigs with a coverage above 100 corresponded approximately to contigs above the third quartile when ordered by coverage, i.e. the 25% most abundant contigs.

The 124 final contigs were further analyzed using PHASTER (Arndt et al., 2016) and compared to both the Viruses section of the NCBI nucleotide database and the complete database using BLAST with default parameters (Altschul et al., 1990), in order to provide a first characterization of putative phage-encoding contigs. Then, quality-filtered reads (both paired and unpaired) were mapped individually for each sample against the 124 potential viral contigs using the Bowtie2 aligner v2.2.6 with default parameter values (Langmead and Salzberg, 2012) in order to produce an abundance table. Principal coordinate analysis (PCoA) based on Bray-Curtis dissimilarity, composition analysis and heatmaps were produced with the R package phyloseq v1.26.1 (McMurdie and Holmes, 2013).
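The three contig selection criteria described above can be summarized as a small decision rule. The Python sketch below is only an illustration under stated assumptions: the `Contig` record and its field names are hypothetical, and criterion 1 is read as excluding only the viral contigs that are both in category 3 or 6 and below a coverage of 10.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

LOW_CONFIDENCE_CATEGORIES = {3, 6}  # VirSorter "not so sure" categories (from the text)


@dataclass
class Contig:
    """Hypothetical record; field names are assumptions for illustration."""
    name: str
    coverage: float
    virsorter_category: Optional[int]  # None when not detected as viral
    circular: bool


def is_selected(c: Contig, low_cov: float = 10.0, high_cov: float = 100.0) -> bool:
    """Return True if the contig meets at least one of the three criteria."""
    viral = c.virsorter_category is not None
    # Assumed reading of criterion 1: drop low-confidence AND low-coverage predictions.
    weak = (viral
            and c.virsorter_category in LOW_CONFIDENCE_CATEGORIES
            and c.coverage < low_cov)
    if viral and not weak:
        return True          # criterion 1: confident or well-covered viral contig
    if c.circular:
        return True          # criterion 2: circular, even if not viral
    return c.coverage > high_cov  # criterion 3: highly covered contig


def select_contigs(contigs: Iterable[Contig]) -> List[str]:
    """Names of the contigs retained for downstream analysis, in input order."""
    return [c.name for c in contigs if is_selected(c)]
```

Evaluating the criteria as a single "any of" rule gives the same final set as applying them one after another, since each later criterion only adds contigs to the list.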
In order to measure the effect of the extraction procedure on the composition of the viral metagenome, Spearman correlations were calculated for each pair of protocols (P1 vs P2, P1 vs P3, P1 vs P4, P2 vs P3, P2 vs P4 and P3 vs P4) on the mean log10 abundance values and visualized with a scatter plot using the ggpubr package v0.2 (Kassambara, 2018). The level of bacterial DNA contamination in the cheese viromes was estimated by detection of ribosomal DNAs among the reads using SortMeRNA v2.0 (Kopylova et al., 2012) and the SILVA v129 database (Quast et al., 2013) with default parameters (search in both strands, e-value threshold = 1, similarity threshold = 0.97, query coverage = 0.97). The three most prevalent contigs present in the final virome, namely Epvir4, 949_Epvir1 and Epvir8, were further annotated using RAST (Aziz et al., 2008) with genetic code 11 and the “virus” option. The predicted proteins in the contigs were additionally compared to the NCBI nr database with BlastP (E-value cutoff of 10⁻⁸) (Altschul et al., 1990), and to the PDB database using HHpred (Zimmermann et al., 2018) with a probability cutoff of 99%.

2.9. Accession numbers

Raw sequence data were deposited at the Sequence Read Archive of the National Center for Biotechnology Information under the accession numbers SRR8080803 to SRR8080815 (bioproject PRJNA497596).

3. Results

3.1. Quantification and purity of viral particles recovered from the cheese surface

Among the four tested procedures, only those containing chloroform treatments (P2 and P4) resulted in viral fractions sufficiently pure to allow nanoparticle counting using the interferometric light microscope (Table 1). We observed 1 × 10⁹ to 4 × 10¹⁰ nanoparticles per gram of cheese surface. The counts observed in samples treated with P4 were significantly lower than those for samples treated with P2 (one-tailed t-test for two dependent means, p < 0.05), indicating a loss of particles due to the filtration step (Table 1).
Counts were significantly higher in Camembert than in Saint-Nectaire cheese (one-tailed t-test for two independent means, p < 0.05). For Epoisses cheese, counts were slightly higher than in Saint-Nectaire and slightly lower than in Camembert, but those differences were not statistically significant (one-tailed t-test for two independent means, p > 0.05). Camembert had the highest ratio of nanoparticles per microbial cell. This type of cheese had the lowest bacterial counts (Table S2); whether such a community leads to higher virus-mediated cell lysis or to higher levels of membrane vesicles remains to be investigated.

Viral fractions prepared using procedures P1 and P3, without chloroform treatment, were very dense and milk-white, and produced heavy noise in the interferometer films, with a myriad of spots larger than those typically observed for viruses. Microbial cells clearly contaminated the viral fractions when using procedures P1 and P2, without filtration (Table 1, column 3). We also observed bacterial cells in one filtered sample, namely Camembert cheese with P3, indicating probable post-filtration contamination.

Two-layer iodixanol gradients were used as quality controls in order to separate microbial cells, debris and membrane vesicles from viruses, making it possible to visually assess the effect of the four procedures on the quantity and purity of viral fractions (Fig. 2). For the three types of cheese tested, similar profiles were observed after ultracentrifugation. A strong band located at the top of the lightest density layer was present in samples prepared without filtration and chloroform treatment (P1), suggesting high contamination with microbial cells, debris and membrane vesicles (Fig. 2, red arrow). This band was still present, albeit slightly less intense, in filtered samples (P3), suggesting that cheese rind is rich in membrane vesicles.
Finally, this band was completely absent from samples prepared with procedures including chloroform treatment (P2 and P4), confirming the efficiency of such treatment for removing membrane vesicles from viral fractions (Biller et al., 2017). A visible band containing viruses formed close to the 45% iodixanol cushion (Fig. 2, blue arrow), at a density corresponding to 40% iodixanol, as estimated with a refractometer. The virus band was barely visible, however, in samples treated with chloroform (P2 and P4) when compared to untreated samples (P1 and P3). Nanoparticle quantification using interferometry in the viral band of Epoisses samples after dialysis indicated that samples treated with chloroform (P2 and P4) contained approximately ten times fewer nanoparticles than untreated samples (P1 and P3). For Camembert cheese and procedure P3, we observed two distinct bands near the 45% iodixanol cushion. Transmission electron microscopy revealed that viruses were more abundant in the lower one.

Altogether, these results suggest that filtration is an important step for the extraction of viruses from cheese in order to deplete microbial cells, which represent the major source of DNA contamination in virome studies. Furthermore, chloroform treatment is very beneficial for cleaning viral fractions and removing membrane vesicles, which might contain pieces of microbial DNA (see below), but at the cost of reduced nanoparticle recovery.

3.2. Effect of the extraction procedure on the composition of the Epoisses virome

We selected Epoisses cheese for further virome analysis because it had high counts of both nanoparticles and bacteria (Table 1 and Table S2), bacteria being expected as the main targets of microbial viruses (e.g. bacteriophages or phages) in the cheese ecosystem. DNA was successfully extracted from viral fractions produced from Epoisses cheese using the four procedures, prior to the gradient step. DNA yields are available in Table S3.
DNA samples were then sequenced, in order to assess the impact of both filtration and chloroform treatment