De novo genome assembly and analysis unveil biosynthetic and metabolic potentials of Pseudomonas fragi A13BB. [Dataset]


Opeyemi K. Awolope
Data Collector

Noelle H. O'Driscoll
Data Collector

Alberto Di Salvo
Data Collector

Andrew J. Lamb
Data Collector


Objectives: The role of the rhizosphere microbiome in supporting plant growth under biotic stress is well documented. Rhizobacteria ward off phytopathogens through various mechanisms, including antibiosis. We sought to recover novel antibiotic-producing bacterial strains from soil samples collected from the rhizosphere. Pseudomonas fragi A13BB was recovered as part of this effort, and its whole genome was sequenced to facilitate mining for potential antibiotic-encoding biosynthetic gene clusters. Data description: Here, we report the complete genome sequence of P. fragi A13BB obtained from de novo assembly of Illumina MiSeq and GridION reads. The 4.94 Mb genome consists of a single chromosome with a GC content of 59.40%. Genomic features include 4410 CDSs, 102 RNAs, 3 CRISPR arrays, 3 prophage regions, and 37 predicted genomic islands. Two β-lactone biosynthetic gene clusters were identified; the metabolic products of such clusters are known to show antibiotic and/or anticancer properties. A siderophore biosynthetic gene cluster was also identified, even though P. fragi is considered a non-siderophore-producing pseudomonad. Other gene clusters of broad interest include those associated with bioremediation, biocontrol, plant growth promotion, or environmental adaptation. This dataset unveils various unexplored or underexplored metabolic and biosynthetic potentials of P. fragi and provides insight into the molecular mechanisms underpinning these attributes.


AWOLOPE, O.K., O'DRISCOLL, N.H., DI SALVO, A. and LAMB, A.J. 2021. De novo genome assembly and analysis unveil biosynthetic and metabolic potentials of Pseudomonas fragi A13BB. [Dataset]. BMC genomic data [online], 22, article number 15. Available from:

Acceptance Date May 4, 2021
Online Publication Date May 18, 2021
Publication Date Dec 31, 2021
Deposit Date May 24, 2021
Publicly Available Date May 24, 2021
Publisher Springer Nature [academic journals on]
Keywords Pseudomonas fragi; β-Lactone antibiotics; Plant growth-promoting rhizobacteria; Rhizosphere microbiome
Public URL
Type of Data 6 PDF files, 3 PNG files, 1 FASTQ file, 1 FASTA file and a supporting text file.
Collection Date Dec 6, 2020
Collection Method P. fragi A13BB was isolated from the rhizosphere of a plant in Aberdeen, Scotland (57.101N 2.078W) using an ultra-minimal substrate medium (data file 1) [7]. The purified strain was cultivated in nutrient broth (Oxoid, UK) at 28°C for 24 h before gDNA was extracted from pellets with the DNeasy® Ultraclean® Microbial Kit for DNA Isolation (Qiagen, UK). The extract was used as the template to amplify the 16S rRNA gene by PCR using the 27F and U1510R universal primers, with thermocycler parameters set as follows: initial denaturation at 95°C for 2 min, followed by 30 cycles of denaturation at 95°C for 30 s, primer annealing at 45°C for 30 s and elongation at 72°C for 105 s. A final elongation was carried out at 70°C for 5 min. The amplified DNA fragment was sequenced using the 27F primer. The isolate was subsequently identified by 16S rRNA gene comparison as P. fragi, with a 99% identity score. Libraries were prepared for Illumina sequencing by Glasgow Polyomics (Glasgow, UK) using the Nextera XT DNA Library Preparation Kit (Illumina, USA) following the manufacturer's protocol, and sequenced on the Illumina MiSeq using a 300 bp paired-end protocol. Libraries were prepared for GridION sequencing by MicrobesNG (Birmingham, UK) using the Oxford Nanopore SQK-RBK004 kit and/or SQK-LSK109 kit with Native Barcoding EXP-NBD104/114 (ONT, UK), and sequenced on a FLO-MIN106 (R.9.4 or R.9.4.1) flow cell in a GridION (ONT, UK). Illumina reads were trimmed with Trimmomatic [8] v0.36 operated in sliding-window mode with a Q25 quality cut-off and a minimum read length of 100. The quality of the trimmed reads was assessed with FastQC [9] v0.11.8, and the results were aggregated with MultiQC [10] v1.8 (data file 2) [11]. The mean quality score at each base position was ≥31. Quality assessment of the GridION reads was performed with NanoPlot [12] v1.28.2. Quality statistics are summarised in data file 3 [13], while the average read quality plot is displayed in data file 4 [14].
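The sliding-window trimming step described above can be sketched in a few lines of Python. This is a simplified illustration of the idea, not Trimmomatic's implementation: the window size of 4 is an assumption (the text gives only the Q25 cut-off and the 100-base minimum length), Phred+33 quality encoding is assumed, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple


def phred33_scores(quality_string: str) -> List[int]:
    """Decode a FASTQ quality string (Phred+33 encoding assumed) into scores."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_trim(
    seq: str,
    qual: str,
    window: int = 4,          # assumption: window size is not stated in the text
    min_mean_q: float = 25.0,  # Q25 cut-off from the text
    min_len: int = 100,        # minimum read length from the text
) -> Optional[Tuple[str, str]]:
    """Cut the read at the first window whose mean quality drops below the
    threshold, then discard the read if what remains is too short."""
    scores = phred33_scores(qual)
    keep = len(seq)  # keep the whole read unless a poor window is found
    for start in range(0, len(scores) - window + 1):
        window_scores = scores[start:start + window]
        if sum(window_scores) / window < min_mean_q:
            keep = start
            break
    if keep < min_len:
        return None  # read is dropped
    return seq[:keep], qual[:keep]


if __name__ == "__main__":
    # 110 high-quality bases (Q40, 'I') followed by 10 poor ones (Q2, '#').
    read_seq = "A" * 120
    read_qual = "I" * 110 + "#" * 10
    trimmed = sliding_window_trim(read_seq, read_qual)
    print(None if trimmed is None else len(trimmed[0]))
```

A read whose quality collapses before position 100 is discarded outright, which is why the minimum-length setting matters as much as the quality cut-off when estimating how much data survives trimming.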
Paired short reads and long reads were assembled de novo with Unicycler [15] v0.4.8. Assembly quality was assessed with Quast [16] v5.0.2. Two contigs were identified (data file 5) [17]. The smaller contig (5386 bp), representing the complete genome of bacteriophage φX174 (the control spike-in for Illumina sequencing), was subsequently extracted from the data. The larger contig (4,940,458 bp) represents the complete genome of P. fragi A13BB, with sequencing depths of 226x and 32x for Illumina and GridION sequencing, respectively. Assembly completeness was 99.2% as assessed with BUSCO [18] v4.1.2 using the pseudomonadales_odb10 lineage dataset (data file 6) [19]. The assembly graph was visualised with Bandage [20] and is displayed in data file 7 [21]. ANI analysis with the FastANI tool [22] v1.3 confirmed the identity as P. fragi, with an ANI value of 98.9071%. Gene and functional annotations were performed with PGAP [23] v4.13 and RASTtk [24] v2.0. Metabolic pathway analyses were performed using the KEGG database [25] Rel 93.0. CRISPRs were identified by CRISPRCasFinder [26], genomic islands were predicted by IslandViewer 4 [27], prophages were identified by PHASTER [28] and smBGCs were identified with antiSMASH [29] v5.1.2. All bioinformatics tools used for genome assembly and analyses were operated with default parameters or as specified in the text. The complete genome of P. fragi A13BB comprises a single chromosome 4,940,458 bp in size with a GC content of 59.40%. Genomic features include 4410 CDSs, 25 rRNAs, 73 tRNAs, 4 ncRNAs, 3 CRISPRs, 3 prophage regions and 37 predicted genomic islands (data file 8) [30]. In addition, 353 subsystems comprising various gene clusters, including those associated with bioremediation, environmental adaptation, biocontrol, and plant growth promotion, were identified (data file 9) [31]. Two β-lactone smBGCs, both showing low homology (20%) to known smBGCs, were identified. β-lactones are known for their antibiotic, anticancer and antiobesity properties [32].
A siderophore smBGC was identified, even though P. fragi is considered a non-siderophore-producing member of the genus Pseudomonas [33]. Arylpolyene and NAGGN smBGCs were also identified which, along with the siderophore smBGC, are likely to contribute to the environmental fitness of the strain [34,35,36]. Table 1 provides the links to data files 1–9. We believe the data presented in Pseudomonas fragi strain A13BB chromosome, complete genome [39] and in this data note form a sound basis for further in-depth study of the metabolic and biosynthetic capabilities of this strain, and indeed of other closely related species. The dataset also provides useful insights into the molecular mechanisms that underpin these capabilities. Furthermore, being only the fourth publicly available complete genome sequence of P. fragi, the data will enrich comparative genomics studies of the species. Limitations: IslandViewer 4 was run with default parameters. Crucially, IslandPick was run with default comparison genomes; different comparison genomes at different phyletic distances may influence the output of the analysis, i.e. the number of predicted genomic islands.
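The genome-level summary statistics reported in the collection method (chromosome length, GC content, sequencing depth) are simple to recompute from a FASTA file and a base count. The Python sketch below shows one way to do so; it is a minimal illustration, not the authors' workflow. All function names are hypothetical, and depth is approximated here as total sequenced bases divided by genome length, which ignores reads that do not map to the assembly.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {record id: sequence}."""
    records: Dict[str, str] = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record id is the first whitespace-delimited token of the header.
            current_id = line[1:].split()[0]
            records[current_id] = ""
        elif current_id is not None:
            records[current_id] += line
    return records


def gc_content(sequence: str) -> float:
    """Return GC content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / acgt


def mean_depth(total_bases: int, genome_length: int) -> float:
    """Approximate mean depth: sequenced bases divided by genome length."""
    return total_bases / genome_length


def bases_for_depth(depth: int, genome_length: int) -> int:
    """Number of sequenced bases implied by a given mean depth."""
    return depth * genome_length


if __name__ == "__main__":
    toy = parse_fasta(">contig_1 toy example\nGGCCAATT\n")
    print(gc_content(toy["contig_1"]))
    # Chromosome length and Illumina depth as reported in the text.
    print(bases_for_depth(226, 4_940_458))
```

Under this approximation, the reported 226x Illumina depth over the 4,940,458 bp chromosome corresponds to roughly 1.12 billion sequenced bases.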