Identification of two distinct phylogenomic lineages and model strains for the understudied cystic fibrosis lung pathogen Burkholderia multivorans

Burkholderia multivorans is the dominant Burkholderia pathogen recovered from lung infection in people with cystic fibrosis. However, as an understudied pathogen there are knowledge gaps in relation to its population biology, phenotypic traits and useful model strains. A phylogenomic study of B. multivorans was undertaken using a total of 283 genomes, of which 73 were sequenced and 49 phenotypically characterized as part of this study. Average nucleotide identity (ANI) analysis and phylogenetic alignment of core genes demonstrated that the B. multivorans population separated into two distinct evolutionary clades, defined as lineage 1 (n=58 genomes) and lineage 2 (n=221 genomes). To examine the population biology of B. multivorans, a representative subgroup of 77 B. multivorans genomes (28 from the reference databases and the 49 novel short-read genome sequences) was selected based on multilocus sequence typing (MLST), isolation source and phylogenetic placement criteria. Comparative genomics was used to identify B. multivorans lineage-specific genes (ghrB_1 in lineage 1 and glnM_2 in lineage 2), and diagnostic PCRs targeting them were successfully developed. Phenotypic analysis of 49 representative B. multivorans strains showed considerable inter-strain variance, but the majority of the isolates tested were motile and capable of biofilm formation. A striking absence of B. multivorans protease activity in vitro was observed, but no lineage-specific phenotypic differences were demonstrated. Using phylogenomic and phenotypic criteria, three model B. multivorans CF strains were identified, BCC0084 (lineage 1), BCC1272 (lineage 2a) and BCC0033 (lineage 2b), and their complete genome sequences were determined. B. multivorans CF strains BCC0033 and BCC0084, and the environmental reference strain, ATCC 17616, were all capable of short-term survival within a murine lung infection model.
By mapping the population biology and identifying lineage-specific PCRs and model strains, we provide much-needed baseline resources for future studies of B. multivorans.


Figure S2. Growth parameters of the B. multivorans strains (n = 50) at 48 hours. The box plots were drawn from data produced in R using the GroFit package: (A) growth rate (h⁻¹), (B) lag phase (hours), and (C) maximum growth (OD 480-520 nm). All box plots are annotated with the model strains on the right-hand side (red) and display the upper quartile, mean, and lower quartile. Outliers are also shown. Each parameter was compared by B. multivorans lineage, and no statistically significant lineage-specific differences in these growth parameters were identified (p > 0.05 for comparison of means for all plots).

Figure S3. Swimming and swarming motility in the B. multivorans strain panel after 24 hours of incubation at 37 °C. (A) Swimming motility was assessed on 0.3% LB agar (n = 49). Swarming motility was assessed on (B) 0.5% LB agar (n = 48) and (C) 0.5% BSM-G agar (n = 44). Model B. multivorans strains (n = 4) are annotated in red on the right-hand side of each box plot. No statistically significant associations between motility and B. multivorans lineage were identified.

Figure S5. Biofilm formation of the B. multivorans strain panel (n = 49) after 24 hours. The amount of biofilm was assessed using the crystal violet assay, with results read on a plate reader at 600 nm. Biofilm controls were B. multivorans ATCC 17616 as the 'high' biofilm former and BCC0010 (blue) as the 'low' control. Model CF B. multivorans strains are noted in red on the right-hand side of the box plot. Box plots show the mean, lower quartile, and upper quartile for biofilm staining. Outliers are also shown. No statistically significant associations between biofilm formation and B. multivorans lineage were identified.

Figure S6. Comparison of biofilm formation between the B. multivorans lineages. B. multivorans ATCC 17616 was used as a high biofilm control and included in all the lineage comparisons made using the B. multivorans sub-panel (n=49 strains compared). The box plots show: (A) comparison of biofilm formation between lineage 1 (n = 17) and lineage 2 (n = 30); (B) comparison of biofilm formation between lineage 1 (n = 17), lineage 2a (n = 11) and lineage 2b (n = 19). Statistical analysis was performed in R using a non-parametric Kruskal-Wallis test for two-group comparisons, or Dunn's test with Benjamini-Hochberg correction for three-group comparisons. No significant differences in biofilm formation were observed at p = 0.05, as indicated (ns).
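
The Benjamini-Hochberg adjustment named in the Figure S6 legend can be illustrated with a minimal sketch. The analysis in this study was performed in R; the Python function below is only an illustration of the general step-up procedure, not the authors' code, and the function name `benjamini_hochberg` and the example p-values are hypothetical and not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative sketch only: each p-value is scaled by m / rank
    (rank 1 = smallest), and monotonicity is enforced by taking a
    running minimum from the largest rank downwards.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values for three pairwise lineage comparisons
    # (1 vs 2a, 1 vs 2b, 2a vs 2b); NOT data from the study.
    raw = [0.21, 0.64, 0.38]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"raw p = {p:.2f}, adjusted p = {q:.2f}")
```

A comparison would be reported as not significant (ns) when its adjusted p-value exceeds the chosen threshold, here 0.05.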

Table S3. PCR primer sequences designed for all four B. multivorans lineage-specific target genes
a Position relative to the BCC0084 (lineage 1) complete genome.
b Position relative to the ATCC 17616 (lineage 2a) complete genome.
c Mismatches for each primer sequence are highlighted in bold. Red indicates a mismatch in all strains of the opposing lineage and blue indicates mismatches within target lineage strains.

Table S5. Swimming and swarming motility within the B. multivorans strains
Column headings: Strain; Mean motility zone (mm) within each growth medium and motility phenotype b,c
a B. multivorans ATCC 17616 was used as a positive motile control for all assays.
b Highly motile strains (≤50 mm average motility diameter) are highlighted in blue.
c ND = mean not determined due to variable results.