Affordable and widely available sequencing technology continues to generate data that can be used to detect new and ongoing outbreaks on a global scale, providing greater insight into particular bacterial populations. New tools are needed to harness this influx of data and to improve the resolution with which we monitor the global expansion of infectious disease.
Recently, the population genetic statistic Fu’s Fs identified a gene, porA, that was not only driving the expansion of a drug-resistant clone of Campylobacter jejuni, but was also necessary and sufficient for causing abortion in sheep (Wu et al., 2016). The statistic was applied after bacterial sexual genetics in a guinea pig model had already identified porA, and the agreement between the two approaches suggests that population genetic statistics could be used to pinpoint the particular loci driving an outbreak.
We will use pandemic Escherichia coli ST131, a clone associated with high antibiotic resistance, as an independent test case to validate this approach. Multiple hypotheses have been proposed to explain the success of this well-studied clone, making it a well-suited candidate for our analysis. We believe that, in the process of learning how these population genetic statistics behave in bacteria, we can also identify specific loci under selection in ST131 in an unbiased manner.
I will discuss the application of Fu’s Fs and related statistics to E. coli ST131, present the resulting preliminary data, and outline the challenges encountered. For instance, horizontal gene transfer and recombination, both common sources of diversity in bacteria, violate the assumptions of some of these statistics. Because these methods have rarely been applied to large bacterial datasets, we must also understand how the diversity and size of the studied population affect our ability to detect these evolutionary signals.
This project will provide a set of new approaches for analyzing bacterial genomic data, improving our understanding of infectious disease and its spread at both global and local scales.