About me

Liverpool, United Kingdom
I am interested in how we can use DNA sequences to understand biodiversity: how do we recognise species, and how are species related at taxonomic, ecological and geographic levels? My passion for biodiversity research has led me from the world’s largest natural history collection, the Natural History Museum, London, where I completed my MSc, to the Biodiversity Institute of Ontario, the global centre for the International Barcode of Life, where I was a PhD student, and on to the hyper-diverse tropics of Southeast Asia. The tropics will be the first regions to experience historically unprecedented climates, and this will happen within the next decade. Consequently, my recent research has focussed on understanding the effects of urbanisation and climate change on tropical and subtropical biodiversity, encompassing both species richness and ecological integrity across a diversity of taxonomic groups.

Feb 16, 2010

Assigning unknowns to higher taxa using DNA barcodes: A case study in Sphingidae

There is considerable disagreement in the literature over how effectively sequence comparisons can assign a query specimen to a higher taxon when the specimen belongs to a species not yet represented in a DNA barcode reference library. Library species richness has been proposed as a critical factor affecting the accuracy of such assignments, but it has never been thoroughly investigated.

We explored the accuracy of assignments to genus, tribe and subfamily for 118 query species under five different assignment criteria: one distance-based and four tree-based. These Costa Rican species belong to Sphingidae, a family for which there is an almost complete DNA barcode reference library. An automated procedure was used to simulate different levels of species richness (10 to 100% of the available species) in reference libraries, and to record assignments (positive or ambiguous) and their accuracy (true or false based on current classification) under the five criteria.
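To make the shape of such a simulation concrete, here is a minimal Python sketch. It is an illustration only, not the procedure or the criteria used in the study. The function names, the data layout (species name mapped to an aligned sequence and a genus), the nearest-neighbour rule with a distance threshold, and the rule that an ambiguous outcome counts as "true" only when the query's genus is absent from the library are all assumptions made for this example.

```python
import random
from collections import Counter

def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def subsample_library(library, fraction, rng):
    """Keep a random fraction of the reference species (at least one)."""
    species = sorted(library)
    k = max(1, round(fraction * len(species)))
    return {s: library[s] for s in rng.sample(species, k)}

def assign_genus(query_seq, library, threshold):
    """Hypothetical distance-based criterion (an assumption, not the paper's):
    return the genus of the nearest reference sequence if it lies within
    `threshold` and all equally near references share one genus; else None."""
    best = min(p_distance(query_seq, seq) for seq, _ in library.values())
    if best > threshold:
        return None  # ambiguous: nothing close enough
    genera = {g for seq, g in library.values() if p_distance(query_seq, seq) == best}
    return genera.pop() if len(genera) == 1 else None

def score(assigned, true_genus, library):
    """Label an outcome as positive/ambiguous and true/false.
    Assumption: an ambiguous outcome is 'true' only if the genus is not in the library."""
    genus_present = any(g == true_genus for _, g in library.values())
    if assigned is None:
        return "false ambiguous" if genus_present else "true ambiguous"
    return "true positive" if assigned == true_genus else "false positive"

def simulate(queries, library, fractions, threshold, seed=0):
    """Tally outcomes for each library richness level.
    `queries` is a list of (sequence, true_genus) for species absent from `library`."""
    rng = random.Random(seed)
    results = {}
    for fraction in fractions:
        sub = subsample_library(library, fraction, rng)
        results[fraction] = Counter(
            score(assign_genus(seq, sub, threshold), genus, sub)
            for seq, genus in queries
        )
    return results
```

A call such as `simulate(queries, library, [0.1, 0.2, 0.5, 1.0], threshold=0.1)` would return one tally of outcomes per richness level. A fuller version would repeat the subsampling many times per level and average the results.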

Using a liberal tree-based criterion, an average of 80% of queries were accurately assigned a genus name with libraries containing 20% of the available species, compared with 87% with a library containing all available species. Across all libraries, the liberal tree-based criterion accurately assigned an average of 74% of queries to tribes and 90% to subfamilies. These results suggest that the species richness of the reference library had only a weak effect on assignment accuracy, whereas the choice of assignment criterion had a strong effect. Additional parameters in the tree-based criteria, such as exclusivity of taxa, decreased the number of false positive assignments but increased the number of false ambiguous assignments. Our findings suggest that barcode reference libraries can be useful for higher-taxon assignments long before the libraries achieve complete species richness.