Reference Databases
Libraries of DNA sequences (usually from barcode genes) generated from species of known identity.
Sequences from unidentified organisms – obtained either by Sanger sequencing or high-throughput sequencing – are compared against a reference database to make species identifications.
Databases can be curated (e.g. the Barcode of Life Data System – BOLD – www.boldsystems.org) or uncurated (e.g. GenBank – www.ncbi.nlm.nih.gov).
In curated databases, identifications are scrutinised and verified; in uncurated databases they are not. Because its submissions are not vetted in this way, GenBank is far more extensive than BOLD, but it also contains many identification errors.
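
The comparison of an unidentified sequence against a reference database can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sequences and species names are invented, the sequences are treated as already aligned and of equal length, and the 97% identity threshold is an arbitrary example value rather than a recommended cut-off. Real identifications typically rely on alignment-based search tools and on the databases named above.

```python
# Toy reference library: species name -> barcode fragment.
# Names and sequences are made up for illustration only.
REFERENCE_LIBRARY = {
    "Species A": "ATGCCGTTAGCTAGGCTTAA",
    "Species B": "ATGCCGATAGCTTGGCTAAA",
    "Species C": "TTGACGTTAGGTAGCCTTAC",
}


def percent_identity(seq_a, seq_b):
    """Percentage of matching positions between two pre-aligned,
    equal-length sequences (no gaps are handled in this sketch)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def identify(query, library, threshold=97.0):
    """Return (species, identity) for the closest reference sequence.

    If the best identity falls below `threshold` (a hypothetical
    cut-off), species is None: the query stays unidentified.
    """
    best_name, best_score = max(
        ((name, percent_identity(query, ref)) for name, ref in library.items()),
        key=lambda pair: pair[1],
    )
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


if __name__ == "__main__":
    # A query that differs from "Species A" at one of 20 positions.
    query = "ATGCCGTTAGCTAGGCTTAT"
    print(identify(query, REFERENCE_LIBRARY))                  # strict threshold
    print(identify(query, REFERENCE_LIBRARY, threshold=90.0))  # relaxed threshold
```

The sketch also shows why database quality matters: the result is only as reliable as the names attached to the reference sequences, so a mislabelled entry in the library would be passed on to the query as a wrong identification.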