An edit script for taxonomic classifications

Abstract. Taxonomy provides one of the most powerful ways to navigate sequence databases, but users are currently forced to formulate queries according to a single taxonomic classification. Because there is no universal agreement on the classification of organisms, providing a single classification constrains the questions biologists can ask. In this paper, we present a solution to the problem of querying sequence databases using alternative classifications, based on the computation of an edit script that summarises the differences between two classification trees. Our algorithms find the shortest possible edit script by identifying all shared subtrees, and take time only quasi-linear in the size of the trees, because classification trees have unique node labels. These algorithms have recently been implemented, and the software is freely available for download.
Keywords. taxonomic classification, edit script, common subtrees
1 Motivation
Taxonomy provide...
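To make the idea of an edit script between two classifications concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: it does not identify shared subtrees and makes no claim to produce the shortest script. It only illustrates why unique node labels help, since each taxon can be looked up directly in both trees. The child-to-parent dictionary representation, the function name `edit_script`, the operation names, and the example taxa are all assumptions made for illustration.

```python
def edit_script(parent_a, parent_b):
    """Naive diff of two trees given as {label: parent_label} maps.

    Assumes every label is unique within a tree and the root maps to None.
    This is an illustrative baseline, not the shortest-script algorithm
    described in the abstract.
    """
    script = []
    # Taxa present only in the first classification are deleted.
    for node in sorted(parent_a.keys() - parent_b.keys()):
        script.append(("delete", node))
    # Taxa present only in the second classification are inserted.
    for node in sorted(parent_b.keys() - parent_a.keys()):
        script.append(("insert", node, parent_b[node]))
    # Taxa present in both but attached to a different parent are moved.
    for node in sorted(parent_a.keys() & parent_b.keys()):
        if parent_a[node] != parent_b[node]:
            script.append(("move", node, parent_b[node]))
    return script


# Hypothetical example classifications (child -> parent).
old = {
    "Life": None,
    "Animalia": "Life",
    "Chordata": "Animalia",
    "Reptilia": "Chordata",
    "Aves": "Chordata",
}
new = {
    "Life": None,
    "Animalia": "Life",
    "Chordata": "Animalia",
    "Reptilia": "Chordata",
    "Aves": "Reptilia",       # Aves re-attached under Reptilia
    "Mammalia": "Chordata",   # taxon absent from the old tree
}

for op in edit_script(old, new):
    print(op)
```

Because labels are unique, each membership test and parent lookup is a single dictionary operation, so this sketch does work proportional to the number of taxa plus the cost of sorting the output. A script that moves whole shared subtrees as units, as the abstract describes, would generally be shorter than this node-by-node listing.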
Roderic D. M. Page, Gabriel Valiente
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2005