Automated Clustering and Assembly of Large EST Collections

The availability of large EST (Expressed Sequence Tag) databases has led to a revolution in the way new genes are cloned. Difficulties arise, however, due to high error rates and redundancy of raw EST data. For these reasons, one of the first tasks performed by a scientist investigating any EST of interest is to gather contiguous ESTs and assemble them into a larger virtual cDNA. The REX (Recursive EST extender) algorithm described in this paper completely automates this process by finding ESTs that can be clustered on the basis of overlapping bases, and then assembling the contigs into a consensus sequence. By combining the clustering and assembly steps, REX can quickly generate assemblies from EST databases that are frequently updated without having to preprocess the data. A consensus assembly method is used to correct miscalled bases and remove indel errors. A unique feature of this method is that it addresses the issues of splice variants and unspliced cDNA data. Since REX is a fast greedy algorithm, it can addr...
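
The idea of growing a contig from a seed EST by repeatedly attaching overlapping ESTs can be illustrated with a toy sketch. This is not the published REX algorithm: it assumes exact-match suffix/prefix overlaps on a single strand, and it performs no consensus calling, no error correction, and no handling of splice variants. The names `suffix_prefix_overlap`, `greedy_extend`, and `min_overlap` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def suffix_prefix_overlap(a: str, b: str, min_overlap: int) -> int:
    """Return the length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if no overlap of at least `min_overlap` bases exists.
    Assumes min_overlap >= 1.
    """
    for k in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_extend(seed: str, ests: List[str], min_overlap: int = 4) -> Tuple[str, List[str]]:
    """Greedily extend `seed` to the right and left using exact overlaps.

    Illustrative only: real EST data needs error-tolerant alignment,
    reverse-complement handling, and consensus building.
    Returns the extended contig and the ESTs that could not be attached.
    """
    contig = seed
    unused = [e for e in ests if e != seed]
    extended = True
    while extended:  # repeat until a full pass attaches nothing new
        extended = False
        for est in list(unused):
            if est in contig:
                unused.remove(est)  # fully contained, adds no new bases
                continue
            k = suffix_prefix_overlap(contig, est, min_overlap)
            if k:
                contig += est[k:]  # extend on the right
                unused.remove(est)
                extended = True
                continue
            k = suffix_prefix_overlap(est, contig, min_overlap)
            if k:
                contig = est[:-k] + contig  # extend on the left
                unused.remove(est)
                extended = True
    return contig, unused


if __name__ == "__main__":
    reads = ["ACGGTTCA", "TTCAGGAT", "GGGGACGT", "CCCCCCCC"]
    contig, leftover = greedy_extend("ACGTACGG", reads)
    print(contig)
    print(leftover)
```

In this sketch the first overlapping EST found is attached immediately, which is what makes it greedy; an EST that never overlaps the growing contig is returned in the leftover list rather than forced into the assembly.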
David P. Yee, Darrell Conklin
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 1998
Where ISMB
Authors David P. Yee, Darrell Conklin