Sciweavers

GCB
2003
Springer

In silico prediction of UTR repeats using clustered EST data

Clustering of EST data is a method for the non-redundant representation of an organism's transcriptome. When large amounts of EST data are clustered, some large clusters (>500 sequences) are usually created. These can lead to iterative contig builds, heavy consumption of computing time and improbable exon alignments. In addition, these clusters sometimes contain transcripts from more than one gene, which is undesirable. Such large clusters arise from: (1) large numbers of identical ESTs / high transcript levels; (2) large gene families with highly similar members; (3) false clustering due to (a) unremoved vector or rRNA sequences, (b) undetected cloning artifacts or (c) repetitive elements in UTRs. During pre-processing (filtering and masking) of the raw sequence data, contaminants such as vector or linker sequences as well as bacterial genes are removed (clipping). In the same process, it is essential to mask repetitive elements in ord...
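The two operations the abstract mentions, spotting oversized clusters and masking sequence regions, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the dictionary layout (cluster ID mapped to a list of EST IDs), the half-open interval convention and the use of `N` as the mask character are all assumptions made for illustration. Only the threshold of 500 sequences comes from the abstract.

```python
from typing import Dict, List, Tuple

# Threshold taken from the abstract (">500 sequences"); everything else is hypothetical.
LARGE_CLUSTER_THRESHOLD = 500


def find_large_clusters(
    clusters: Dict[str, List[str]],
    threshold: int = LARGE_CLUSTER_THRESHOLD,
) -> List[str]:
    """Return IDs of clusters with more than `threshold` members, largest first."""
    large = [cid for cid, members in clusters.items() if len(members) > threshold]
    return sorted(large, key=lambda cid: len(clusters[cid]), reverse=True)


def mask_intervals(
    seq: str,
    intervals: List[Tuple[int, int]],
    mask_char: str = "N",
) -> str:
    """Replace each half-open interval [start, end) of `seq` with `mask_char`.

    Intervals are 0-based and are clamped to the sequence bounds.
    """
    chars = list(seq)
    for start, end in intervals:
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)
```

For example, `find_large_clusters` could be run on the output of a clustering step to list candidates for manual inspection, and `mask_intervals` could be applied to an EST once the coordinates of a suspected repeat are known from some other tool.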
Stefan A. Rensing, Daniel Lang, Ralf Reski
Added 06 Jul 2010
Updated 06 Jul 2010
Type Conference
Year 2003
Where GCB
Authors Stefan A. Rensing, Daniel Lang, Ralf Reski