In silico prediction of UTR repeats using clustered EST data

Download www.plant-biotech.net

Clustering of EST data is a method for the non-redundant representation of an organism's transcriptome. When large amounts of EST data are clustered, some large clusters (>500 sequences) are usually created. These can lead to iterative contig builds, consumption of large amounts of computing time, and improbable exon alignments. In addition, such clusters sometimes contain transcripts of more than one gene. Large clusters arise from: (1) large numbers of identical ESTs / high transcript levels; (2) large gene families with highly similar members; (3) false clustering due to a) unremoved vector or rRNA sequences, b) undetected cloning artifacts or c) repetitive elements in UTRs. During pre-processing (filtering and masking) of the raw sequence data, contaminants such as vector or linker sequences as well as bacterial genes are removed (clipping). In the same process, it is essential to mask repetitive elements in ord...
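
The size criterion mentioned in the abstract (>500 sequences) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input format (pairs of EST ID and cluster ID) and the names `group_by_cluster` and `flag_large_clusters` are assumptions made for illustration only.

```python
from collections import defaultdict

# Threshold taken from the abstract: clusters with more than 500 sequences.
LARGE_CLUSTER_THRESHOLD = 500


def group_by_cluster(assignments):
    """Group EST IDs by cluster ID.

    `assignments` is a hypothetical iterable of (est_id, cluster_id) pairs,
    e.g. parsed from the output of whatever clustering tool is in use.
    """
    clusters = defaultdict(list)
    for est_id, cluster_id in assignments:
        clusters[cluster_id].append(est_id)
    return dict(clusters)


def flag_large_clusters(clusters, threshold=LARGE_CLUSTER_THRESHOLD):
    """Return IDs of clusters with more than `threshold` ESTs, largest first."""
    large = [cid for cid, members in clusters.items() if len(members) > threshold]
    return sorted(large, key=lambda cid: len(clusters[cid]), reverse=True)


if __name__ == "__main__":
    # Toy data: one oversized cluster and one small cluster.
    toy = [(f"est{i}", "clusterA") for i in range(501)]
    toy += [(f"est{i}", "clusterB") for i in range(501, 511)]
    print(flag_large_clusters(group_by_cluster(toy)))
```

Clusters flagged this way would be the candidates to inspect for the causes listed above, such as unmasked repetitive elements in UTRs.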

Stefan A. Rensing, Daniel Lang, Ralf Reski

EST Data | False Clustering | GCB 2003 | Repetitive Elements |

Added	06 Jul 2010
Updated	06 Jul 2010
Type	Conference
Year	2003
Where	GCB
Authors	Stefan A. Rensing, Daniel Lang, Ralf Reski
