Similarity search and similarity join on strings are important for applications such as duplicate detection, error detection, data cleansing, or comparison of biological sequences....
Background: Sequence comparison faces new challenges today, with many complete genomes and large libraries of transcripts known. Gene annotation pipelines match these sequences in...
Background: The rapid development of structural genomics has resulted in many "unknown function" proteins being deposited in Protein Data Bank (PDB), thus, the functiona...
Background: Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. Howeve...
Paolo Ferragina, Raffaele Giancarlo, Valentina Gre...