Search Sciweavers | Sciweavers

78 search results - page 15 / 16

» On The Closest String and Substring Problems


GFKL
2005
Springer

142 views · Data Mining

16 years 16 days ago

Near Similarity Search and Plagiarism Analysis

Download www.uni-weimar.de

Abstract. Existing methods to text plagiarism analysis mainly base on “chunking”, a process of grouping a text into meaningful units each of which gets encoded by an integer nu...

Benno Stein, Sven Meyer zu Eissen


LREC
2010

154 views · Education

Information Retrieval of Word Form Variants in Spoken Language Corpora Using Generalized Edit Distance

15 years 8 months ago

Download www.lrec-conf.org

An important feature of spoken language corpora is existence of different spelling variants of words in transcription. So there is an important problem for linguist who works with...

Siim Orasmaa, Reina Käärik, Jaak Vilo, T...


ALGORITHMICA
1999

112 views

Suffix Trees on Words

15 years 6 months ago

Download www.larsson.dogma.net

We discuss an intrinsic generalization of the suffix tree, designed to index a string of length n which has a natural partitioning into m multicharacter substrings or words. This ...

Arne Andersson, N. Jesper Larsson, Kurt Swanson


COLING
1996

82 views · Computational Linguistics

A Statistical Method for Extracting Uninterrupted and Interrupted Collocations from Very Large Corpora

15 years 8 months ago

Download acl.ldc.upenn.edu

In order to extract rigid expressions with a high frequency of use, new algorithm that can efficiently extract both uninterrupted and interrupted collocations from very large corpora ha...

Satoru Ikehara, Satoshi Shirai, Hajime Uchino


FUN
2010
Springer

312 views · Algorithms

On Table Arrangements, Scrabble Freaks, and Jumbled Pattern Matching

15 years 11 months ago

Download www.i3s.unice.fr

Given a string s, the Parikh vector of s, denoted p(s), counts the multiplicity of each character in s. Searching for a match of Parikh vector q (a “jumbled string”) in the tex...

Peter Burcsi, Ferdinando Cicalese, Gabriele Fici, ...
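
For readers unfamiliar with the terms in the abstract above, the following is a minimal sketch of a Parikh vector and of jumbled matching, assuming the usual definition that a match is a substring of the text whose character counts equal q. The function names `parikh_vector` and `jumbled_matches` are hypothetical, and the sliding-window scan is a naive baseline chosen for illustration, not the algorithm of the listed paper.

```python
from collections import Counter


def parikh_vector(s):
    """Count the multiplicity of each character in s."""
    return Counter(s)


def jumbled_matches(text, q):
    """Return start indices of substrings of text whose Parikh vector equals q.

    Naive sliding-window baseline (an illustrative assumption, not the
    paper's method): one pass over text with a window of fixed length.
    """
    target = +Counter(q)          # unary plus drops zero or negative counts
    m = sum(target.values())      # every match has exactly this length
    if m == 0 or m > len(text):
        return []

    window = Counter(text[:m])
    hits = [0] if window == target else []

    for i in range(m, len(text)):
        window[text[i]] += 1      # character entering the window
        out = text[i - m]         # character leaving the window
        window[out] -= 1
        if window[out] == 0:
            del window[out]       # keep the counter free of zero entries
        if window == target:
            hits.append(i - m + 1)
    return hits


if __name__ == "__main__":
    print(jumbled_matches("abcab", parikh_vector("ba")))
```

With the text "abcab" and q taken from "ba", the windows "ab" at positions 0 and 3 match, since order within the window is ignored.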
