Applying syntactic similarity algorithms for enterprise information management



Ludmila Cherkasova, Kave Eshghi, Charles B. Morrey III, Joseph Tucek, Alistair Veitch
HP Laboratories
HPL-2009-90

Keywords: syntactic similarity, enterprise information management, performance modeling, shingling algorithms, content-based chunking algorithms

Abstract: For implementing content management solutions and enabling new applications associated with data retention, regulatory compliance, and litigation issues, enterprises need to develop advanced analytics to uncover relationships among the documents, e.g., content similarity, provenance, and clustering. In this paper, we evaluate the performance of four syntactic similarity algorithms. Three algorithms are based on Broder's "shingling" technique while the fourth algorithm employs a more recent approach, "content-based chunking". For our experiments, we use a specially designed corpus of documents that includes a set of "similar"...
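To make the "shingling" idea in the abstract concrete, here is a minimal sketch of the basic technique: each document is reduced to its set of overlapping word sequences (shingles), and two documents are compared by the Jaccard overlap of those sets. This is only an illustration of the general idea, not any of the four algorithms evaluated in the paper; the function names `shingles` and `resemblance` and the default window size `w=4` are assumptions chosen for the example.

```python
def shingles(text, w=4):
    """Return the set of contiguous w-word sequences (shingles) in text.

    Assumption for this sketch: tokens are lowercase whitespace-separated words.
    """
    words = text.lower().split()
    if len(words) < w:
        # A document shorter than the window yields one short shingle (or none).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def resemblance(a, b, w=4):
    """Jaccard similarity of the two documents' shingle sets, in [0, 1]."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        return 1.0  # two empty documents are treated as identical
    return len(sa & sb) / len(sa | sb)
```

For example, `resemblance("a b c d e f", "a b c d e g")` compares three shingles per document, two of which are shared. Practical systems usually hash the shingles and keep only a small sample of them per document so that large collections can be compared cheaply; that step is left out here.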



15th ACM SIGKDD | Content-based Chunking Algorithms | Data Mining | KDD 2009 | Syntactic Similarity Algorithms


» Evaluating the Effectiveness of Information Extraction in Real-World Storage Management

» Management of requirements in ERP development a comparison between proprietary and open so...

» Aligning Ontologies and Evaluating Concept Similarities

» Learning to Classify Biomedical Terms Through Literature Mining and Genetic Algorithms

» InfoAnalyzer: a computer-aided tool for building enterprise taxonomies

» DocuBrowse: faceted searching, browsing, and recommendations in an enterprise context

» Generating correct EPCs from configured C-EPCs

» Information Technology Fashions: Lifecycle Phase Analysis

Post Info

Added	25 Nov 2009
Updated	25 Nov 2009
Type	Conference
Year	2009
Where	KDD
Authors	Ludmila Cherkasova, Kave Eshghi, Charles B. Morrey, Joseph Tucek, Alistair C. Veitch


Sciweavers
