Search Sciweavers | Sciweavers

66 search results - page 1 / 14

VLDB
2007
ACM

93 views (Database)

Measuring the Structural Similarity of Semistructured Documents Using Entropy

14 years 5 months ago

Download www.vldb.org

We propose a technique for measuring the structural similarity of semistructured documents based on entropy. After extracting the structural information from two documents we use ...

Sven Helmer

SIGIR
2006
ACM

163 views (Information Technology)

13 years 10 months ago

Measuring similarity of semi-structured documents with context weights

Download www.ischool.drexel.edu

In this work, we study similarity measures for text-centric XML documents based on an extended vector space model, which considers both document content and structure. Experimenta...

Christopher C. Yang, Nan Liu

KES
2004
Springer

102 views (Information Technology)

Knowledge Extraction from Semi-structured Data Based on Fuzzy Techniques

13 years 10 months ago

Download www.crema.unimi.it

In this work we propose a fuzzy technique to compare XML documents belonging to a semi-structured flow and sharing a common vocabulary of tags. Our approach is based on t...

Paolo Ceravolo, Maria Cristina Nocerino, Marco Viv...

ICDM
2002
IEEE

162 views (Data Mining)

Phrase-based Document Similarity Based on an Index Graph Model

13 years 9 months ago

Download pami.uwaterloo.ca

Document clustering techniques mostly rely on single term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...

Khaled M. Hammouda, Mohamed S. Kamel

DIS
2001
Springer

93 views (Theoretical Computer Science)

Eliminating Useless Parts in Semi-structured Documents Using Alternation Counts

13 years 9 months ago

Download www.i.kyushu-u.ac.jp

We propose a preprocessing method for Web mining which, given semi-structured documents with the same structure and style, distinguishes useless parts and non-useless parts in each...

Daisuke Ikeda, Yasuhiro Yamada, Sachio Hirokawa
