Sciweavers

502 search results - page 53 / 101
» Extracting Partial Structures from HTML Documents
DIS
2001
Springer
15 years 3 months ago
Eliminating Useless Parts in Semi-structured Documents Using Alternation Counts
We propose a preprocessing method for Web mining which, given semi-structured documents with the same structure and style, distinguishes useless parts and non-useless parts in each...
Daisuke Ikeda, Yasuhiro Yamada, Sachio Hirokawa
ICDAR
2009
IEEE
14 years 9 months ago
A New Method for Writer Identification of Handwritten Farsi Documents
Most studies of writer identification are based on English documents, and to our knowledge no research has been reported on Farsi or Arabic documents. In this paper, we have pro...
F. Shahabi, M. Rahmati
SIGIR
1999
ACM
15 years 3 months ago
Deriving Concept Hierarchies from Text
This paper presents a means of automatically deriving a hierarchical organization of concepts from a set of documents without use of training data or standard clustering technique...
Mark Sanderson, W. Bruce Croft
BMCBI
2007
14 years 11 months ago
The Firegoose: two-way integration of diverse data from different bioinformatics web resources with desktop applications
Background: Information resources on the World Wide Web play an indispensable role in modern biology. But integrating data from multiple sources is often encumbered by the need to...
J. Christopher Bare, Paul T. Shannon, Amy K. Schmi...
WWW
2003
ACM
15 years 12 months ago
The XML web: a first study
Although originally designed for large-scale electronic publishing, XML plays an increasingly important role in the exchange of data on the Web. In fact, it is expected that XML w...
Laurent Mignet, Denilson Barbosa, Pierangelo Veltr...