Sciweavers

843 search results - page 99 / 169
» Segmentation of Compressed Documents
SPIRE
2010
Springer
14 years 8 months ago
Dual-Sorted Inverted Lists
Several IR tasks rely, to achieve high efficiency, on a single pervasive data structure called the inverted index. This is a mapping from the terms in a text collection to the docu...
Gonzalo Navarro, Simon J. Puglisi
ICDAR
2003
IEEE
15 years 3 months ago
A Segmentation Method for Bibliographic References by Contextual Tagging of Fields
In this paper, a method based on part-of-speech tagging (PoS) is used for bibliographic reference structure. This method operates on a roughly structured ASCII file, produced by O...
Dominique Besagni, Abdel Belaïd, Nelly Benet
IPM
2008
14 years 10 months ago
Towards a unified approach to document similarity search using manifold-ranking of blocks
Document similarity search (i.e. query by example) aims to retrieve a ranked list of documents similar to a query document in a text corpus or on the Web. Most existing approaches...
Xiaojun Wan, Jianwu Yang, Jianguo Xiao
WWW
2007
ACM
15 years 10 months ago
A search-based Chinese word segmentation method
In this paper, we propose a novel Chinese word segmentation method which leverages the huge deposit of Web documents and search technology. It simultaneously solves ambiguous phra...
Xin-Jing Wang, Yong Qin, Wen Liu
BMCBI
2007
14 years 10 months ago
Comparative analysis of long DNA sequences by per element information content using different contexts
Background: Features of a DNA sequence can be found by compressing the sequence under a suitable model; good compression implies low information content. Good DNA compression mode...
Trevor I. Dix, David R. Powell, Lloyd Allison, Jul...