Sciweavers

» Measuring Generality of Documents
CIKM
2010
Springer
15 years 3 months ago
Improved index compression techniques for versioned document collections
Current Information Retrieval systems use inverted index structures for efficient query processing. Due to the extremely large size of many data sets, these index structures are u...
Jinru He, Junyuan Zeng, Torsten Suel
ESEM
2008
ACM
15 years 6 months ago
Evaluation of capture-recapture models for estimating the abundance of naturally-occurring defects
Project managers can use capture-recapture models to manage the inspection process by estimating the number of defects present in an artifact and determining whether a reinspectio...
Gursimran Singh Walia, Jeffrey C. Carver
KDD
2008
ACM
16 years 5 months ago
Entity categorization over large document collections
Extracting entities (such as people, movies) from documents and identifying the categories (such as painter, writer) they belong to enable structured querying and data analysis ov...
Arnd Christian König, Rares Vernica, Venkates...
ICDAR
2009
IEEE
15 years 11 months ago
Scalable Feature Extraction from Noisy Documents
We cope with the metadata recognition in layout-oriented documents. We address the problem as a classification task and propose a method for automatic extraction of relevant featu...
Loïc Lecerf, Boris Chidlovskii
IJCNN
2006
IEEE
15 years 11 months ago
A Self-Organising Map Approach for Clustering of XML Documents
The number of XML documents produced and available on the Internet is steadily increasing. It is thus important to devise automatic procedures to extract useful information fro...
Francesca Trentini, Markus Hagenbuchner, Alessandr...