Sciweavers

» Informative sampling for large unbalanced data sets
COLING
2002
15 years 3 months ago
A Maximum Entropy-based Word Sense Disambiguation System
In this paper, a supervised learning system of word sense disambiguation is presented. It is based on conditional maximum entropy models. This system acquires the linguistic knowl...
Armando Suárez, Manuel Palomar
AND
2010
15 years 1 months ago
Document: a useful level for facing noisy data
In this paper we will present a set of experiments using large digitalized collections of books to show that logical structures can be extracted with good quality when working at ...
Hervé Déjean, Jean-Luc Meunier
WEBDB
1998
Springer
15 years 7 months ago
Extracting Patterns and Relations from the World Wide Web
The World Wide Web is a vast resource for information. At the same time it is extremely distributed. A particular type of data such as restaurant lists may be scattered across thous...
Sergey Brin
DIS
2001
Springer
15 years 8 months ago
Dynamic Aggregation to Support Pattern Discovery: A Case Study with Web Logs
Rapid growth of digital data collections is overwhelming the capabilities of humans to comprehend them without aid. The extraction of useful data from large raw data sets is someth...
Lida Tang, Ben Shneiderman
ICDAR
2003
IEEE
15 years 9 months ago
Form Reading based on Form-type Identification and Form-data Recognition
Form reading technology based on form-type identification and form-data recognition is proposed. This technology can solve difficulties in variety for reading different items on f...
Hiroshi Sako, Minenobu Seki, Naohiro Furukawa, His...