Sciweavers

1950 search results - page 273 / 390
» Informative sampling for large unbalanced data sets
INFOVIS
1999
IEEE
15 years 7 months ago
Sensemaking of Evolving Web Sites Using Visualization Spreadsheets
In the process of knowledge discovery, workers examine available information in order to make sense of it. By sensemaking, we mean interacting with and operating on the informatio...
Ed Huai-hsin Chi, Stuart K. Card
CIKM
2009
Springer
15 years 7 months ago
Robust record linkage blocking using suffix arrays
Record linkage is an important data integration task that has many practical uses for matching, merging and duplicate removal in large and diverse databases. However, a quadratic ...
Timothy de Vries, Hui Ke, Sanjay Chawla, Peter Chr...
LREC
2010
15 years 5 months ago
Creation of Lexical Resources for a Characterisation of Multiword Expressions in Italian
The theoretical characterisation of multiword expressions (MWEs) is tightly connected to their actual occurrences in data and to their representation in lexical resources. We pres...
Andrea Zaninello, Malvina Nissim
ICML
1996
IEEE
16 years 4 months ago
Toward Optimal Feature Selection
In this paper, we examine a method for feature subset selection based on Information Theory. Initially, a framework for defining the theoretically optimal, but computationally intr...
Daphne Koller, Mehran Sahami
WWW
2007
ACM
16 years 4 months ago
Multi-factor clustering for a marketplace search interface
Search engines provide a small window to the vast repository of data they index and against which they search. They try their best to return the documents that are of relevance to...
Neel Sundaresan, Kavita Ganesan, Roopnath Grandhi