Sciweavers

» Learning from Multi-source Data
HAIS
2011
Springer
14 years 9 months ago
Clustering Ensemble for Spam Filtering
One of the main problems that modern e-mail systems face is managing the high volume of spam or junk mail they receive. Those systems are expected to be able to distinguis...
Santiago Porras, Bruno Baruque, Belén Vaque...
ACL
2007
15 years 7 months ago
Multilingual Transliteration Using Feature based Phonetic Method
In this paper we investigate named entity transliteration based on a phonetic scoring method. The phonetic method is computed using phonetic features and carefully designed pseudo...
Su-Youn Yoon, Kyoung-Young Kim, Richard Sproat
SSDBM
2005
IEEE
15 years 12 months ago
An Information Theoretic Model for Database Alignment
As with many large organizations, the Government's data is split in many different ways and is collected at different times by different people. The resulting massive data he...
Patrick Pantel, Andrew Philpot, Eduard H. Hovy
DASFAA
2004
IEEE
15 years 10 months ago
Semi-supervised Text Classification Using Partitioned EM
Text classification using a small labeled set and a large unlabeled set is seen as a promising technique to reduce the labor-intensive and time-consuming effort of labeling traini...
Gao Cong, Wee Sun Lee, Haoran Wu, Bing Liu
NIPS
1997
15 years 7 months ago
EM Algorithms for PCA and SPCA
I present an expectation-maximization (EM) algorithm for principal component analysis (PCA). The algorithm allows a few eigenvectors and eigenvalues to be extracted from large col...
Sam T. Roweis