Sciweavers

385 search results for "Improving data mining utility with projective sampling" (page 27 of 77)
KDD
2008
ACM
176 views, Data Mining
15 years 10 months ago
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Peter Christen
BMCBI
2006
144 views
14 years 9 months ago
Association algorithm to mine the rules that govern enzyme definition and to classify protein sequences
Background: The number of sequences compiled in many genome projects is growing exponentially, but most of them have not been characterized experimentally. An automatic annotation...
Shih-Hau Chiu, Chien-Chi Chen, Gwo-Fang Yuan, Thy-...
ICMLA
2008
14 years 11 months ago
Highly Scalable SVM Modeling with Random Granulation for Spam Sender Detection
Spam sender detection based on email subject data is a complex large-scale text mining task. The dataset consists of email subject lines and the corresponding IP address of the em...
Yuchun Tang, Yuanchen He, Sven Krasser
DKE
2007
95 views
14 years 9 months ago
Warping the time on data streams
Continuously monitoring through time the correlation/distance of multiple data streams is of interest in a variety of applications, including financial analysis, video surveillanc...
Paolo Capitani, Paolo Ciaccia
CIKM
2004
Springer
15 years 3 months ago
Optimizing web search using web click-through data
The performance of web search engines may often deteriorate due to the diversity and noisy information contained within web pages. User click-through data can be used to introduce...
Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Yong Yu, W...