Sciweavers

523 search results - page 9 / 105
» Metric Learning for Text Documents
FLAIRS
2001
15 years 21 days ago
Extracting Partial Structures from HTML Documents
A new wrapper model for extracting text data from HTML documents is introduced. Kushmerick's wrapper class (Kushmerick 2000) may be unsuccessful in the case that suff...
Hiroshi Sakamoto, Yoshitsugu Murakami, Hiroki Arim...
MLDM
2007
Springer
15 years 5 months ago
PE-PUC: A Graph Based PU-Learning Approach for Text Classification
This paper presents a novel solution to the problem of building a text classifier using positive documents (P) and unlabeled documents (U). Here, the unlabeled documents are mixed w...
Shuang Yu, Chunping Li
WSDM
2010
ACM
15 years 8 months ago
Learning Similarity Metrics for Event Identification in Social Media
Social media sites (e.g., Flickr, YouTube, and Facebook) are a popular distribution outlet for users looking to share their experiences and interests on the Web. These sites host ...
Hila Becker, Mor Naaman, Luis Gravano
ICML
2001
IEEE
16 years 4 days ago
Learning to Select Good Title Words: An New Approach based on Reverse Information Retrieval
In this paper, we show how we can learn to select good words for a document title. We view the problem of selecting good title words for a document as a variant of an Information ...
Rong Jin, Alexander G. Hauptmann
IPM
2006
14 years 11 months ago
Text mining without document context
We consider a challenging clustering task: the clustering of multi-word terms without document co-occurrence information in order to form coherent groups of topics. For this task, ...
Eric SanJuan, Fidelia Ibekwe-Sanjuan