Sciweavers

» Marking Text Documents
DASFAA
2004
IEEE
15 years 8 months ago
Semi-supervised Text Classification Using Partitioned EM
Text classification using a small labeled set and a large set of unlabeled data is seen as a promising technique to reduce the labor-intensive and time-consuming effort of labeling traini...
Gao Cong, Wee Sun Lee, Haoran Wu, Bing Liu
DRR
2003
15 years 6 months ago
Correcting OCR text by association with historical datasets
The Medical Article Records System (MARS) developed by the Lister Hill National Center for Biomedical Communications uses scanning, OCR and automated recognition and reformatting ...
Susan E. Hauser, Jonathan Schlaifer, Tehseen F. Sa...
RIAO
2007
15 years 6 months ago
Document frequency and term specificity
Document frequency is used in various applications in Information Retrieval and other related fields. An assumption frequently made is that the document frequency represents a lev...
Hideo Joho, Mark Sanderson
CIKM
2009
Springer
15 years 9 months ago
Improving binary classification on text problems using differential word features
We describe an efficient technique to weigh word-based features in binary classification tasks and show that it significantly improves classification accuracy on a range of proble...
Justin Martineau, Tim Finin, Anupam Joshi, Shamit ...
BMCBI
2008
15 years 5 months ago
Disambiguation of biomedical text using diverse sources of information
Background: Like text in other domains, biomedical documents contain a range of terms with more than one possible meaning. These ambiguities form a significant obstacle to the aut...
Mark Stevenson, Yikun Guo, Robert J. Gaizauskas, D...