FLAIRS
2006

Corpus Based Unsupervised Labeling of Documents

Download www.aaai.org

Text categorization involves mapping documents to a fixed set of labels. A related and equally important problem is assigning labels to large corpora. With a deluge of documents from sources like the World Wide Web, manual labeling by domain experts is prohibitively expensive. Reducing the effort needed to label documents has therefore been investigated extensively, but most of this work relies on some form of supervised or semi-supervised learning, which still requires labeled examples. This motivates the need for automatic methods of annotating documents with labels. In this work we explore a novel method of assigning labels to documents without using any training data. The proposed method uses clustering to build semantically related sets that serve as candidate labels for documents. This technique could be used to label large corpora in an unattended fashion.

Delip Rao, Deepak P, Deepak Khemani
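
The abstract only outlines the approach (cluster terms into semantically related sets, then use those sets as candidate labels), so the following is a minimal illustrative sketch of that general idea and not the authors' algorithm. Everything in it is an assumption made for illustration: the stopword list, the greedy Jaccard-based term clustering, the 0.5 threshold, and the function names `tokenize`, `cluster_terms`, and `label_documents`.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def cluster_terms(docs, threshold=0.5):
    """Greedily group terms that occur in similar sets of documents.

    Two terms are treated as related when the Jaccard similarity of the
    document sets they appear in is at least `threshold`. This stands in
    for whatever clustering the paper actually uses.
    """
    postings = defaultdict(set)  # term -> ids of documents containing it
    for doc_id, text in enumerate(docs):
        for term in tokenize(text):
            postings[term].add(doc_id)

    clusters = []  # list of (seed document set, list of terms)
    for term in sorted(postings):  # sorted for deterministic output
        doc_set = postings[term]
        for seed_docs, terms in clusters:
            jaccard = len(doc_set & seed_docs) / len(doc_set | seed_docs)
            if jaccard >= threshold:
                terms.append(term)
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append((doc_set, [term]))
    return [set(terms) for _, terms in clusters]


def label_documents(docs, clusters):
    """Label each document with the term cluster it overlaps most."""
    labels = []
    for text in docs:
        terms = set(tokenize(text))
        best = max(clusters, key=lambda cluster: len(cluster & terms))
        labels.append(sorted(best))
    return labels
```

Calling `cluster_terms` on a list of document strings and passing the result to `label_documents` returns one candidate label set per document, with no training data involved. A realistic system would likely need a richer similarity measure and a way to pick a single readable label from each term set; the sketch leaves both out.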

Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2006
Where	FLAIRS
Authors	Delip Rao, Deepak P, Deepak Khemani