Sciweavers

» Knowledge-based Document Analysis
PLDI
2010
ACM
16 years 3 months ago
A Context-free Markup Language for Semi-structured Text
An ad hoc data format is any non-standard, semi-structured data format for which robust data processing tools are not available. In this paper, we present ANNE, a new kind of mark...
Qian Xi, David Walker
EMNLP
2008
15 years 7 months ago
An Analysis of Active Learning Strategies for Sequence Labeling Tasks
Active learning is well-suited to many problems in natural language processing, where unlabeled data may be abundant but annotation is slow and expensive. This paper aims to shed ...
Burr Settles, Mark Craven
ICTIR
2009
Springer
15 years 3 months ago
An Analysis of NP-Completeness in Novelty and Diversity Ranking
Abstract. A useful ability for search engines is to be able to rank objects with novelty and diversity: the top k documents retrieved should cover possible interpretations of a que...
Ben Carterette
GFKL
2005
Springer
15 years 11 months ago
Near Similarity Search and Plagiarism Analysis
Abstract. Existing methods to text plagiarism analysis mainly base on “chunking”, a process of grouping a text into meaningful units each of which gets encoded by an integer nu...
Benno Stein, Sven Meyer zu Eissen
IPM
2007
15 years 6 months ago
Text mining techniques for patent analysis
Patent documents contain important research results. However, they are lengthy and rich in technical terminology such that it takes a lot of human efforts for analyses. Automatic...
Yuen-Hsien Tseng, Chi-Jen Lin, Yu-I Lin