Sciweavers

6 search results - page 1 / 2
DOCENG
2008
ACM
13 years 6 months ago
Identifying and expanding titles in web texts
In this paper, we present an analysis based on linguistic and typographic features that allows for the identification of titles in web documents. We focus in particular on procedu...
Clémentine Adam, Estelle Delpech, Patrick S...
ANLP
1997
13 years 6 months ago
Disambiguation of Proper Names in Text
Identifying the occurrences of proper names in text and the entities they refer to can be a difficult task because of the many-to-many mapping between names and their referents. We...
Nina Wacholder, Yael Ravin, Misook Choi
FLAIRS
2007
13 years 7 months ago
Lexicon Development and POS Tagging Using a Tagged Bengali News Corpus
Lexicon development and Part of Speech (POS) tagging are very important for almost all Natural Language Processing (NLP) application areas. The rapid development of these resources...
Asif Ekbal, Sivaji Bandyopadhyay
SIGIR
2004
ACM
13 years 10 months ago
Constructing a text corpus for inexact duplicate detection
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
Jack G. Conrad, Cindy P. Schriber
WWW
2011
ACM
12 years 11 months ago
Web scale NLP: a case study on url word breaking
This paper uses the URL word breaking task as an example to elaborate what we identify as crucial in designing statistical natural language processing (NLP) algorithms for Web scale ...
Kuansan Wang, Christopher Thrasher, Bo-June Paul H...