Sciweavers

» World Wide Web - A Multilingual Language Resource
EMNLP
2004
14 years 12 months ago
Monolingual Machine Translation for Paraphrase Generation
We apply statistical machine translation (SMT) tools to generate novel paraphrases of input sentences in the same language. The system is trained on large volumes of sentence pair...
Chris Quirk, Chris Brockett, William B. Dolan
JIDM
2010
14 years 5 months ago
A Context-Dependent Supervised Learning Approach to Sentiment Detection in Large Textual Databases
Sentiment detection automatically identifies emotions in textual data. The increasing amount of emotive documents available in corporate databases and on the World Wide Web calls f...
Albert Weichselbraun, Stefan Gindl, Arno Scharl
WSDM
2010
ACM
15 years 5 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
VL
2009
IEEE
15 years 5 months ago
Non-programmers identifying functionality in unfamiliar code: Strategies and barriers
Source code on the web is a widely available and potentially rich learning resource for nonprogrammers. However, unfamiliar code can be daunting to end-users without programming e...
Paul A. Gross, Caitlin Kelleher
WWW
2010
ACM
15 years 5 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han