Sciweavers

3371 search results - page 146 / 675
» Using parsimonious language models on web data
IC
2003
15 years 5 months ago
Internet Collaboration Using the W3C Document Object Model
The Internet makes it possible to share information (e.g. text, image, audio, video and other formats of data) across the globe. In this paper we look at collaborative Internet en...
Xiaohong Qiu, Bryan Carpenter, Geoffrey Fox
ICASSP
2009
IEEE
15 years 10 months ago
Gaussian Backend design for open-set language detection
This paper proposes a new approach to the challenging open-set language detection task. Most state-of-the-art approaches make use of data sources with several out-of-set languages...
Mohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori ...
WWW
2010
ACM
15 years 11 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
TREC
2007
15 years 5 months ago
Query and Document Models for Enterprise Search
We describe our participation in the TREC 2007 Enterprise track and detail our language modeling-based approaches. For document search, our focus was on estimating a mixture mode...
Krisztian Balog, Katja Hofmann, Wouter Weerkamp, M...
WWW
2007
ACM
16 years 4 months ago
Explorations in the use of semantic web technologies for product information management
Master data refers to core business entities a company uses repeatedly across many business processes and systems (such as lists or hierarchies of customers, suppliers, accounts, ...
Chen Wang, Daniel C. Wolfson, Jean-Sébastie...