Search Sciweavers | Sciweavers

70 search results - page 13 / 14

» Machine Learning for Information Extraction from XML marked-...

WWW
2008
ACM

129 views, Internet Technology

Automatically refining the wikipedia infobox ontology

14 years 6 months ago

Download www2008.org

The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia inf...

Fei Wu, Daniel S. Weld

AGENTS
1997
Springer

110 views, Security Privacy

A Scalable Comparison-Shopping Agent for the World-Wide Web

13 years 9 months ago

Download www.cs.washington.edu

The World-Wide-Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics...

Robert B. Doorenbos, Oren Etzioni, Daniel S. Weld

SIGIR
2008
ACM

162 views, Information Technology

Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization

13 years 5 months ago

Download users.cis.fiu.edu

Multi-document summarization aims to create a compressed summary while retaining the main characteristics of the original set of documents. Many approaches use statistics and mach...

Dingding Wang, Tao Li, Shenghuo Zhu, Chris H. Q. D...

WSDM
2010
ACM

204 views, Data Mining

Learning URL patterns for webpage de-duplication

14 years 4 days ago

Download www.wsdm-conference.org

Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...

Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...

AND
2009

128 views, Machine Learning

Digital weight watching: reconstruction of scanned documents

13 years 3 months ago

Download ilps.science.uva.nl

A web-portal providing access to over 250.000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...

Tim Gielissen, Maarten Marx
