Search Sciweavers | Sciweavers

24 search results - page 4 / 5

» DOM-based content extraction of HTML documents

TREC
2008

118 views | Information Technology

IIT Kharagpur at TREC 2008 Blog Track

13 years 7 months ago

Download trec.nist.gov

This paper describes our opinion retrieval system for the TREC 2008 Blog Track. We focused on five different aspects of the system. The first module focuses on extracting the blog...

Robin Anil, Sudeshna Sarkar

DOCENG
2009
ACM

148 views | Document Analysis

Deriving image-text document surrogates to optimize cognition

14 years 1 day ago

Download www.ecologylab.net

The representation of information collections needs to be optimized for human cognition. While documents often include rich visual components, collections, including personal coll...

Eunyee Koh, Andruid Kerne

WWW
2006
ACM

135 views | Internet Technology

Relaxed: on the way towards true validation of compound documents

14 years 6 months ago

Download www.medieq.org

To maintain interoperability in the Web environment it is necessary to comply with Web standards. Current specifications of HTML and XHTML languages define conformance conditions ...

Jirka Kosek, Petr Nálevka

WWW
2009
ACM

213 views | Internet Technology

Extracting article text from the web with maximum subsequence segmentation

14 years 6 months ago

Download www2009.org

Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...

Jeff Pasternack, Dan Roth

ITCC
2005
IEEE

105 views | Information Technology

Elimination of Redundant Information for Web Data Mining

13 years 11 months ago

Download eprints.utas.edu.au

These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditi...

Shakirah Mohd Taib, Soon-ja Yeom, Byeong Ho Kang
