Sciweavers

DASFAA
2008
IEEE
15 years 6 months ago
Summarization Graph Indexing: Beyond Frequent Structure-Based Approach
Graphs are an important data structure for modeling complex structural data, such as chemical compounds, proteins, and XML documents. Among many graph data-based applications, sub-graph ...
Lei Zou, Lei Chen 0002, Huaming Zhang, Yansheng Lu...
WWW
2008
ACM
16 years 14 days ago
Performance of compressed inverted list caching in search engines
Due to the rapid growth in the size of the web, web search engines are facing enormous performance challenges. The larger engines in particular have to be able to process tens of ...
Jiangong Zhang, Xiaohui Long, Torsten Suel
WWW
2005
ACM
16 years 14 days ago
Extracting context to improve accuracy for HTML content extraction
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo
WWW
2010
ACM
15 years 6 months ago
A scalable machine-learning approach for semi-structured named entity recognition
Named entity recognition studies the problem of locating and classifying parts of free text into a set of predefined categories. Although extensive research has focused on the de...
Utku Irmak, Reiner Kraft
WWW
2010
ACM
15 years 6 months ago
Not so creepy crawler: easy crawler generation with standard XML queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...