Sciweavers

12 search results - page 2 / 3
» Rules of Thumb for Information Acquisition from Large and Re...
IEAAIE
2001
Springer
13 years 10 months ago
Selecting a Relevant Set of Examples to Learn IE-Rules
The growing availability of online text has led to an increase in the use of automatic knowledge acquisition approaches from textual data, as in Information Extraction (IE). Some ...
Jordi Turmo, Horacio Rodríguez
WWW
2010
ACM
14 years 1 month ago
A pattern tree-based approach to learning URL normalization rules
Duplicate URLs cause serious trouble across the whole pipeline of a search engine, from crawling and indexing to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
SPLST
2003
13 years 7 months ago
Compacting XML Documents
Nowadays, one of the most common formats for storing information is XML. The size of XML documents can be rather large, and they may contain redundant attributes which can...
Miklós Kálmán, Ferenc Havasi,...
IJFCS
2008
13 years 6 months ago
Succinct Minimal Generators: Theoretical Foundations and Applications
In data mining applications, very large contexts are handled, which usually results in a considerably large set of frequent itemsets, even for high values of the minimum support t...
Tarek Hamrouni, Sadok Ben Yahia, Engelbert Mephu N...
BMCBI
2008
13 years 6 months ago
EST Express: PHP/MySQL based automated annotation of ESTs from expression libraries
Background: Several biological techniques result in the acquisition of functional sets of cDNAs that must be sequenced and analyzed. The emergence of redundant databases such as U...
Robin P. Smith, William J. Buchser, Marcus B. Lemm...