Although personalized search has been proposed for many years and many personalization strategies have been investigated, it is still unclear whether personalization is consistent...
Crawling the web is deceptively simple: the basic algorithm is (a) Fetch a page (b) Parse it to extract all linked URLs (c) For all the URLs not seen before, repeat (a)?(c). Howev...
We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table– a data structure that better lends itself to high-level...
: The increasing number of digitized texts presently available notably on the Web has developed an acute need in text mining techniques. Clustering systems are used more and more o...
Abdelmalek Amine, Zakaria Elberrichi, Michel Simon...
Search engines process queries conjunctively to restrict the size of the answer set. Further, it is not rare to observe a mismatch between the vocabulary used in the text of Web p...