We present two methods for estimating replacement probabilities without using parallel corpora. The first method proposed exploits the possible translation probabilities latent in ...
As the result of interactions between visitors and a web site, an http log file contains very rich knowledge about users on-site behaviors, which, if fully exploited, can better c...
The semi-structured information available in HTML and similar documents provide valuable information that can be used for information extraction applications. This information tog...
This paper introduces a novel method for learning a wrapper for extraction of information from web pages, based upon (k,l)-contextual tree languages. It also introduces a method to...
Stefan Raeymaekers, Maurice Bruynooghe, Jan Van de...
Record linkage refers to techniques for identifying records associated with the same real-world entities. Record linkage is not only crucial in integrating multi-source databases ...