Similarity search and similarity join on strings are important for applications such as duplicate detection, error detection, data cleansing, or comparison of biological sequences....
Automatic extraction of semantic information from text and links in Web pages is key to improving the quality of search results. However, the assessment of automatic semantic meas...
Ana Gabriela Maguitman, Filippo Menczer, Heather R...
CzEng 0.9 is the third release of a large parallel corpus of Czech and English. For the current release, CzEng was extended by significant amount of texts from various types of so...
Local Gabor features (jets) have been widely used in face recognition systems. Once the sets of jets have been extracted from the two faces to be compared, a proper measure of simi...
Subgroup discovery aims at finding subsets of a population whose class distribution is significantly different from the overall distribution. It has previously predominantly been...