We present a novel method for extracting parallel sub-sentential fragments from comparable, non-parallel bilingual corpora. By analyzing potentially similar sentence pairs using a...
tween documents. They should allow for an abstract representation of data which resembles the way they are actually perceived and used in the real world, thus shortening (with resp...
Learning structured representations has emerged as an important problem in many domains, including document and Web data mining, bioinformatics, and image analysis. One approach t...
Anon Plangprasopchok, Kristina Lerman, Lise Getoor
Test collections are the primary drivers of progress in information retrieval. They provide a yardstick for assessing the effectiveness of ranking functions in an automatic, rapi...
Nima Asadi, Donald Metzler, Tamer Elsayed, Jimmy L...
A large fraction of the useful web comprises of specification documents that largely consist of hattribute name, numeric valuei pairs embedded in text. Examples include product in...