In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
As the use of Electronic Medical Records (EMRs) becomes more widespread, so does the need for effective information discovery on them. Recently proposed EMR standards are XML-based...
Testing reachability between nodes in a graph is a well-known problem with many important applications, including knowledge representation, program analysis, and more recently, bi...
Social media sharing web sites like Flickr allow users to annotate images with free tags, which significantly facilitate Web image search and organization. However, the tags assoc...
Nowadays numerous social images have been emerging on the Web. How to precisely label these images is critical to image retrieval. However, traditional image-level tagging methods...
Yang Yang, Yi Yang, Zi Huang, Heng Tao Shen, Feipi...