Background: We present a probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed. Whether or not a document ...
The core of our question-answering mechanism is searching for predefined patterns of textual expressions that may be interpreted as answers to certain types of questions. The pres...
There has been a tremendous growth in the amount of information and resources on the World Wide Web that are useful to researchers and practitioners in science domains. While the ...
Michael Chau, Zan Huang, Jialun Qin, Yilu Zhou, Hs...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...