The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
Digital watermarking is a growing research area concerned with marking digital content by embedding information in the content itself. Perceptual hashing is used to identify a specific content...
m, modules, types and operations), different kinds of abstractions (functional/data, types/objects etc.) without falling into a loose collection of diagram languages. Considering a...
We consider the problem of learning to rank relevant and novel documents so as to directly maximize a performance metric called Expected Global Utility (EGU), which has several de...
Scientists depend on literature search to find prior work that is relevant to their research ideas. We introduce a retrieval model for literature search that incorporates a wide ...