Statistical machine learning techniques for data classification usually assume that all entities are i.i.d. (independent and identically distributed). However, real-world entities...
Concepts are sequences of words that represent real or imaginary entities or ideas that users are interested in. As a first step towards building a web of concepts that will form...
Aditya G. Parameswaran, Hector Garcia-Molina, Anan...
One challenge in text processing is the treatment of case insensitive documents such as speech recognition results. The traditional approach is to re-train a language model exclud...
Cheng Niu, Wei Li 0003, Jihong Ding, Rohini K. Sri...
Exploiting lexical and semantic relationships in large unstructured text collections can significantly enhance managing, integrating, and querying information locked in unstructur...
Related entity finding is the task of returning a ranked list of homepages of relevant entities of a specified type that need to engage in a given relationship with a given sour...