With the increasing ubiquity of sensors and computational resources, it is becoming easier and increasingly common for people to electronically record photographs, text, audio, a...
Stochastic context-free grammars (SCFGs) have long been recognized as useful for a large variety of tasks including natural language processing, morphological parsing, speech reco...
In most IR clustering problems, we directly cluster the documents, working in the document space, using cosine similarity between documents as the similarity measure. In many real...
A popular solution to internet performance problems is the widespread caching of data. Many caching algorithms have been proposed in the literature, most attempting to optimize fo...
Ganesh Santhanakrishnan, Ahmed Amer, Panos K. Chry...
While much of the data on the web is unstructured in nature, there is also a significant amount of embedded structured data, such as product information on e-commerce sites or sto...