A wealth of information is available on the Web. But often, such data are hidden behind form interfaces which allow only a restrictive set of queries over the underlying databases...
We propose a novel HMM-based framework to accurately transliterate unseen named entities. The framework leverages features in letteralignment and letter n-gram pairs learned from ...
Bing Zhao, Nguyen Bach, Ian R. Lane, Stephan Vogel
—Web 2.0 applications, including blogs, wikis and social networking sites, pose challenging privacy issues. Many users are unaware that search engines index personal information ...
Michael Hart, Claude Castille, Rob Johnson, Amanda...
A re-ranking technique,called “PageRank brings a successful story behind the search engine. Many studies focus on finding an way to compute the PageRank scores of a large web gr...
In this paper we use a Unified Relationship Matrix (URM) to represent a set of heterogeneous data objects (e.g., web pages, queries) and their interrelationships (e.g., hyperlinks...
Wensi Xi, Edward A. Fox, Weiguo Fan, Benyu Zhang, ...