Abstract. The focus of web search is moving away from returning relevant documents towards returning structured data as results to user queries. A vital part in the architecture of...
Heterogeneous data co-clustering has attracted more and more attention in recent years due to its high impact on various applications. While the co-clustering algorithms for two t...
Bin Gao, Tie-Yan Liu, Xin Zheng, QianSheng Cheng, ...
We develop a new component analysis framework, the Noisy-Or Component Analyzer (NOCA), that targets high-dimensional binary data. NOCA is a probabilistic latent variable model tha...
A wealth of information is available only in web pages, patents, publications etc. Extracting information from such sources is challenging, both due to the typically complex langu...
Large XML data files, or XML databases, are now a common way to distribute scientific and bibliographic data, and storing such data efficiently is an important concern. A number o...