Automatic document classification is an important step in organizing and mining documents. Information in documents is often conveyed using both text and images that complement ea...
We present a novel probabilistic multiple cause model for binary observations. In contrast to other approaches, the model is linear and it infers reasons behind both observed and ...
Unlike conventional data or text, Web pages typically contain a large amount of information that is not part of the main contents of the pages, e.g., banner ads, navigation bars, ...
Uploading tourist photos is a popular activity on photo sharing platforms. These photographs and their associated metadata (tags, geo-tags, and temporal information) should be use...
If Kolmogorov complexity [25] measures information in one object and Information Distance [4, 23, 24, 42] measures information shared by two objects, how do we measure information...