Sciweavers

WWW
2009
ACM
14 years 5 months ago
Detecting image spam using local invariant features and pyramid match kernel
Image spam is a new obfuscation method that spammers invented to bypass conventional text-based spam filters more effectively. In this paper, we extract local invariant features ...
Haiqiang Zuo, Weiming Hu, Ou Wu, Yunfei Chen, Guan...
WWW
2009
ACM
14 years 5 months ago
Network analysis of collaboration structure in Wikipedia
In this paper we give models and algorithms to describe and analyze the collaboration among authors of Wikipedia from a network analytical perspective. The edit network encodes wh...
Denise van Raaij, Jürgen Lerner, Patrick Keni...
WWW
2009
ACM
14 years 5 months ago
Inferring private information using social network data
On-line social networks, such as Facebook, are increasingly utilized by many users. These networks allow people to publish details about themselves and connect to their friends. S...
Jack Lindamood, Raymond Heatherly, Murat Kantarcio...
WWW
2009
ACM
14 years 5 months ago
Large scale integration of senses for the semantic web
Nowadays, the increasing amount of semantic data available on the Web leads to a new stage in the potential of Semantic Web applications. However, it also introduces new issues du...
Jorge Gracia, Mathieu d'Aquin, Eduardo Mena
WWW
2009
ACM
14 years 5 months ago
Graph based crawler seed selection
This paper identifies and explores the problem of seed selection in a web-scale crawler. We argue that seed selection is not trivial but is a very important problem. Selecting proper...
Shuyi Zheng, Pavel Dmitriev, C. Lee Giles
WWW
2009
ACM
14 years 5 months ago
Less talk, more rock: automated organization of community-contributed collections of concert videos
We describe a system for synchronization and organization of user-contributed content from live music events. We start with a set of short video clips taken at a single event by m...
Lyndon S. Kennedy, Mor Naaman
WWW
2009
ACM
14 years 5 months ago
Latent space domain transfer between high dimensional overlapping distributions
Transferring knowledge from one domain to another is challenging for a number of reasons. Since both conditional and marginal distribution of the training data and test data ar...
Sihong Xie, Wei Fan, Jing Peng, Olivier Verscheure...
WWW
2009
ACM
14 years 5 months ago
Incorporating site-level knowledge to extract structured data from web forums
Web forums have become an important data resource for many web applications, but extracting structured data from unstructured web forum pages is still a challenging task due to bo...
Jiang-Ming Yang, Rui Cai, Yida Wang, Jun Zhu, Lei ...
WWW
2009
ACM
14 years 5 months ago
Mapping the world's photos
We investigate how to organize a large collection of geotagged photos, working with a dataset of about 35 million images collected from Flickr. Our approach combines content analy...
David J. Crandall, Lars Backstrom, Daniel P. Hutte...