As part of the Language Observatory Project [4], we have been crawling all the web space since 2004. We have collected terabytes of data mostly from Asian and African ccTLDs. In t...
Rizza Camus Caminero, Pavol Zavarsky, Yoshiki Mika...
Abstract The protection of customer privacy is a fundamental issue in today's corporate marketing strategies. Not surprisingly, many research efforts have proposed new privacy...
This work investigates the accuracy and efficiency tradeoffs between centralized and collective (distributed) algorithms for (i) sampling, and (ii) n-way data analysis techniques i...
In today’s data-rich networked world, people express many aspects of their lives online. It is common to segregate different aspects in different places: you might write opinion...
Dan Frankowski, Dan Cosley, Shilad Sen, Loren G. T...
Digital archives protect important data collections from failures by making multiple copies at other archives, so that there are always several good copies of a collection. In a c...