Large-scale data analysis has become increasingly important for many enterprises. Recently, a new distributed computing paradigm, called MapReduce, and its open source implementat...
From Proc. CAiSE05 LNCS 3520, Pages 460-474 c Springer-Verlag 2005 Semi-structured data sources, such as XML, HTML or CSV files, present special problems when performing data int...
The fundamental drawback of unstructured peer-to-peer (P2P) networks is the flooding-based query processing protocol that seriously limits their scalability. As a result, a signi...
: In many application areas like e-science and data-warehousing detailed information about the origin of data is required. This kind of information is often referred to as data pro...
In this demonstration, we will present LEARNPADS, a fully automatic system for generating ad hoc data processing tools. When presented with a collection of ad hoc data, the system...