MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. Hadoop is an open-...
Matei Zaharia, Andy Konwinski, Anthony D. Joseph, ...
Data-intensive applications are increasingly designed to execute on large computing clusters. Grouped aggregation is a core primitive of many distributed programming models, and i...
— This paper presents a distributed version of our previous work, called SAFDetection, which is a sensor analysisbased fault detection approach that is used to monitor tightlycou...
High findability of documents within a certain cut-off rank is considered an important factor in recall-oriented application domains such as patent or legal document retrieval. ...
We have studied the problem of linking event information across different languages without the use of translation systems or dictionaries. The linking is based on interlingua in...