The effectiveness and scalability of MapReduce-based implementations of complex data-intensive tasks depend on an even redistribution of data between map and reduce tasks. In the...
A prerequisite for leveraging the vast amount of data available on the Web is Entity Resolution, i.e., the process of identifying and linking data that describe the same real-worl...
George Papadakis, Ekaterini Ioannou, Claudia Niede...
In bibliographies like DBLP and Citeseer, there are three kinds of entity-name problems that need to be solved. First, multiple entities share one name, which is called the name sh...
Entity Resolution (ER) is an important real world problem that has attracted significant research interest over the past few years. It deals with determining which object descript...
Zhaoqi Chen, Dmitri V. Kalashnikov, Sharad Mehrotr...
— As person names are non-unique, the same name on different Web pages might or might not refer to the same real-world person. This entity identification problem is one of the m...