As XML database sizes grow, the amount of space used for storing the data and auxiliary data structures becomes a major factor in query and update performance. This paper presents...
Rule systems have failed to attract much interest in large data analysis problems because they tend to be too simplistic to be useful or consist of too many rules for human interpr...
Web search is challenging partly due to the fact that search queries and Web documents use different language styles and vocabularies. This paper provides a quantitative analysis ...
To improve data availability and resilience MapReduce frameworks use file systems that replicate data uniformly. However, analysis of job logs from a large production cluster show...
The increasing availability of electronic communication data, such as that arising from e-mail exchange, presents social and information scientists with new possibilities for char...
R. Dean Malmgren, Jake M. Hofman, Luis A. N. Amara...