This paper explores the challenge of scaling up language processing algorithms to increasingly large datasets. While cluster computing has been available in commercial environment...
As modern multi-tier systems are becoming increasingly large and complex, it becomes more difficult for system analysts to understand the overall behavior of the system, and diag...
The meaning and mathematical consequences of linearity (managing without a presumed ability to copy) are studied for a path-based model of processes which is also a model of affi...
This paper studies the impact of written language variations and the way it affects the capitalization task over time. A discriminative approach, based on maximum entropy models, ...
Splitting compound words has proved to be useful in areas such as Machine Translation, Speech Recognition or Information Retrieval (IR). Furthermore, real-time IR systems (such as...