A given entity, representing a person, a location or an organization, may be mentioned in text in multiple, ambiguous ways. Understanding natural language requires identifying whe...
Metric distances and the more general concept of dissimilarities are widely used tools in instance-based learning methods, especially in the nearest-neighbor classification...
We describe efficient techniques for construction of large term co-occurrence graphs, and investigate an application to the discovery of numerous fine-grained (specific) topics. A...
This paper presents an approach to exploit free text descriptions of TV programmes as available from EPG data sets for a recommendation system that takes the content of programm...
Text classification is a widely studied topic in the area of machine learning. A number of techniques have been developed to represent and classify text documents. Most of the t...