Information extraction from HTML pages has been conventionally treated as plain text documents extended with HTML tags. However, the growing maturity and correct usage of HTML/XHT...
Name tagging is a critical early stage in many natural language processing pipelines. In this paper we analyze the types of errors produced by a tagger, distinguishing name classi...
Thereis a wealthof informationto be minedfromnarrative text on the WorldWideWeb.Unfortunately, standard natural language processing (NLP)extraction techniques expect full, grammat...
Data mining methods are successful in educational environments to discover new knowledge or learner skills or features. Unfortunately, they have not been used in depth with collabo...
Non-negative tensor factorization (NTF) is a relatively new technique that has been successfully used to extract significant characteristics from polyadic data, such as data in s...