Web page classification is important to many tasks in information retrieval and web mining. However, applying traditional textual classifiers on web data often produces unsatisfyi...
Wikipedia is the largest monolithic repository of human knowledge. In addition to its sheer size, it represents a new encyclopedic paradigm by interconnecting articles through hyp...
We describe efficient techniques for construction of large term co-occurrence graphs, and investigate an application to the discovery of numerous fine-grained (specific) topics. A...
The grand tour, one of the most popular methods for multidimensional data exploration, is based on orthogonally projecting multidimensional data to a sequence of lower dimensional...
Linear Discriminant Analysis (LDA) has been a popular method for extracting features which preserve class separability. It has been widely used in many fields of information proces...