Gradient Boosted Regression Trees (GBRT) are the current state-of-the-art learning paradigm for machine learned websearch ranking — a domain notorious for very large data sets. ...
Stephen Tyree, Kilian Q. Weinberger, Kunal Agrawal...
Abstract. We deal with two important problems in pattern recognition that arise in the analysis of large datasets. While most feature subset selection methods use statistical techn...
A key challenge in applying kernel-based methods for discriminative learning is to identify a suitable kernel given a problem domain. Many methods instead transform the input data...
As the rapid growth of PDF document in digital libraries, recognizing the document structure and detecting specific document components are useful for document storage, classifica...
In this paper, we introduce weights into Pawlak rough set model to balance the class distribution of a data set and develop a weighted rough set based method to deal with the clas...