When humans approach the task of text categorization, they interpret the specific wording of the document in the much larger context of their background knowledge and experience. ...
Rescaling is possibly the most popular approach to cost-sensitive learning. This approach works by rescaling the classes according to their costs, and it can be realized in differ...
Most information extraction systems either use hand written extraction patterns or use a machine learning algorithm that is trained on a manually annotated corpus. Both of these a...
Recently web-based educational systems collect vast amounts of data on user patterns, and data mining methods can be applied to these databases to discover interesting associations...
Behrouz Minaei-Bidgoli, Gerd Kortemeyer, William F...
This study borrowed sequence analysis techniques from the genetic sciences and applied them to a similar problem in email filtering and web searching. Genre identification is the ...