Documents in HTML format have many features to analyze, from the terms in special sections to the phrases that appear in the whole document. However, it is important to decide whi...
Regular expression pattern matching is widely used in computational biology. Searching through a database of sequences for a motif (a simple regular expression), or its variations...
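As a rough illustration of the kind of motif search this abstract refers to, the sketch below scans a small in-memory collection of sequences for a motif written as a simple regular expression. The sequences, their identifiers, the motif `TATA[AT]A`, and the helper name `find_motif` are all assumptions made for the example; none of them comes from the paper, and this is not the method the paper proposes.

```python
import re

# Hypothetical toy "database"; the IDs and sequences are invented for illustration.
SEQUENCES = {
    "seq1": "ATGCGTATAAAGGCTATAAT",
    "seq2": "GGGCCCGGGCCC",
    "seq3": "CCTATATAAGCG",
}

# Assumed motif: TATA, then A or T, then A (a TATA-box-like pattern).
MOTIF = re.compile(r"TATA[AT]A")


def find_motif(sequences, motif):
    """Return {sequence_id: [start offsets]} for every sequence containing the motif.

    Uses finditer, so only non-overlapping matches are reported.
    """
    hits = {}
    for seq_id, seq in sequences.items():
        starts = [m.start() for m in motif.finditer(seq)]
        if starts:
            hits[seq_id] = starts
    return hits


if __name__ == "__main__":
    for seq_id, starts in find_motif(SEQUENCES, MOTIF).items():
        print(seq_id, starts)
```

A linear scan like this touches every sequence for every query, which is the cost that indexing or hardware approaches to motif search typically aim to reduce.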
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
In genomic sequence analysis tasks like splice site recognition or promoter identification, large amounts of training sequences are available, and indeed needed to achieve suffici...
e about image features can be expressed as a hierarchical structure called a Type Abstraction Hierarchy (TAH). TAHs can be generated automatically by clustering algorithms based on...
Wesley W. Chu, Alfonso F. Cardenas, Ricky K. Taira