Biosequences typically have a small alphabet, a long length, and patterns containing gaps (i.e., “don’t care”) of arbitrary size. Mining frequent patterns in such sequences ...
Discovery of sequential patterns is becoming increasingly useful and essential in many scientific and commercial domains. Enormous sizes of available datasets and possibly large nu...
Background: Identification of protein interacting sites is an important task in computational molecular biology. As more and more protein sequences are deposited without available...
To efficiently find global patterns from a multi-database, information in each local database must first be mined and summarized at the local level. Then only the summarized infor...
Since sequential patterns may exist in multiple sequence databases, we propose algorithm PropagatedMine+ to efficiently discover multi-domain sequential patterns. Prior works ...