We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table– a data structure that better lends itself to high-level...
Tables on web pages contain a huge amount of semantically explicit information, which makes them a worthwhile target for automatic information extraction and knowledge acquisition...
Background: We present a novel method of protein fold decoy discrimination using machine learning, more specifically using neural networks. Here, decoy discrimination is represent...
Web forums have become an important data resource for many web applications, but extracting structured data from unstructured web forum pages is still a challenging task due to bo...
Jiang-Ming Yang, Rui Cai, Yida Wang, Jun Zhu, Lei ...
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...