We consider the problem of template-independent news extraction. The state-of-the-art news extraction method is based on template-level wrapper induction, which has two serious li...
Junfeng Wang, Xiaofei He, Can Wang, Jian Pei, Jiaj...
This paper describes how to use the HTMLEditorKit to perform web data mining on EDGAR (Electronic Data-Gathering, Analysis, and Retrieval system). EDGAR is the SEC's (U.S. Secur...
Applying meta search systems is a suitable method to support the user if there are many different services. Due to information splitting strategies of literature services existing ...
This paper focuses on the ‘user browsing graph’, which is constructed from users’ click-through behavior modeled with Web access logs. The user browsing graph has recently been adopt...
A digital video library of over 900 hours of video and 18000 stories from The HistoryMakers was used by 266 students, faculty, librarians, and life-long learners interacting with ...
Michael G. Christel, Scott M. Stevens, Bryan Maher...