Towards combining web classification and web information extraction: a case study


Download www.hpl.hp.com

Ping Luo, Fen Lin, Yuhong Xiong, Yong Zhao, Zhongzhi Shi
HP Laboratories, HPL-2009-86
Keywords: Classification, Information extraction, Graphical model

Abstract: Web content analysis often has two sequential and separate steps: Web Classification to identify the target Web pages and Web Information Extraction to extract the metadata contained in the target Web pages. This decoupled strategy is highly ineffective since the errors in Web classification will be propagated to Web information extraction and eventually accumulate to a high level. In this paper we study the mutual dependencies between these two steps and propose to combine them by using a model of Conditional Random Fields (CRFs). This model can be used to simultaneously recognize the target Web pages and extract the corresponding metadata. Systematic experiments in our project OfCourse for online course search show that this model significantly improves the F1 ...



Data Mining | Graphical Model Web | KDD 2009 | Target Web Pages | Web Information Extraction


» Towards Flexible Mashup of Web Applications Based on Information Extraction and Transfer

» Dynamic Aggregation to Support Pattern Discovery A Case Study with Web Logs

» Combining Data and Text Mining Techniques for Yeast Gene Regulation Prediction A Case Stud...

» SWHi System Description A Case Study in Information Retrieval Inference and Visualization ...

» Scalable browsing for large collections a case study

» Automatic Location and Separation of Records A Case Study in the Genealogical Domain

» WEB Image Classification Based on the Fusion of Image and Text Classifiers

» Bringing Web 2.0 to government research: a case study

Post Info

Added	25 Nov 2009
Updated	25 Nov 2009
Type	Conference
Year	2009
Where	KDD
Authors	Ping Luo, Fen Lin, Yuhong Xiong, Yong Zhao, Zhongzhi Shi


Sciweavers
