IIWAS
2008
Combining content extraction heuristics: the CombinE system
The main text content of an HTML document on the WWW is typically surrounded by additional content, such as navigation menus, advertisements, link lists, or design elements. Conte...
Thomas Gottron