For extracting the characteristics of a specific geographic entity, and notably a place, we propose to use dynamic Extreme Tagging Systems in combination with the classic approach of...
This paper considers the problem of identifying on the Web compound documents (cDocs), groups of web pages that in aggregate constitute semantically coherent information entities...
Inferring an appropriate DTD or XML Schema Definition (XSD) for a given collection of XML documents essentially reduces to learning deterministic regular expressions from sets of ...
Geert Jan Bex, Wouter Gelade, Frank Neven, Stijn V...
This paper is concerned with the problem of Imbalanced Classification (IC) in web mining, which often arises on the web due to the "Matthew Effect". As web IC applicatio...
The boom of product review websites, blogs and forums on the web has attracted many research efforts on opinion mining. Recently, there was a growing interest in the finer-grained ...
Qi Su, Xinying Xu, Honglei Guo, Zhili Guo, Xian Wu...