We introduce an experimental search engine for multilingual and multimedia information, employing a holistic web interface and enabling the use of highly distributed indices. Moda...
Konstantinos Zagoris, Avi Arampatzis, Savvas A. Ch...
In this paper we introduce the webpage understanding problem which consists of three subtasks: webpage segmentation, webpage structure labeling, and webpage text segmentation and ...
We describe how a feature-based semantic lexicon can be automatically extended using large, unstructured text corpora. Experiments are carried out using the lexicon HaGenLex and th...
Sentiment analysis seeks to characterize opinionated or evaluative aspects of natural language text. We suggest here that appraisal expression extraction should be viewed as a fun...
Automatic document classification is an important step in organizing and mining documents. Information in documents is often conveyed using both text and images that complement ea...