The semi-structured information available in HTML and similar documents provide valuable information that can be used for information extraction applications. This information tog...
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
This paper presents a novel method for the classification of images that combines information extracted from the images and contextual information. The main hypothesis is that con...
Medical data is often presented as free text in the form of medical reports. Such documents contain important information about patients, disease progression and management, but ar...
Fathi H. Saad, Beatriz de la Iglesia, Duncan G. Be...
Metadata plays an important role in discovering, collecting, extracting and aggregating Web data. This paper proposes a method of constructing metadata for a specific topic. The m...