A ubiquitous city (U-city) is one in which everything is interconnected with everything else and information is shared instantaneously. In a U-city, people can access a variety of web data in ...
Structure analysis of table form documents is an important issue because a printed document and even an electronic document do not provide logical structural information but merely...
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Lexicon development and Part of Speech (POS) tagging are very important for almost all Natural Language Processing (NLP) application areas. The rapid development of these resources...
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...