Abstract. This paper presents a simple unsupervised learning algorithm for recognizing synonyms, based on statistical data acquired by querying a Web search engine. The algorithm, ...
Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction aim to so...
Data acquisition is a major concern in text classification. The excessive human efforts required by conventional methods to build up quality training collection might not always b...
Historical prices are important information that can help consumers decide whether the time is right to buy a product. They provide both a context to the users, and facilitate the...
As semantic markup becomes ubiquitous, it will become important to be able to ask queries and obtain answers, using natural language (NL) expressions, rather than the keyword-based...