Repositories of code written by end-user programmers are beginning to emerge, but when a piece of code is new or nobody has yet reused it, then current repositories provide users ...
Christopher Scaffidi, Christopher Bogart, Margaret...
Geographical gazetteers are necessary in a wide variety of applications. In the past, the construction of such gazetteers has been a tedious, manual process and only recently have...
Adrian Popescu, Gregory Grefenstette, Houda Bouamo...
Abstract. This paper presents a simple unsupervised learning algorithm for recognizing synonyms, based on statistical data acquired by querying a Web search engine. The algorithm, ...
Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction aim to so...
The World-Wide Web (WWW) is an ever growing, distributed, non-administered, global information resource. It resides on the worldwide computer network and allows access to heteroge...