Information available in the Internet is frequently supplied simply as plain ascii text, structured according to orthographic and semantic conventions. Traditional document classi...
Many approaches to Information Extraction (IE) have been proposed in literature capable of finding and extract specific facts in relatively unstructured documents. Their applicatio...
WHIRL is an extensionof relational databasesthat canperform "soft joins" basedon the similarity of textual identifiers;thesesoftjoins extendthe traditional operationof j...