A model of web search with double checking is proposed to explore the web as a live corpus. Five association measures including variants of Dice, Overlap Ratio, Jaccard, and Cosine, ...
The present study explores the vocal intensity of turn-initial cue phrases in a corpus of dialogues in Swedish. Cue phrases convey relatively little propositional content, but hav...
Spelling errors when typing a URL can be exploited by website-squatters: users are led to phony sites in a phenomenon we call parasitic URL naming. These phony sites im...
Out-of-vocabulary (OOV) words are problematic for cross-language information retrieval. One way to deal with OOV words when the two languages have different alphabets is to trans...
Background: Frequently, several alternative names are in use for biological objects such as genes and proteins. Applications like manual literature search, automated text-mining, ...