We propose a novel approach to finding aliases of a given name on the web. We exploit a set of known names and their aliases as training data and extract lexical patterns that conv...
Extremists’ exploitation of computer-mediated communications such as online forums has recently gained much attention from academia and the government. However, due to the cover...
Decision tree induction algorithms scale well to large datasets because of their univariate, divide-and-conquer approach. However, they may fail to discover effective knowledge when...
Giovanni Giuffrida, Wesley W. Chu, Dominique M. Ha...
We propose a language-independent method for the automatic extraction of transliteration pairs from parallel corpora. In contrast to previous work, our method uses no form of supe...
Health care data from patients in the Arizona Health Care Cost Containment System, Arizona’s Medicaid program, provides a unique opportunity to exploit state-of-the-art data pro...