In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
One of the most important challenges for researchers in the 21st century is related to global heating and climate change, which can result in the intensification of na...
Luciana A. S. Romani, Ana Maria Heuminski de Á...
DNS is one of the most actively used distributed databases on earth, accessed by millions of people every day to transparently convert host names into IP addresses and vice versa....
Background: DNA microarrays, which have been increasingly used to monitor mRNA transcripts at a global level, can provide detailed insight into cellular processes involved in resp...
Tao Han, Cathy D. Melvin, Leming M. Shi, William S...
This paper uses the URL word breaking task as an example to elaborate what we identify as crucial in designing statistical natural language processing (NLP) algorithms for Web scale ...
Kuansan Wang, Christopher Thrasher, Bo-June Paul H...