Previously topic models such as PLSI (Probabilistic Latent Semantic Indexing) and LDA (Latent Dirichlet Allocation) were developed for modeling the contents of plain texts. Recent...
The challenge of managing unstructured data represents perhaps the largest data management opportunity for our community since managing relational data. And yet we are risking let...
AnHai Doan, Jeffrey F. Naughton, Akanksha Baid, Xi...
Many applications make use of named entity classification. Machine learning is the preferred technique adopted for many named entity classification methods where the choice of feat...
Current knowledge bases suffer from either low coverage or low accuracy. The underlying hypothesis of this work is that user feedback can greatly improve the quality of automatica...
Gjergji Kasneci, Jurgen Van Gael, Ralf Herbrich, T...
We present two methods for determining the sentiment expressed by a movie review. The semantic orientation of a review can be positive, negative, or neutral. We examine the effect...