We present two machine learning approaches to information extraction from semi-structured documents that can be used if no annotated training data are available, but there does ex...
With the proliferation of user-generated articles over the web, it becomes imperative to develop automated methods that are aware of the ideological-bias implicit in a document co...
We study graphical modeling in the case of stringvalued random variables. Whereas a weighted finite-state transducer can model the probabilistic relationship between two strings, ...
A software product line (SPL) is a set of software systems with well-defined commonalities and variabilities that are developed by managed reuse of common artifacts. In this pape...
We address the problem of simplifying Portuguese texts at the sentence level treating it as a "translation task". We use the Statistical Machine Translation (SMT) framewo...