We consider the problem of embedding arbitrary objects (e.g., images, audio, documents) into Euclidean space subject to a partial order over pairwise distances. Partial order cons...
Recent progress in information extraction technology has enabled a vast array of applications that rely on structured data that is embedded in natural-language text. In particular...
We present an approach of how to automatically extract an XML document structure from a conceptual data model that describes the content of the document. We use UML class diagrams ...
We describe a compression technique for semistructured documents, called SCMPPM, which combines the Prediction by Partial Matching technique with Structural Contexts Model (SCM) t...
Large repositories of source code create new challenges and opportunities for statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated c...
Erik Linstead, Paul Rigor, Sushil Krishna Bajracha...