Semantic information helps in identifying the context of a document. It will be interesting to find out how effectively this information can be used in recommending related docume...
We present two machine learning approaches to information extraction from semi-structured documents that can be used if no annotated training data are available, but there does ex...
Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms to automatically classify te...
To reduce potential discrepancies between textual and graphical content in documentation, it is possible to produce both text and graphics from a single common source. One approac...
Documents often contain inherently many concepts reflecting specific and generic aspects. To automatically generate a short summary text of documents on similar topics, it is im...