Term extraction relates to extracting the most characteristic or important terms (words or phrases) in a document. This information is commonly used for improving the accuracy of ...
— We present a general approach for the hierarchical segmentation and labeling of document layout structures. This approach models document layout as a grammar and performs a glo...
We present a diff algorithm for XML data. This work is motivated by the support for change control in the context of the Xyleme project that is investigating dynamic warehouses ca...
With the rapid growth of the Internet, most of the textual data in the form of newspapers, magazines and journals tend to be available on-line. Summarizing these texts can aid the...
P. Arun Kumar, K. Praveen Kumar, T. Someswara Rao,...
In [3], we introduced a framework for querying and updating probabilistic information over unordered labeled trees, the probabilistic tree model. The data model is based on trees ...