A web-portal providing access to over 250.000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...
Legal-RDF.org1 publishes a practical ontology that models both the layout and content of a document and metadata about the document; these have been built using data models implici...
Background: Feature selection is an important pre-processing task in the analysis of complex data. Selecting an appropriate subset of features can improve classification or cluste...
Assaf Gottlieb, Roy Varshavsky, Michal Linial, Dav...
SALSA is a link-based ranking algorithm that takes the result set of a query as input, extends the set to include additional neighboring documents in the web graph, and performs a...