Sciweavers

8795 search results - page 158 / 1759
» Measuring Generality of Documents
CIKM
2008
Springer
15 years 6 months ago
Efficient techniques for document sanitization
Sanitization of a document involves removing sensitive information from the document, so that it may be distributed to a broader audience. Such sanitization is needed while declas...
Venkatesan T. Chakaravarthy, Himanshu Gupta, Prasa...
DEXA
2005
Springer
15 years 6 months ago
On the Midpoint of a Set of XML Documents
The WWW contains a huge amount of documents. Some of them share the subject, but are generated by different people or even organizations. To guarantee the interchange of such docu...
Alberto Abelló, Xavier de Palol, Mohand-Sai...
RIAO
2007
15 years 5 months ago
From Layout to Semantic: a Reranking Model for Mapping Web Documents to Mediated XML Representations
Many documents on the Web are formatted in a weakly structured format. Because of their weak semantic and because of the heterogeneity of their formats, the information conveyed by...
Guillaume Wisniewski, Patrick Gallinari
JUCS
2008
15 years 4 months ago
A Generic Architecture for the Conversion of Document Collections into Semantically Annotated Digital Archives
Mass digitization of document collections with further processing and semantic annotation is an increasing activity among libraries and archives at large for preservation, browsi...
Josep Lladós, Dimosthenis Karatzas, Joan Ma...
CIKM
2010
Springer
15 years 1 months ago
Crawling the web for structured documents
Structured Information Retrieval is gaining a lot of interest in recent years, as this kind of information is becoming an invaluable asset for professional communities such as Sof...
Julián Urbano, Juan Loréns, Yorgos A...