A wealth of information is available only in web pages, patents, publications etc. Extracting information from such sources is challenging, both due to the typically complex langu...
Most databases contain “name constants” like course numbers, personal names, and place names that correspond to entities in the real world. Previous work in integration of het...
Mining frequent trees is very useful in domains like bioinformatics, web mining, mining semi-structured data, and so on. We formulate the problem of mining (embedded) subtrees in ...
Twarql is an infrastructure translating microblog posts from Twitter as Linked Open Data in real-time. The approach employed in Twarql can be summarized as follows: (1) extract co...
Pablo N. Mendes, Alexandre Passant, Pavan Kapanipa...
Objective: We present an integrated set of technologies, known as the Hippocratic Database, that enable healthcare enterprises to comply with privacy and security laws without imp...