Sciweavers

NIPS
2007
15 years 1 month ago
Mining Internet-Scale Software Repositories
Large repositories of source code create new challenges and opportunities for statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated c...
Erik Linstead, Paul Rigor, Sushil Krishna Bajracha...
JCDL
2009
ACM
15 years 6 months ago
Finding topic trends in digital libraries
We propose a generative model based on latent Dirichlet allocation for mining distinct topics in document collections by integrating the temporal ordering of documents into the ge...
Levent Bolelli, Seyda Ertekin, Ding Zhou, C. Lee G...
ADBIS
2008
Springer
15 years 6 months ago
Evaluating Performance and Quality of XML-Based Similarity Joins
A similarity join correlating fragments in XML documents, which are similar in structure and content, can be used as the core algorithm to support data cleaning and data integratio...
Leonardo Ribeiro, Theo Härder
CIKM
2008
Springer
15 years 1 month ago
Web-scale named entity recognition
Automatic recognition of named entities such as people, places, organizations, books, and movies across the entire web presents a number of challenges, both of scale and scope. Da...
Casey Whitelaw, Alexander Kehlenbeck, Nemanja Petr...
DOCENG
2004
ACM
15 years 5 months ago
Creation of topic map by identifying topic chain in Chinese
XML Topic maps enable multiple, concurrent views of sets of information objects and can be used for different applications. For example, thesaurus-like interfaces to corpora, navig...
Ching-Long Yeh, Yi-Chun Chen