Sciweavers

Search results for: Detecting Co-Derivative Documents in Large Text Collections
KDD
2008
ACM
14 years 6 months ago
Entity categorization over large document collections
Extracting entities (such as people, movies) from documents and identifying the categories (such as painter, writer) they belong to enable structured querying and data analysis ov...
Arnd Christian König, Rares Vernica, Venkates...
HIKM
2006
ACM
13 years 11 months ago
Automatic document indexing in large medical collections
Term extraction relates to extracting the most characteristic or important terms (words or phrases) in a document. This information is commonly used for improving the accuracy of ...
Angelos Hliaoutakis, Kalliopi Zervanou, Euripides ...
ITCC
2003
IEEE
13 years 11 months ago
A Method for Calculating Term Similarity on Large Document Collections
We present an efficient algorithm called the Quadtree Heuristic for identifying a list of similar terms for each unique term in a large document collection. Term similarity is de...
Wolfgang W. Bein, Jeffrey S. Coombs, Kazem Taghva
ICAIL
2005
ACM
13 years 11 months ago
Effective Document Clustering for Large Heterogeneous Law Firm Collections
Computational resources for research in legal environments have historically implied remote access to large databases of legal documents such as case law, statutes, law reviews an...
Jack G. Conrad, Khalid Al-Kofahi, Ying Zhao, Georg...
DL
2000
Springer
13 years 10 months ago
Snowball: extracting relations from large plain-text collections
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use...
Eugene Agichtein, Luis Gravano