Sciweavers

1261 search results - page 242 / 253
» Extracting Text from PostScript
WSDM
2010
ACM
204 views, Data Mining
15 years 4 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
ICSM
2005
IEEE
15 years 3 months ago
Co-Change Visualization
Clustering layouts of software systems combine two important aspects: they reveal groups of related artifacts of the software system, and they produce a visualization of the resul...
Dirk Beyer
AGENTS
1997
Springer
15 years 1 month ago
A Scalable Comparison-Shopping Agent for the World-Wide Web
The World-Wide-Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics...
Robert B. Doorenbos, Oren Etzioni, Daniel S. Weld
DMIN
2006
146 views, Data Mining
14 years 11 months ago
A Comparison of Two Document Clustering Approaches for Clustering Medical Documents
Medical data is often presented as free text in the form of medical reports. Such documents contain important information about patients, disease progression and management, but ar...
Fathi H. Saad, Beatriz de la Iglesia, Duncan G. Be...
SEMWEB
2010
Springer
14 years 7 months ago
Supporting Natural Language Processing with Background Knowledge: Coreference Resolution Case
Systems based on statistical and machine learning methods have been shown to be extremely effective and scalable for the analysis of large amounts of textual data. However, in the r...
Volha Bryl, Claudio Giuliano, Luciano Serafini, Ka...