Sciweavers

KDD
2008
ACM
15 years 11 months ago
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
UML
2005
Springer
15 years 4 months ago
Extending the Boundaries of Model-Based Development to Account for Errors
This paper presents an approach for relating informed task models and system models in the domain of safety critical interactive systems. The models, which are usually developed f...
Sandra Basnyat, Rémi Bastide, Philippe A. P...
WWW
2005
ACM
15 years 12 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins
KDD
2002
ACM
15 years 11 months ago
Learning domain-independent string transformation weights for high accuracy object identification
The task of object identification occurs when integrating information from multiple websites. The same data objects can exist in inconsistent text formats across sites, making it ...
Sheila Tejada, Craig A. Knoblock, Steven Minton
CIKM
2009
Springer
15 years 5 months ago
A case study on model driven data integration for data centric software development
Model Driven Data Integration is a data integration approach that proactively incorporates and utilizes metadata across the data integration process. By decoupling data and metada...
Hyeonsook Kim, Ying Zhang, Samia Oussena, Tony Cla...