Sciweavers

» Data Mining: Machine Learning, Statistics, and Databases
EMNLP
2008
15 years 3 months ago
Improved Sentence Alignment on Parallel Web Pages Using a Stochastic Tree Alignment Model
Parallel web pages are an important source of training data for statistical machine translation. In this paper, we present a new approach to sentence alignment on parallel web pages....
Lei Shi, Ming Zhou
DEBU
2000
101 views
15 years 1 months ago
Learning to Understand the Web
In a traditional information retrieval system, it is assumed that queries can be posed about any topic. In reality, a large fraction of web queries are posed about a relatively sm...
William W. Cohen, Andrew McCallum, Dallan Quass
FQAS
2004
Springer
135 views, Database
15 years 5 months ago
Interactive Schema Integration with Sphinx
Abstract. The Internet has instigated a critical need for automated tools that facilitate integrating countless databases. Since non-technical end users are often the ultimate repo...
François Barbançon, Daniel P. Mirank...
TSE
2008
91 views
15 years 1 months ago
Privately Finding Specifications
Buggy software is a reality and automated techniques for discovering bugs are highly desirable. A specification describes the correct behavior of a program. For example, a file mus...
Westley Weimer, Nina Mishra
KDD
2006
ACM
381 views, Data Mining
16 years 2 months ago
GPLAG: detection of software plagiarism by program dependence graph analysis
Along with the blossom of open source projects comes the convenience for software plagiarism. A company, if less self-disciplined, may be tempted to plagiarize some open source pr...
Chao Liu 0001, Chen Chen, Jiawei Han, Philip S. Yu