Sciweavers

1577 search results - page 233 / 316
» Data Mining: Machine Learning, Statistics, and Databases
EMNLP
2008
15 years 1 months ago
Improved Sentence Alignment on Parallel Web Pages Using a Stochastic Tree Alignment Model
Parallel web pages are an important source of training data for statistical machine translation. In this paper, we present a new approach to sentence alignment on parallel web pages....
Lei Shi, Ming Zhou
DEBU
2000
14 years 11 months ago
Learning to Understand the Web
In a traditional information retrieval system, it is assumed that queries can be posed about any topic. In reality, a large fraction of web queries are posed about a relatively sm...
William W. Cohen, Andrew McCallum, Dallan Quass
FQAS
2004
Springer
15 years 3 months ago
Interactive Schema Integration with Sphinx
Abstract. The Internet has instigated a critical need for automated tools that facilitate integrating countless databases. Since non-technical end users are often the ultimate repo...
François Barbançon, Daniel P. Mirank...
TSE
2008
14 years 11 months ago
Privately Finding Specifications
Buggy software is a reality and automated techniques for discovering bugs are highly desirable. A specification describes the correct behavior of a program. For example, a file mus...
Westley Weimer, Nina Mishra
KDD
2006
ACM
16 years 8 days ago
GPLAG: detection of software plagiarism by program dependence graph analysis
Along with the blossom of open source projects comes the convenience for software plagiarism. A company, if less self-disciplined, may be tempted to plagiarize some open source pr...
Chao Liu 0001, Chen Chen, Jiawei Han, Philip S. Yu