Sciweavers

2424 search results - page 411 / 485
Search query: A general algorithm for data dependence analysis
SDM
2009
SIAM
16 years 14 days ago
Straightforward Feature Selection for Scalable Latent Semantic Indexing.
Latent Semantic Indexing (LSI) has been validated to be effective on many small scale text collections. However, little evidence has shown its effectiveness on unsampled large sca...
Jun Yan, Shuicheng Yan, Ning Liu, Zheng Chen
WWW
2010
ACM
15 years 10 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
ICSE
2009
IEEE-ACM
15 years 10 months ago
WISE: Automated test generation for worst-case complexity
Program analysis and automated test generation have primarily been used to find correctness bugs. We present complexity testing, a novel automated test generation technique to ...
Jacob Burnim, Sudeep Juvekar, Koushik Sen
CIKM
2009
ACM
15 years 9 months ago
Graph-based seed selection for web-scale crawlers
One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...
Shuyi Zheng, Pavel Dmitriev, C. Lee Giles
MM
2009
ACM
15 years 9 months ago
Semi-supervised topic modeling for image annotation
We propose a novel technique for semi-supervised image annotation which introduces a harmonic regularizer based on the graph Laplacian of the data into the probabilistic semantic ...
Yuanlong Shao, Yuan Zhou, Xiaofei He, Deng Cai, Hu...