Sciweavers

GFKL
2005
Springer
Quantitative Text Typology: The Impact of Sentence Length
Abstract. This study focuses on the contribution of sentence length to a quantitative text typology. To this end, 333 Slovenian texts are analyzed with regard to their sentence leng...
Emmerich Kelih, Peter Grzybek, Gordana Antic, Erns...
GFKL
2005
Springer
A Hybrid Machine Learning Approach for Information Extraction from Free Text
Abstract. We present a hybrid machine learning approach for information extraction from unstructured documents by integrating a learned classifier based on the Maximum Entropy Mod...
Günter Neumann
GFKL
2005
Springer
Automatic Extension of Feature-based Semantic Lexicons via Contextual Attributes
We describe how a feature-based semantic lexicon can be automatically extended using large, unstructured text corpora. Experiments are carried out using the lexicon HaGenLex and th...
Chris Biemann, Rainer Osswald
GFKL
2005
Springer
Attribute-aware Collaborative Filtering
One of the key challenges in large information systems such as online shops and digital libraries is to discover the relevant knowledge from the enormous volume of information. Rec...
Karen H. L. Tso, Lars Schmidt-Thieme
GFKL
2005
Springer
A Market Basket Analysis Conducted with a Multivariate Logit Model
Yasemin Boztug, Lutz Hildebrandt
GFKL
2005
Springer
Implications of Probabilistic Data Modeling for Mining Association Rules
Mining association rules is an important technique for discovering meaningful patterns in transaction databases. In the current literature, the properties of algorithms to mine ass...
Michael Hahsler, Kurt Hornik, Thomas Reutterer
GFKL
2005
Springer
Variable Selection for Discrimination of More Than Two Classes Where Data are Sparse
In classification, with an increasing number of variables, the required number of observations grows drastically. In this paper we present an approach to put into effect the maxi...
Gero Szepannek, Claus Weihs
GFKL
2005
Springer
Semiparametric Stepwise Regression to Estimate Sales Promotion Effects
Winfried Steiner, Christiane Belitz, Stefan Lang
GFKL
2005
Springer
Discovering Communities in Linked Data by Multi-view Clustering
Abstract. We consider the problem of finding communities in large linked networks such as web structures or citation networks. We review similarity measures for linked objects and...
Isabel Drost, Steffen Bickel, Tobias Scheffer
GFKL
2005
Springer
Near Similarity Search and Plagiarism Analysis
Abstract. Existing methods for text plagiarism analysis are mainly based on “chunking”, a process of grouping a text into meaningful units each of which gets encoded by an integer nu...
Benno Stein, Sven Meyer zu Eissen