Sciweavers

OTM
2009
Springer
13 years 11 months ago
Merging Sets of Taxonomically Organized Data Using Concept Mappings under Uncertainty
Abstract. We present a method for using aligned ontologies to merge taxonomically organized data sets that have apparently compatible schemas, but potentially different semantics f...
David Thau, Shawn Bowers, Bertram Ludäscher
IPOM
2009
Springer
13 years 11 months ago
A Labeled Data Set for Flow-Based Intrusion Detection
Abstract. Flow-based intrusion detection has recently become a promising security mechanism in high speed networks (1-10 Gbps). Despite the richness in contributions in this field...
Anna Sperotto, Ramin Sadre, Frank van Vliet, Aiko ...
GBRPR
2009
Springer
13 years 11 months ago
On Computing Canonical Subsets of Graph-Based Behavioral Representations
The collection of behavior protocols is a common practice in human factors research, but the analysis of these large data sets has always been a tedious and time-consuming process....
Walter C. Mankowski, Peter Bogunovich, Ali Shokouf...
EMO
2009
Springer
13 years 11 months ago
Application of MOGA Search Strategy to SVM Training Data Selection
When training Support Vector Machine (SVM), selection of a training data set becomes an important issue, since the problem of overfitting exists with a large number of training da...
Tomoyuki Hiroyasu, Masashi Nishioka, Mitsunori Mik...
DEXA
2009
Springer
13 years 11 months ago
A Versatile Record Linkage Method by Term Matching Model Using CRF
We solve the problem of record linkage between databases where record fields are mixed and permuted in different ways. The solution method uses a conditional random fields model...
Quang Minh Vu, Atsuhiro Takasu, Jun Adachi
WSOM
2009
Springer
13 years 11 months ago
Analytic Comparison of Self-Organising Maps
Abstract. SOMs have proven to be a very powerful tool for data analysis. However, comparing multiple SOMs trained on the same data set using different parameters or initialisation...
Rudolf Mayer, Robert Neumayer, Doris Baum, Andreas...
ICASSP
2009
IEEE
13 years 11 months ago
Gaussian Backend design for open-set language detection
This paper proposes a new approach to the challenging open-set language detection task. Most state-of-the-art approaches make use of data sources with several out-of-set languages...
Mohamed Faouzi BenZeghiba, Jean-Luc Gauvain, Lori ...
EDBT
2009
ACM
13 years 11 months ago
On the comparison of microdata disclosure control algorithms
Privacy models such as k-anonymity and ℓ-diversity typically offer an aggregate or scalar notion of the privacy property that holds collectively on the entire anonymized data set....
Rinku Dewri, Indrajit Ray, Indrakshi Ray, Darrell ...
KDD
2008
ACM
14 years 5 months ago
Semi-supervised approach to rapid and reliable labeling of large data sets
Supervised classification methods have been shown to be very effective for a large number of applications. They require a training data set whose instances are labeled to indicate...
György J. Simon, Vipin Kumar, Zhi-Li Zhang
ICDT
2009
ACM
14 years 5 months ago
Optimal splitters for database partitioning with size bounds
Partitioning is an important step in several database algorithms, including sorting, aggregation, and joins. Partitioning is also fundamental for dividing work into equal-sized (o...
Kenneth A. Ross, John Cieslewicz