Sciweavers

735 search results - page 27 / 147
» Corpora and data preparation
MLDM
2005
Springer
15 years 3 months ago
Supervised Evaluation of Dataset Partitions: Advantages and Practice
In the context of large databases, data preparation takes on greater importance: instances and explanatory attributes have to be carefully selected. In supervised learning, instanc...
Sylvain Ferrandiz, Marc Boullé
ICML
2005
IEEE
15 years 10 months ago
Hierarchical Dirichlet model for document classification
The proliferation of text documents on the web as well as within institutions necessitates their convenient organization to enable efficient retrieval of information. Although tex...
Sriharsha Veeramachaneni, Diego Sona, Paolo Avesan...
PRIB
2009
Springer
15 years 2 months ago
Semi-supervised Prediction of Protein Interaction Sentences Exploiting Semantically Encoded Metrics
Protein-protein interaction (PPI) identification is an integral component of many biomedical research and database curation tools. Automation of this task through classification ...
Tamara Polajnar, Mark A. Girolami
COLING
2008
14 years 11 months ago
Grammar Comparison Study for Translational Equivalence Modeling and Statistical Machine Translation
This paper presents a general platform, namely synchronous tree sequence substitution grammar (STSSG), for the grammar comparison study in Translational Equivalence Modeling (TEM)...
Min Zhang, Hongfei Jiang, Haizhou Li, AiTi Aw, She...
FLAIRS
2006
14 years 11 months ago
Corpus Based Unsupervised Labeling of Documents
Text categorization involves mapping of documents to a fixed set of labels. A similar but equally important problem is that of assigning labels to large corpora. With a deluge of ...
Delip Rao, Deepak P, Deepak Khemani