Sciweavers

563 search results - page 51 / 113
» Assessing the Quality of Natural Language Text Data
JMLR
2010
14 years 6 months ago
Covariance in Unsupervised Learning of Probabilistic Grammars
Probabilistic grammars offer great flexibility in modeling discrete sequential data like natural language text. Their symbolic component is amenable to inspection by humans, while...
Shay B. Cohen, Noah A. Smith
ICDAR
2009
IEEE
14 years 9 months ago
Page Rule-Line Removal Using Linear Subspaces in Monochromatic Handwritten Arabic Documents
In this paper we present a novel method for removing page rule lines in monochromatic handwritten Arabic documents using subspace methods with minimal effect on the quality of the...
Wael Abd-Almageed, Jayant Kumar, David S. Doermann
AI
2007
Springer
15 years 6 months ago
Learning the Semantic Meaning of a Concept from the Web
Many researchers have used text classification methods to solve the ontology mapping problem. Their mapping results heavily depend on the availability of quality exemplars used as...
Yang Yu, Yun Peng
SIGMOD
2008
ACM
16 years 6 hour ago
Building query optimizers for information extraction: the SQoUT project
Text documents often embed data that is structured in nature. This structured data is increasingly exposed using information extraction systems, which generate structured relation...
Alpa Jain, Panagiotis G. Ipeirotis, Luis Gravano
NLDB
2004
Springer
15 years 5 months ago
On Embedding Machine-Processable Semantics into Documents
Most Web and legacy paper-based documents are available in human-comprehensible text form, not readily accessible to or understood by computer programs. Here, we investigate an ...
Krishnaprasad Thirunarayan