We describe a method for generating accurate, compact, human understandable text classifiers. Text datasets are indexed using Apache Lucene and Genetic Programs are used to constr...
Due to the popularity of the XML data format, several query languages for XML have been proposed, specially devised to handle data whose structure is unknown, loose, or absent. Wh...
Daniela Florescu, Donald Kossmann, Ioana Manolescu
Scalable similarity search is the core of many large scale learning or data mining applications. Recently, many research results demonstrate that one promising approach is creatin...
Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive...
Panagiotis G. Ipeirotis, Eugene Agichtein, Pranay ...
Web search engines often federate many user queries to relevant structured databases. For example, a product related query might be federated to a product database containing thei...