Sciweavers

KDD
1997
ACM

SIPping from the Data Firehose

When mining large databases, the data extraction problem and the interface between the database and data mining algorithm become important issues. Rather than giving a mining algorithm full access to a database (by extracting to a flat file or other directly accessible data structure), we propose the SQL Interface Protocol (SIP), which is a framework for interaction between a mining algorithm and a database. The data continues to reside entirely within the database management system (DBMS), but the query interface to the database gives the data mining algorithm sufficient information to discover the same patterns it would have found with direct access to the data. This model of interaction brings several advantages; for example, it allows a mining algorithm to be parallelized automatically just by using a parallelized DBMS to answer queries. We show how two families of mining algorithms may be implemented as “SIPpers,” and we discuss related work in databases that should furthe...
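
The interaction described in the abstract can be illustrated with a minimal sketch. It is not the paper's SIP specification: the abstract does not name the two algorithm families, so the example assumes a learner that needs only counts of (attribute value, class label) pairs, such as a naive Bayes-style classifier. The table `weather`, its columns, and the helper names `class_conditional_counts` and `demo` are hypothetical, and SQLite stands in for the DBMS. The point is only that the rows never leave the database; the mining code sees aggregated counts returned by a SQL query.

```python
import sqlite3


def class_conditional_counts(conn, table, attribute, label):
    """Ask the DBMS for counts of (attribute value, class label) pairs.

    Hypothetical helper: the mining code issues one aggregate query
    instead of reading the raw rows itself.
    """
    # Identifiers cannot be bound as SQL parameters, so this sketch
    # assumes the table and column names come from a trusted source.
    query = (
        f'SELECT "{attribute}", "{label}", COUNT(*) '
        f'FROM "{table}" GROUP BY "{attribute}", "{label}"'
    )
    return {(value, cls): n for value, cls, n in conn.execute(query)}


def demo():
    """Build a tiny in-memory table (assumed data) and query it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE weather (outlook TEXT, play TEXT)")
    conn.executemany(
        "INSERT INTO weather VALUES (?, ?)",
        [
            ("sunny", "no"), ("sunny", "no"), ("sunny", "yes"),
            ("rain", "yes"), ("rain", "yes"), ("overcast", "yes"),
        ],
    )
    counts = class_conditional_counts(conn, "weather", "outlook", "play")
    conn.close()
    return counts


if __name__ == "__main__":
    for key, n in sorted(demo().items()):
        print(key, n)
```

Because the aggregation runs inside the DBMS, any parallelism the database applies to the GROUP BY query is inherited by the mining step without changes to the mining code, which is the kind of advantage the abstract mentions.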
George H. John, Brian Lent
Added: 08 Aug 2010
Updated: 08 Aug 2010
Type: Conference
Year: 1997
Where: KDD
Authors: George H. John, Brian Lent