In some applications such as filling in a customer information form on the web, some missing values may not be explicitly represented as such, but instead appear as potentially va...
Real-world databases often contain syntactic and semantic errors, in spite of integrity constraints and other safety measures incorporated into modern DBMSs. We present ERACER, an...
Detecting outliers in data is an important problem with interesting applications in a myriad of domains ranging from data cleaning to financial fraud detection and from network i...
Gustavo Henrique Orair, Carlos Teixeira, Ye Wang, ...
Analyzing the quality of data prior to constructing data mining models is emerging as an important issue. Algorithms for identifying noise in a given data set can provide a good me...
Jason Van Hulse, Taghi M. Khoshgoftaar, Haiying Hu...
Millions of users retrieve information from the Internet using search engines. Mining these user sessions can provide valuable information about the quality of user experience and...