Supervised Knowledge Discovery From Incomplete Data
Price
Free (open access)
Volume
25
Pages
10
Published
2000
Size
1,618 kb
Paper DOI
10.2495/DATA000261
Copyright
WIT Press
Author(s)
A. Kalousis & M. Hilario
Abstract
Incomplete data can cause problems of varying severity in knowledge discovery systems, depending on the quantity and pattern of missing values as well as on the generalization method used. For instance, some methods are inherently resilient to missing values, while others have built-in mechanisms for coping with them. Still others require that no values be missing; for such methods, preliminary imputation of missing values is indispensable. After a quick overview of current practice in the machine learning field, we explore the problem of missing values from a statistical perspective. In particular, we adopt the well-known distinction between three patterns of missing values, namely missing completely at random (MCAR), missing at random (MAR) and not missing at random (NMAR), to focus a comparative study of
Keywords