WIT Press


Supervised Knowledge Discovery From Incomplete Data

Price

Free (open access)

Volume

25

Pages

10

Published

2000

Size

1,618 kb

Paper DOI

10.2495/DATA000261

Copyright

WIT Press

Author(s)

A. Kalousis & M. Hilario

Abstract

Incomplete data can raise more or less serious problems in knowledge discovery systems, depending on the quantity and pattern of missing values as well as the generalization method used. For instance, some methods are inherently resilient to missing values, while others have built-in mechanisms for coping with them. Still others require that no values be missing; for such methods, preliminary imputation of missing values is indispensable. After a quick overview of current practice in the machine learning field, we explore the problem of missing values from a statistical perspective. In particular, we adopt the well-known distinction between three patterns of missing values, namely missing completely at random (MCAR), missing at random (MAR) and not missing at random (NMAR), to focus a comparative study of

Keywords