CloNI: Clustering of Square Root of N-Interval Discretization
Price
Free (open access)
Volume
29
Pages
10
Published
2003
Size
424 kb
Paper DOI
10.2495/DATA030221
Copyright
WIT Press
Author(s)
C. Ratanamahatana
Abstract
C. Ratanamahatana, Department of Computer Science, University of California, Riverside, USA
It is known that the naive Bayesian classifier typically works well on discrete data. All continuous attributes therefore need to be discretized beforehand for such applications. An inappropriate range of discretization intervals may degrade performance. In this paper, we review previous work on continuous feature discretization and conduct an empirical evaluation of an improved method called Clustering of √N-Interval Discretization (CloNI). CloNI tries to reduce the number of √N intervals in the datasets by iteratively combining two consecutive intervals, according to their median distance, until a stopping criterion is met. We also show that even though C4.5 decision trees can handle continuous features, we can significantly improve their performance in some domains if those features are discretized in advance. In our empirical
Keywords