Data Scale Reduction Via Instances Summarization Using The Rough Set Theory
Price
Free (open access)
Volume
25
Pages
10
Published
2000
Size
1,043 kb
Paper DOI
10.2495/DATA000271
Copyright
WIT Press
Author(s)
G. Gaumer & M. Quafafou
Abstract
Currently, the major obstacle to applying Data Mining algorithms to real-life data is the inability of these algorithms to handle very large data sets such as those stored in industrial databases. Developing new algorithms that require less memory and processing time will certainly help to solve this problem. Here, however, we follow another approach: reducing the size of the input data. We present in this article our new system, CFSumm, which is dedicated to data summarization as a preprocessing step before the use of a Data Mining tool. The basic idea of this method is to summarize several sufficiently similar instances by a weighted pseudo-instance that can replace them in further processing. We explain in this article how the α-Rough Set Theory framework allows a
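
The basic idea stated in the abstract, replacing several sufficiently similar instances with one weighted pseudo-instance, can be illustrated with a minimal sketch. This is not the CFSumm algorithm, whose details are not given here. The similarity measure (fraction of matching attribute values), the `threshold` parameter, the greedy single-pass strategy, and the names `similarity` and `summarize` are all assumptions made for illustration only.

```python
from typing import List, Tuple

Instance = Tuple[str, ...]


def similarity(a: Instance, b: Instance) -> float:
    """Fraction of attribute positions on which two instances agree.

    Hypothetical measure chosen for illustration; it assumes both
    instances have the same number of attributes.
    """
    if not a:
        return 1.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def summarize(instances: List[Instance], threshold: float) -> List[Tuple[Instance, int]]:
    """Greedy one-pass summarization (an assumed strategy, not CFSumm).

    Each instance is absorbed by the first pseudo-instance whose
    representative is at least `threshold` similar to it, which raises
    that pseudo-instance's weight by one. Otherwise the instance starts
    a new pseudo-instance with weight 1.
    """
    summaries: List[list] = []  # each entry is [representative, weight]
    for inst in instances:
        for entry in summaries:
            if similarity(entry[0], inst) >= threshold:
                entry[1] += 1
                break
        else:
            summaries.append([inst, 1])
    return [(rep, weight) for rep, weight in summaries]


data = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("blue", "large", "square"),
    ("red", "small", "round"),
    ("blue", "large", "round"),
]
result = summarize(data, threshold=0.6)
for representative, weight in result:
    print(representative, weight)
```

In this toy run, five instances are reduced to two weighted pseudo-instances, and the weights sum to the original number of instances, so a downstream Data Mining tool that accepts instance weights could still account for every original record.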
Keywords