Novel Pruning Based Hierarchical Agglomerative Clustering For Mining Outliers In Financial Time Series
Price
Free (open access)
Volume
41
Pages
10
Page Range
33 - 42
Published
2008
Size
468 kb
Paper DOI
10.2495/CF080041
Copyright
WIT Press
Author(s)
D. Wang, P. J. Fortier & H. E. Michel
Abstract
Investors must make informed decisions using partial and imperfect information. As the accuracy and completeness of the information held by an investor rise, so does the probability of making better decisions. Outlier detection based on similarity search in financial time series is key to making better decisions in many investment strategies and portfolio management techniques. This motivates the use of numerous data mining techniques to discover similarities in massive pools of financial time series data. This research introduces a novel pruning-based Hierarchical Agglomerative Clustering (HAC) algorithm that searches for similarity among financial time series in high-dimensional space, using securities in the S&P 500 index as experimental data. The algorithm is based on vertical and horizontal dimension reduction algorithms [11] and a unique similarity measurement definition [12] with the time value concept. This paper presents a series of experimental results that illustrate the effectiveness of the algorithm.
1 Introduction
We propose a novel similarity search in high-dimensional financial time series using a pruning-based HAC algorithm. The similarity search is performed after dimensionality reduction, which consists of an Attributes Selection (AS) algorithm [11] and a Piecewise Linear Representation (PLR) based Segmentation
Keywords
outlier, data mining, computational finance, financial time series, similarity search, high dimension, clustering.