Expanding The Definition Of Multivariate Correlation
Price
Free (open access)
Transaction
Volume
46
Pages
8
Published
2007
Size
249 KB
Paper DOI
10.2495/CMEM070191
Copyright
WIT Press
Author(s)
W. Conley
Abstract
The complexities of large-scale data analysis in the computer age invite the development of new, sophisticated statistics to help with the task. One entry into this arena is the CTSP multivariate correlation statistic. A five-variable, 49-line spreadsheet of data was analyzed using CTSP by Conley, and the relationship was found to be linear. Presented here is a much larger example involving nine variables and 89 lines of data, where CTSP reveals a correlation that is nonlinear (exponential in this case). Specifically, nine columns (representing nine variables) of 89 lines of data are analyzed to see whether the variables they represent are correlated in some fashion (linear or nonlinear). The data are read into the CTSP statistical analysis simulation program, which is adjusted for nine variables and 89 lines of data. Then, using a generalization of the Pythagorean distance measure to nine dimensions, a shortest route connecting the 89 points in a closed-loop tour is calculated. Next, several random data sets of size 9 x 89 (with ranges similar to those of the original data) are generated, and the shortest routes are calculated for them. In this case, the actual data had a much shorter shortest route than the random data sets did. Statistically speaking, it can therefore be argued that the actual data are correlated in some fashion: because they follow a pattern, the points are more compact (closer together in nine-dimensional space), which leads to a shorter shortest route. This expanded view of correlation (linear or nonlinear) can complement the standard linear analysis Anderson currently uses. Additionally, a second example involving eight variables is presented for comparison purposes.
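The procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the CTSP program searches for the shortest route, whether the variables are rescaled, or how the random ranges are drawn, so everything here is an assumption: the tour is approximated with a nearest-neighbour start followed by 2-opt improvement (a heuristic, not an exact shortest route), distances are plain Euclidean on unscaled data, and random columns are drawn uniformly between each column's observed minimum and maximum. All function names and the synthetic data are hypothetical.

```python
import math
import random


def tour_length(points, order):
    """Length of the closed loop that visits the points in the given order."""
    n = len(order)
    return sum(math.dist(points[order[i]], points[order[(i + 1) % n]])
               for i in range(n))


def heuristic_tour(points):
    """Nearest-neighbour tour improved by 2-opt (assumed stand-in for CTSP's search)."""
    n = len(points)
    unvisited = set(range(1, n))
    order = [0]
    while unvisited:
        last = points[order[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
        unvisited.remove(nxt)
        order.append(nxt)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a point; nothing to swap
                a, b = points[order[i]], points[order[i + 1]]
                c, d = points[order[j]], points[order[(j + 1) % n]]
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    # Reverse the segment between the two edges (2-opt move).
                    order[i + 1:j + 1] = order[i + 1:j + 1][::-1]
                    improved = True
    return order


def random_like(points, rng):
    """Random data set of the same shape, uniform within each column's observed range."""
    cols = list(zip(*points))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [tuple(rng.uniform(l, h) for l, h in zip(lo, hi)) for _ in points]


def shortest_route_test(points, n_random=5, seed=0):
    """Return the tour length of the actual data and of n_random random data sets."""
    rng = random.Random(seed)
    actual = tour_length(points, heuristic_tour(points))
    randoms = []
    for _ in range(n_random):
        sample = random_like(points, rng)
        randoms.append(tour_length(sample, heuristic_tour(sample)))
    return actual, randoms


def demo_data(n_rows=89, n_vars=9, seed=42):
    """Synthetic (not the paper's) data: every variable is an exponential of one driver t."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        t = rng.random()
        rows.append(tuple(math.exp(0.3 * (k + 1) * t) + rng.gauss(0, 0.02)
                          for k in range(n_vars)))
    return rows


if __name__ == "__main__":
    actual, randoms = shortest_route_test(demo_data())
    print("actual data tour length:", round(actual, 2))
    print("random data tour lengths:", [round(r, 2) for r in randoms])
```

Under these assumptions, a tour through the actual data that is much shorter than every random tour is the signal the abstract describes. Because the heuristic only approximates the shortest route, the comparison is indicative rather than exact.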
Keywords
multivariate correlation, CTSP statistic, shortest route statistical test, linear and nonlinear analysis.