Classification Algorithms And Analyzing The Functionality Of Protein Families
Price
Free (open access)
Volume
35
Pages
13
Published
2005
Size
376 kb
Paper DOI
10.2495/DATA050431
Copyright
WIT Press
Author(s)
L. Gao & D. K. Y. Chiu
Abstract
The rapid growth of bio-sequence data has increased the demand for reliable algorithms that group proteins in a meaningful way. Many traditional classification and clustering algorithms have been adapted or directly applied to protein sequences or structures. To capture protein functionality, new algorithms have recently been proposed that specifically aim to incorporate protein function. In this paper, we review some of the classification and clustering algorithms for proteins. We divide the algorithms into four categories based on their use of dissimilarity measures, density characteristics, computational modeling, and information of evidence. The algorithm based on information of evidence analyzes biomolecular sequences as discrete-valued n-tuples, such that discrete values rather than their variables are selected as evidence for the final groupings. The advantage of this approach is that the configuration of the final clusters does not depend on a reliable distance measure, a predefined computational model of the clusters, the reliability of an adaptive learning method, or a measure of the density function. Finally, the methods are reviewed with respect to how well they reflect the functionality of the protein family.
Keywords: protein families, protein functionality, classification algorithm, clustering algorithm.
1 Introduction
Proteins are building blocks of organisms and fundamental substances of life that play an important role in executing and regulating many biological processes.