Pseudo amino acid composition and multi-class support vector machines approach for conotoxin superfamily classification

Mondal, Sukanta and Bhavna, Rajasekaran and Babu, Rajasekaran M (2006) Pseudo amino acid composition and multi-class support vector machines approach for conotoxin superfamily classification. In: Journal of Theoretical Biology, 243 (2). pp. 252-260.

Abstract

Conotoxins are small, disulfide-rich peptides that target a broad spectrum of ion channels and neuronal receptors. They offer promising avenues in the treatment of chronic pain, epilepsy and cardiovascular diseases. Assigning newly sequenced mature conotoxins to the appropriate superfamilies by a computational approach could provide valuable preliminary information on the biological and pharmacological functions of the toxins. However, protein sequence patterns may not be effective for the reliable identification and classification of new conotoxin sequences because of the hypervariability of mature toxins. With the aim of formulating an in silico approach for the classification of conotoxins into superfamilies, we have incorporated the concept of pseudo-amino acid composition to represent a peptide in a mathematical framework that includes the sequence-order effect along with the conventional amino acid composition. The polarity index attribute, which encodes information such as residue surface buriability, polarity, and hydropathy, was used to store the sequence-order effect. Several methods, including BLAST, the ISort (Intimate Sorting) predictor, the least Hamming distance algorithm, the least Euclidean distance algorithm, and multi-class support vector machines (SVMs), were explored for superfamily identification. The SVMs outperformed the other methods, providing an overall accuracy of 88.1% for all correct predictions with a generalized squared correlation of 0.75 in the jackknife cross-validation test for the A, M, O and T superfamilies and a negative set consisting of short cysteine-rich sequences from different eukaryotes having diverse functions. The computed sensitivity and specificity for the superfamilies were in the ranges 84.0–94.1% and 80.0–95.5%, respectively, attesting to the efficacy of multi-class SVMs for the in silico classification of conotoxins into their superfamilies.
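
The pseudo-amino acid composition representation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the commonly used formulation in which a peptide becomes a vector of 20 composition terms plus `lam` sequence-order correlation terms, and it substitutes the Kyte-Doolittle hydropathy scale for the paper's polarity index, whose values are not given in the abstract. The function names, `lam`, `weight`, and the nearest-neighbour helper (a simple reading of "least Euclidean distance") are all illustrative assumptions.

```python
import math

# Kyte-Doolittle hydropathy values, used ONLY as a stand-in property scale.
# The paper's polarity index values are not given in the abstract.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(HYDROPATHY)


def standardize(scale):
    """Rescale a property table to zero mean and unit variance."""
    mean = sum(scale.values()) / len(scale)
    sd = math.sqrt(sum((v - mean) ** 2 for v in scale.values()) / len(scale))
    return {aa: (v - mean) / sd for aa, v in scale.items()}


H = standardize(HYDROPATHY)


def pseudo_aac(seq, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional pseudo-amino acid composition vector.

    lam and weight are illustrative defaults, not values from the paper.
    """
    seq = seq.upper()
    if any(aa not in H for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    # Conventional amino acid composition (fractions summing to 1).
    freqs = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]
    # Tier-k sequence-order terms: mean squared property difference
    # between residues that are k positions apart.
    thetas = []
    for k in range(1, lam + 1):
        diffs = [(H[seq[i]] - H[seq[i + k]]) ** 2 for i in range(len(seq) - k)]
        thetas.append(sum(diffs) / len(diffs))
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


def nearest_label(query, labelled):
    """Return the label of the labelled vector closest to query (Euclidean)."""
    return min(labelled, key=lambda item: math.dist(query, item[0]))[1]


if __name__ == "__main__":
    # Made-up cysteine-rich strings, not real conotoxin sequences.
    training = [
        (pseudo_aac("GCCSNPACRLNC"), "class-1"),
        (pseudo_aac("CKGKGASCRRTSYDCC"), "class-2"),
    ]
    print(nearest_label(pseudo_aac("GCCSHPACSVNC"), training))
```

In this formulation the components of each vector sum to 1, so `weight` controls how much of the vector is given to sequence-order information relative to plain composition. The same vectors could be passed to a multi-class SVM in place of the nearest-neighbour helper.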

Item Type: Journal Article
Additional Information: Copyright of this article belongs to Elsevier.
Keywords: Hypermutable mature conotoxin; Superfamily classification; Pseudo-amino acid composition; Polarity index; Support vector machines (SVMs)
Department/Centre: Division of Information Sciences > BioInformatics Centre
Division of Physical & Mathematical Sciences > Physics
Date Deposited: 13 May 2008
Last Modified: 19 Sep 2010 04:44
URI: http://eprints.iisc.ernet.in/id/eprint/13947
