Gowri, VS and Tina, KG and Krishnadev, O and Srinivasan, N (2007) Strategies for the Effective Identification of Remotely Related Sequences in Multiple PSSM Search Approach. In: Proteins: Structure, Function, and Bioinformatics, 67 (4). pp. 789-794.
Searches using position specific scoring matrices (PSSMs) are commonly used in remote homology detection procedures such as PSI-BLAST and RPS-BLAST. A PSSM is typically generated using one of the sequences of a family as the reference sequence. In PSI-BLAST searches, the reference sequence is the same as the query. We have recently shown that searches against a database of multiple family-profiles, in which each member of a family is used in turn as a reference sequence, are more effective than searches against the classical database of single family-profiles. Although the overall performance is better than that of common sequence-profile matching procedures, searches against the multiple family-profiles database still produce a few false positives and false negatives. Here we show that profile length and the divergence of the sequences used to construct a PSSM have a major influence on the performance of the multiple-profile search approach. We also find that a simple parameter can effectively distinguish true positives from false positives in this approach: the number of PSSMs of a family that are hit by a query, divided by the total number of PSSMs in that family.
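The hit-fraction parameter described in the abstract can be illustrated with a minimal Python sketch. The function name `family_hit_fractions`, the input layout (pairs of family and PSSM identifiers), and the example data are assumptions made for illustration only; they are not taken from the article, and no cutoff value is implied.

```python
from collections import defaultdict


def family_hit_fractions(hits, family_sizes):
    """Return, per family, (PSSMs hit by the query) / (total PSSMs in the family).

    hits: iterable of (family_id, pssm_id) pairs reported for one query
          (hypothetical layout; a real search tool's output would need parsing).
    family_sizes: dict mapping family_id to the total number of PSSMs
                  built for that family.
    """
    hit_pssms = defaultdict(set)
    for family_id, pssm_id in hits:
        # A set ensures that a PSSM hit more than once is counted only once.
        hit_pssms[family_id].add(pssm_id)
    return {
        family_id: len(pssms) / family_sizes[family_id]
        for family_id, pssms in hit_pssms.items()
    }


# Made-up example: the query hits 3 of 4 PSSMs in famA and 1 of 10 in famB.
hits = [("famA", "A1"), ("famA", "A2"), ("famA", "A3"),
        ("famB", "B1"), ("famA", "A1")]
family_sizes = {"famA": 4, "famB": 10}
fractions = family_hit_fractions(hits, family_sizes)
print(fractions)
```

In this made-up case famA has a much higher fraction than famB, which is the kind of contrast the abstract suggests separates true positives from false positives; where to place a threshold is a matter for the article itself.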
|Item Type:||Journal Article|
|Additional Information:||Copyright of this article belongs to Wiley.|
|Keywords:||Position specific scoring matrix;Protein profiles;Protein families;Remote homology detection;Sequence analysis|
|Department/Centre:||Division of Biological Sciences > Molecular Biophysics Unit|
|Date Deposited:||27 Aug 2007|
|Last Modified:||19 Sep 2010 04:39|