Improving Remote Homology Detection Using Sequence Properties and Position Specific Scoring Matrices

Document Type

Conference Proceeding

Publication Date




Current biological sequence comparison tools frequently fail to recognize homologs when sequence similarity falls into the twilight zone of less than 25% sequence identity. Combining sequence properties with position specific scoring matrices improves the accuracy of remote homology detection. This paper extends Propsearch, a sequence-property-based approach to sequence searching, by incorporating a population adaptive genetic algorithm that uses position specific scoring matrices in feature calculation. The genetic algorithm is trained to obtain optimized feature weights, which are then used to find homologs of a query sequence. The remote homology detector is tested on databases with less than 10%, 20%, and 30% sequence similarity, and is compared with other sequence similarity programs in both accuracy and time complexity. Future considerations for position specific scoring matrices based on the original genetic algorithm are also proposed.
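The general idea of searching by weighted sequence properties can be illustrated with a minimal Python sketch. It is not the paper's implementation: the only feature used here is amino acid composition, the function names (composition_features, weighted_distance, rank_candidates) are hypothetical, and the weights are placeholders standing in for values that a trained genetic algorithm would supply. Features derived from position specific scoring matrices and the genetic algorithm itself are not shown.

```python
import math
from collections import Counter

# The 20 standard amino acids, one feature per residue type.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(seq):
    """Return the fraction of each standard amino acid in seq (assumed feature set)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def weighted_distance(f1, f2, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, f1, f2)))


def rank_candidates(query, database, weights):
    """Return database entry names ordered from nearest to farthest from the query."""
    q = composition_features(query)
    scored = [
        (weighted_distance(q, composition_features(seq), weights), name)
        for name, seq in database.items()
    ]
    return [name for _, name in sorted(scored)]


# Placeholder weights; in a trained system these would come from optimization.
uniform_weights = [1.0] * len(AMINO_ACIDS)
toy_database = {"similar": "ACDEFGACDEFG", "different": "WWWWYYYY"}
ranking = rank_candidates("ACDEFG", toy_database, uniform_weights)
```

Because the comparison works on fixed-length feature vectors rather than alignments, each database entry costs a constant amount of work once its features are computed, which is why property-based searches are often discussed in terms of time complexity as well as accuracy.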


Presented at the 2009 International Conference on Bioinformatics and Computational Biology, Las Vegas, NV, July 13-16, 2009.
