Improving Remote Homology Detection Using Sequence Properties and Position Specific Scoring Matrices
Current biological sequence comparison tools frequently fail to recognize matches between homologs when sequence similarity falls within the twilight zone of less than 25% sequence identity. Combining sequence properties with position specific scoring matrices improves the accuracy of remote homology detection. This paper extends the work of Propsearch, a sequence-property-based approach to sequence searching, by incorporating a population adaptive genetic algorithm that makes use of position specific scoring matrices in feature calculation. The genetic algorithm is trained to obtain optimized feature weights, which are then used to find homologs of a query sequence. Databases with less than 10%, 20%, and 30% sequence similarity are used to test the remote homology detector. The optimized remote homology detector is compared with other sequence similarity programs in both accuracy and time complexity. Future considerations for position specific scoring matrices based on the original genetic algorithm are also proposed.
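The general idea of ranking database sequences by a weighted distance between sequence-property vectors can be illustrated with a minimal sketch. This is not the paper's implementation: the feature set here is plain amino acid composition (an assumption chosen for brevity, with no position specific scoring matrix involved), the uniform weights stand in for weights that a genetic algorithm would optimize, and the names `composition_features`, `weighted_distance`, and `rank_candidates` are hypothetical.

```python
from collections import Counter
import math

# The 20 standard amino acids, in a fixed order so feature vectors align.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(seq):
    """Return the fraction of each standard amino acid in seq.

    Assumption: composition is only a stand-in for the richer
    sequence properties a real detector would compute.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def weighted_distance(f1, f2, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, f1, f2)))


def rank_candidates(query, database, weights):
    """Return database sequence names ordered from nearest to farthest."""
    query_features = composition_features(query)
    scored = [
        (weighted_distance(query_features, composition_features(seq), weights), name)
        for name, seq in database.items()
    ]
    return [name for _, name in sorted(scored)]


# Hypothetical usage: uniform weights stand in for optimized ones.
uniform_weights = [1.0] * len(AMINO_ACIDS)
toy_database = {"same": "ACDE", "diff": "WWWW"}
ranking = rank_candidates("ACDE", toy_database, uniform_weights)
```

In a setup like the one the abstract describes, the weight vector would be the quantity tuned during training, so that features which separate true homologs from non-homologs contribute more to the distance.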
& Raymer, M. L.
(2009). Improving Remote Homology Detection Using Sequence Properties and Position Specific Scoring Matrices. Proceedings of the International Conference on Bioinformatics & Computational Biology, 103-108.