Sequencing of the human genome was a great stride towards modeling cellular complexes, massive systems whose key players are proteins and DNA. A major bottleneck limiting the modeling process is structure and function annotation for the new genes. Contemporary protein structure prediction algorithms represent the sequence of every protein of known structure with a profile; the profile of a protein sequence of unknown structure is then compared against these profiles for recognition. We propose a novel approach to increase the scope and resolution of protein structure profiles. Our technique locates equivalent regions among the members of a structurally similar fold family and clusters these regions by structural similarity. Equivalent substructures can then be swapped on the common regions to generate an array of profiles that represent hypothetical structures, supplementing the profiles of known structures. Strategies for a specific implementation of this approach are discussed, including application to multiple-template comparative modeling.
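The swapping step described above can be illustrated with a minimal sketch. It assumes the equivalent regions have already been located and clustered, and that each cluster is represented by one fragment label; the function name `combine_fragments` and all fragment labels are hypothetical and are not taken from the paper. The sketch only enumerates the combinations and does not build profiles or score structures.

```python
from itertools import product


def combine_fragments(region_clusters):
    """Return every combination with one fragment chosen per region.

    region_clusters: a list with one entry per equivalent region; each
    entry lists the cluster-representative fragments observed for that
    region (hypothetical labels, assumed to be precomputed).
    """
    return list(product(*region_clusters))


# Toy fold family with three equivalent regions (labels are made up).
region_clusters = [
    ["helixA_1", "helixA_2"],
    ["loopB_1", "loopB_2", "loopB_3"],
    ["strandC_1"],
]

hypothetical = combine_fragments(region_clusters)
print(len(hypothetical))  # 2 * 3 * 1 = 6 combinations
```

The number of combinations is the product of the cluster counts per region, so it includes the known structures alongside the hypothetical ones and grows quickly as regions or clusters are added.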
Doom, T. E., & Raymer, M. L. (2001). Profile Combinatorics for Fragment Selection in Comparative Protein Structure Modeling. Proceedings of the IEEE 2nd International Symposium on Bioinformatics and Bioengineering, 271-278.