
Geometric-invariant-based detection of recurring functional sites in proteins and classification of proteins based on functional sites

Protein Functional Sites Database
Download
Description
The package contains the following components:
  1. PDB Parser and potential functional site generator parses a PDB file and extracts C-alpha, C-beta and pseudo functional atom coordinates. It also generates potential functional site patterns, which are analyzed further to extract recurring functional sites in proteins. Each potential functional site contains three, four, five or six amino acids that are close in space and can potentially interact.
  2. Geometric Invariant Calculator calculates geometric invariants for the potential functional site patterns in proteins. The number of geometric invariants differs with the size of the functional site. We model a functional site made up of three amino acids as a triangle, one made up of four as a tetrahedron, and one made up of five or six as a combination of triangles and tetrahedra. Functional sites are thus mapped into geometric invariant space.
  3. Label Separator pools the potential functional sites from all the input proteins and separates them into groups based on their amino acid composition. For example, all potential functional sites composed of the amino acids D, H and S, from all the proteins, are combined in the DHS group, and so on.
  4. Clustering works on one pattern type at a time (for example, DHS) and groups the potential functional sites of that type into clusters based on their geometry, as encoded by the geometric invariants. Each cluster represents a structural pattern that is geometrically similar across, and recurs in, a number of proteins. We keep only clusters containing more than 5 patterns. Such a cluster represents a recurring structural pattern, which signifies a true functional site in the set of proteins. The process is repeated for every pattern type.
  5. Postprocessing scripts assign clusters to the corresponding SCOP superfamilies and classify the set of proteins based on their functional sites.
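
The first four steps can be illustrated with a minimal Python sketch for three-residue sites. This is not the package's code: every function name is hypothetical, only C-alpha atoms are used, the 8.0 Angstrom contact cutoff and the 0.5 clustering tolerance are assumed values, the sorted triangle side lengths stand in for the geometric invariants, and a simple greedy grouping stands in for whatever clustering method the package actually uses. Only the "more than 5 patterns" threshold comes from the description above.

```python
from collections import defaultdict
from itertools import combinations
from math import dist

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def parse_ca_atoms(pdb_lines):
    """Step 1a: return [(one_letter_code, (x, y, z)), ...] for C-alpha atoms."""
    residues = []
    for line in pdb_lines:
        # Fixed-column PDB format: atom name in columns 13-16,
        # residue name in 18-20, coordinates in 31-54 (1-based).
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        code = THREE_TO_ONE.get(line[17:20].strip())
        if code is None:
            continue  # skip non-standard residues
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        residues.append((code, xyz))
    return residues


def candidate_triplets(residues, cutoff=8.0):
    """Steps 1b and 2: yield (label, invariants) for spatially close triplets.

    Assumption: a triplet qualifies when all three pairwise C-alpha
    distances are within `cutoff`; the invariants are the sorted side
    lengths of the triangle, which do not change under rotation,
    translation, or reordering of the residues.
    """
    for a, b, c in combinations(residues, 3):
        sides = (dist(a[1], b[1]), dist(a[1], c[1]), dist(b[1], c[1]))
        if max(sides) <= cutoff:
            label = "".join(sorted(a[0] + b[0] + c[0]))  # e.g. "DHS"
            yield label, tuple(sorted(sides))


def separate_by_label(proteins):
    """Step 3: pool patterns from all proteins and group them by label."""
    groups = defaultdict(list)
    for protein_id, residues in proteins.items():
        for label, invariants in candidate_triplets(residues):
            groups[label].append((protein_id, invariants))
    return groups


def greedy_clusters(patterns, tol=0.5):
    """Step 4 (simplified): join each pattern to the first cluster whose
    seed lies within `tol` in invariant space, else start a new cluster."""
    clusters = []
    for protein_id, invariants in patterns:
        for cluster in clusters:
            if dist(invariants, cluster[0][1]) <= tol:
                cluster.append((protein_id, invariants))
                break
        else:
            clusters.append([(protein_id, invariants)])
    return clusters


def recurring_clusters(clusters, min_patterns=5):
    """Keep only clusters containing more than `min_patterns` patterns."""
    return [c for c in clusters if len(c) > min_patterns]
```

Given a dictionary that maps protein identifiers to the output of parse_ca_atoms, separate_by_label(proteins)["DHS"] returns all pooled DHS patterns, which can then be passed to greedy_clusters and recurring_clusters. Enumerating every triplet is cubic in the number of residues, so a practical implementation would use a spatial index to find neighbours first.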
Citation Information
Ashish V. Tendulkar, Pramod P. Wangikar, Milind A. Sohoni, Vivekanand V. Samant and Chetan Y. Mone, Parameterization and Classification of the Protein Universe via Geometric Techniques, Journal of Molecular Biology, Volume 334, Issue 1, 14 November 2003, Pages 157-172. [PubMed abstract].
Supplementary Material
The supplementary material is available with the paper on ScienceDirect.
Technical Documents, Posters, Talks
  1. [Presentation] Iriss 2004 at IIT Bombay, India representing KReSIT: Application of Geometric Invariant Theory in Protein Substructure Clustering
  2. [Poster Paper] Geometric Invariant Theory applied to Protein Structure Classification, Pramod Wangikar, Ashish V Tendulkar, Milind Sohoni. Poster Paper at ECCB 2003, Paris, France, Sept. 27-30, 2003.