Mining Regular Spatio-Temporal Patterns in Multiple Protein Folding Trajectories


Saurabh Gupte

Oral Defence Date: 



TH 409


Professors Yang, Petkovic and Singh


The mechanism by which a protein folds from a random coil into its characteristic 3D structure is known as the protein folding process. Understanding this process is a grand challenge in molecular and structural biology. Most existing methods rely on scalar measurements such as the radius of gyration and fail to account for substructural details. Furthermore, methods based on secondary structure cannot characterize the folding properties of the molecule. To overcome these limitations, this thesis presents a novel approach to studying the folding process that characterizes the global structure of a protein molecule in terms of the local substructures identified in it. These local substructures, termed Folding-Aware Structure-Conscious (FASC) Substructures, are derived by enriching secondary structures with the geometric and folding properties of the molecule. Applications of FASC Substructures include identifying structurally similar conformations and identifying recurring subsequences of FASC Substructures across multiple trajectories.


protein folding data; protein structure analysis; 3D substructure identification; common folding subsequences analysis

