MS2DB++: Disulfide Bond Determination by Combining Efficient Search and Machine Learning
Oral Defence Date:
Professors Rahul Singh (CS), Hui Yang (CS), & Robert Yen (Chemistry, Biochemistry)
Determining the disulfide (S-S) bond pattern of a protein is often crucial for understanding its structure and function. The problem is hard because the number of possible disulfide bonding patterns grows exponentially with the number of available cysteine residues. This work presents a mass spectrometry-based algorithmic approach, combined with machine learning techniques, that efficiently analyzes multiple ion types and handles complex bonding topologies to determine the disulfide bonding topology of a protein. Experiments on nine eukaryotic glycosyltransferases demonstrate the success of the method, which identifies disulfide bonds efficiently by using approximation algorithms that allow the search-and-match strategies to run in polynomial time. The sensitivity and accuracy of the method are improved by predictive techniques (an SVM classifier and CSP search-and-match) that determine S-S bonds where the tandem mass spectrometry (MS/MS) data provide insufficient resolution. The results from each framework are then combined using different combination rules. MS2DB++ is available at [http://haddock2.sfsu.edu/~ms2db/ms2db++/].
MS2DB++, disulfide bond, S-S connectivity, mass spectrometry, MS2DB+, SVM, CSP
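To illustrate why the search space in the abstract grows so quickly with the number of cysteines, the following minimal sketch counts the candidate S-S bonding patterns by plain combinatorics. It is not part of MS2DB++; the function names `patterns_with_k_bonds` and `total_patterns` are hypothetical, and the sketch assumes every cysteine may pair with any other and that each cysteine takes part in at most one bond.

```python
from math import comb, factorial


def patterns_with_k_bonds(n_cys: int, k: int) -> int:
    """Count the ways to form exactly k disjoint S-S bonds among n_cys cysteines.

    Assumption (illustrative only): any two cysteines may bond, and each
    cysteine participates in at most one bond.
    """
    if k < 0 or 2 * k > n_cys:
        return 0
    # Choose the 2k bonded cysteines, then pair them up.
    # The number of perfect pairings of 2k items is (2k)! / (2^k * k!).
    pairings = factorial(2 * k) // (2 ** k * factorial(k))
    return comb(n_cys, 2 * k) * pairings


def total_patterns(n_cys: int) -> int:
    """Count all bonding patterns, from zero bonds up to n_cys // 2 bonds."""
    return sum(patterns_with_k_bonds(n_cys, k) for k in range(n_cys // 2 + 1))


if __name__ == "__main__":
    for n in (2, 4, 6, 8, 10):
        full = patterns_with_k_bonds(n, n // 2)
        print(f"{n} cysteines: {full} fully bonded patterns, "
              f"{total_patterns(n)} patterns in total")
```

Under these assumptions, a protein with 10 cysteines already admits 945 fully bonded patterns, which is why exhaustive enumeration becomes impractical and an efficient search strategy is needed.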