Determining Molecular Similarity and its Applications

Wednesday, March 17, 2004 - 17:30
TH 331
Rahul Singh, Georgia Institute of Technology
Determining similarity among molecules is a fundamental challenge in contemporary biology and drug discovery. Since similar molecules tend to have similar biological properties, the notion of molecular similarity plays an important role in contexts such as query-retrieval in molecular databases, exploration of molecular structural space, and development of structure-activity models for molecules of biochemical interest. The problem of determining molecular similarity is strongly tied to the problem of molecular representation. Representations with high descriptive power, such as 3D surface-based representations, are now available. However, most approaches have focused on 2D graph-based molecular similarity because of the complexity of reasoning with more elaborate representations. In this talk, I will present research on ascertaining the similarity of molecules when they are defined using complex surface-based representations. The essence of this research lies in determining an intrinsic, spherical representation of a molecule that maps points on the molecular surface to points on a standard coordinate system (a sphere). Properties like molecular geometry, molecular fields, and effects due to the superposition of fields can then be captured as distributions on the surface of a sphere that encapsulates a molecule. Similarity between molecules is determined by computing the similarity of the corresponding property distributions using a novel, topologically constrained formulation of histogram intersection. This formulation obviates explicit pose optimization of the molecules and allows molecular similarity to be determined rapidly. The efficacy of this research is demonstrated in terms of recognition performance, application in building structure-property models for complex biological properties, and comparisons with existing research and commercial approaches.
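
For readers unfamiliar with histogram intersection, the following is a minimal sketch of the classic, unconstrained form of the measure. It is not the topologically constrained formulation presented in the talk, which the abstract does not specify. The function name and the assumption that each bin holds a property value accumulated over one patch of the sphere are hypothetical and used only for illustration.

```python
def histogram_intersection(query, model):
    """Classic histogram intersection, normalized by the model histogram.

    Assumption (not from the abstract): each bin holds a non-negative
    property value accumulated over one patch of the enclosing sphere,
    and both histograms use the same binning.
    """
    if len(query) != len(model):
        raise ValueError("histograms must have the same number of bins")
    total = sum(model)
    if total == 0:
        raise ValueError("model histogram must have non-zero mass")
    # Sum the overlap in each bin, then scale into the range [0, 1].
    overlap = sum(min(q, m) for q, m in zip(query, model))
    return overlap / total


if __name__ == "__main__":
    # Two hypothetical 4-bin property distributions.
    a = [2.0, 2.0, 0.0, 1.0]
    b = [1.0, 3.0, 1.0, 0.0]
    print(histogram_intersection(a, b))
```

A score of 1 means the query covers the model distribution in every bin, and a score of 0 means the two distributions share no mass in any bin.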

Rahul Singh is currently a faculty member in the Department of Electrical and Computer Engineering at the Georgia Institute of Technology. His technical interests are in bioinformatics, computational drug discovery, multimedia information modeling and management, and computer vision and its applications. Prior to joining Georgia Tech, he worked in industry in the San Francisco Bay Area: at Scimagix, where he worked on various problems related to multimedia biological information management, and at Exelixis, where he headed the computational drug discovery group, which worked on various problems across the genomics-drug discovery spectrum. Dr. Singh received his MSE degree in Computer Science with "excellence" from the Moscow Power Engineering Institute and his MS and PhD degrees in Computer Science from the University of Minnesota.