Semantics-Based Extraction of Relationships (SEMXTOR) in Biomedical Text


Abhishek Sharma

Oral Defence Date: Thursday, May 13, 2010 - 12:15


HH 301


12:15 PM


Profs. Yang, Singh and Petkovic


Advances in biotechnology and biomedical research have produced a large number of research findings, which are published primarily as unstructured text such as journal articles. Text mining techniques have been employed to extract knowledge from such data. In this thesis, we focus on the task of identifying and extracting relationships between bio-entities, such as green tea and cancer. Unlike previous work that relies on heuristics such as co-occurrence patterns, we propose a verb-centric algorithm. The algorithm first identifies and extracts the main verb(s) in a sentence. Using the main verb(s), it then extracts the two entities participating in the relationship. It refines these biomedical entities by applying syntactic and linguistic features such as prepositional phrases. The proposed verb-centric approach can effectively handle complex sentence structures such as clauses and conjunctive sentences. According to the evaluation results, the algorithm outperforms conventional relationship extraction algorithms.
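
The verb-centric idea described in the abstract can be illustrated with a minimal Python sketch. This is not the SEMXTOR implementation: the verb lexicon, the preposition list, the shared-subject rule for conjunctive sentences, and the names RELATION_VERBS, strip_entity and extract_relations are all assumptions made for illustration. A real system would rely on a syntactic parser rather than word lists.

```python
import re

# Hypothetical toy lexicon of relation verbs (assumption; the thesis
# identifies main verbs with linguistic analysis, not a fixed list).
RELATION_VERBS = {"inhibits", "prevents", "causes", "reduces", "activates"}
# Words that start a prepositional phrase to be cut from an entity.
PREPOSITIONS = {"in", "of", "on", "with", "by", "for", "at"}
DETERMINERS = {"the", "a", "an"}
CONJUNCTIONS = {"and", "or", "but"}


def strip_entity(tokens):
    """Drop leading determiners and everything from the first preposition on."""
    kept = []
    for tok in tokens:
        if tok.lower() in PREPOSITIONS:
            break
        kept.append(tok)
    while kept and kept[0].lower() in DETERMINERS:
        kept.pop(0)
    return " ".join(kept)


def extract_relations(sentence):
    """Return (entity1, verb, entity2) triples, one per relation verb found."""
    tokens = re.findall(r"[A-Za-z0-9-]+", sentence)
    verb_positions = [i for i, t in enumerate(tokens)
                      if t.lower() in RELATION_VERBS]
    if not verb_positions:
        return []

    # Simplifying assumption: all verbs share the subject that precedes
    # the first verb, as in "X inhibits A and reduces B".
    subject = strip_entity(tokens[:verb_positions[0]])

    relations = []
    for n, pos in enumerate(verb_positions):
        # The object runs from the verb to the next verb (or sentence end).
        end = verb_positions[n + 1] if n + 1 < len(verb_positions) else len(tokens)
        segment = tokens[pos + 1:end]
        # Remove a conjunction that merely links to the next verb.
        if segment and segment[-1].lower() in CONJUNCTIONS:
            segment = segment[:-1]
        obj = strip_entity(segment)
        if subject and obj:
            relations.append((subject, tokens[pos].lower(), obj))
    return relations


if __name__ == "__main__":
    text = "Green tea inhibits tumor growth and reduces inflammation in mice."
    for triple in extract_relations(text):
        print(triple)
```

In this sketch the verb anchors the relationship, so a conjunctive sentence yields one triple per verb, and the trailing prepositional phrase "in mice" is excluded from the second entity. Co-occurrence heuristics, by contrast, would only report that the entities appear in the same sentence.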

Biomedical literature, relationship extraction, verb-centric methodology, syntactic and linguistic information, participating relational entities.