Semantics-Based Extraction of Relationships (SEMXTOR) in Biomedical Text
Abhishek Sharma
Oral Defence Date: Thursday, May 13, 2010 - 12:15
Location:
Profs. Yang, Singh and Petkovic
Advances in biotechnology and biomedical research have produced a large number of research findings, most of which are published as unstructured text such as journal articles. Text mining techniques have been employed to extract knowledge from such data. In this thesis, we focus on the task of identifying and extracting relationships between bio-entities, such as the relationship between green tea and cancer. Unlike previous work that relies on heuristics such as co-occurrence patterns, we propose a verb-centric algorithm. The algorithm first identifies the main verb(s) in a sentence and then uses them to extract the two entities involved in the relationship. It then processes the biomedical entities using syntactic and linguistic features such as prepositional phrases. The proposed verb-centric approach can effectively handle complex sentence structures such as clauses and conjunctive sentences. According to the evaluation results, the algorithm outperforms conventional relationship extraction algorithms.
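The verb-centric idea can be illustrated with a toy sketch. This is not the SEMXTOR implementation: it assumes a small hand-picked verb list (RELATION_VERBS) and preposition list (PREPOSITIONS) in place of a real part-of-speech tagger or parser, and the function names are hypothetical. It only shows the general flow of locating a verb, taking the text on either side as the two entities, and trimming a trailing prepositional phrase.

```python
import re

# Hypothetical, hand-picked relation verbs. A real system would locate the
# main verb with a part-of-speech tagger or parser instead of a fixed list.
RELATION_VERBS = {"prevents", "inhibits", "causes", "reduces", "induces"}

# Assumed preposition list, used to cut a trailing prepositional phrase
# off the entity that follows the verb.
PREPOSITIONS = {"in", "of", "with", "by", "for", "among"}


def trim_prepositional_phrase(tokens):
    """Return the tokens that come before the first preposition, if any."""
    for i, tok in enumerate(tokens):
        if tok.lower() in PREPOSITIONS:
            return tokens[:i]
    return tokens


def extract_relation(sentence):
    """Return (entity1, verb, entity2) for the first known verb, else None."""
    tokens = re.findall(r"[A-Za-z0-9-]+", sentence)
    for i, tok in enumerate(tokens):
        if tok.lower() in RELATION_VERBS:
            left = tokens[:i]
            right = trim_prepositional_phrase(tokens[i + 1:])
            if left and right:
                return (" ".join(left), tok.lower(), " ".join(right))
    return None


if __name__ == "__main__":
    print(extract_relation("Green tea prevents cancer in mice."))
```

A sentence with no verb from the list yields None. Clauses and conjunctive sentences, which the thesis targets, are beyond this sketch and would need real syntactic analysis.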