A System for Extracting Study Design Parameters from Nutritional Genomics Abstracts


Cassidy Kelly

Oral Defence Date: 



TH 434


Professors Hui Yang, Dragutin Petkovic, and Robert Wall


The extraction of study design parameters from biomedical journal articles is an important problem in natural language processing (NLP). Such parameters define the characteristics of a study, such as its duration, the number of subjects, and the subjects' profile. In this thesis, we present a system for extracting study design parameters from sentences in article abstracts. The system is intended as a component of a larger system that creates nutrigenomics networks from articles in the nutritional genomics domain. The algorithms presented consist of manually designed rules, expressed either as regular expressions or in terms of sentence parse structure. A number of filters and NLP tools are also used within a pipelined algorithmic framework. Using this novel approach, our system performs extraction at a finer level of granularity than comparable systems, while generating results that surpass the current state of the art.
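
To illustrate what a rule expressed as a regular expression might look like, the following is a minimal sketch in Python. The patterns, the function name `extract_parameters`, and the example sentence are all hypothetical and chosen for illustration only; they are not the rules developed in this thesis, and they omit the filters, parse-structure rules, and pipeline stages described above.

```python
import re

# Hypothetical pattern (not from the thesis): a number followed by a time unit,
# e.g. "12 weeks" or "12-week".
DURATION_RE = re.compile(
    r"\b(\d+(?:\.\d+)?)[\s-]*(day|week|month|year)s?\b", re.IGNORECASE
)

# Hypothetical pattern (not from the thesis): a count, up to three optional
# modifier words, then a noun that denotes study subjects.
SUBJECTS_RE = re.compile(
    r"\b(\d+)\s+(?:\w+\s+){0,3}?"
    r"(subjects|participants|patients|volunteers|men|women)\b",
    re.IGNORECASE,
)


def extract_parameters(sentence):
    """Return a dict of the study design parameters found in one sentence."""
    params = {}
    match = DURATION_RE.search(sentence)
    if match:
        # Store the duration as (value, unit), e.g. (12.0, "week").
        params["duration"] = (float(match.group(1)), match.group(2).lower())
    match = SUBJECTS_RE.search(sentence)
    if match:
        # Store the subject count as (count, noun), e.g. (48, "men").
        params["subjects"] = (int(match.group(1)), match.group(2).lower())
    return params


if __name__ == "__main__":
    example = "A total of 48 healthy overweight men consumed the diet for 12 weeks."
    print(extract_parameters(example))
```

A single pair of patterns like this would miss spelled-out numbers ("forty subjects") and would produce false positives on unrelated quantities, which is one reason a rule-based system would typically combine many such rules with filtering steps.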


Study design parameter extraction, study design parameters, extraction, extraction framework, information extraction, named entity recognition, natural language processing, granularity, nutritional genomics, nutrigenomics networks


Cassidy Kelly