CE-10.16

Title: 

EasyQuery: A Flexible Data Querying Tool for Complex Databases

Author(s): 

Amruta Kanetkar

Oral Defence Date: 

11/17/2010

Location: 

SCI 241

Committee: 

Professors Murphy, Petkovic & Smith (Biology)

Abstract: 

In today's computer age, scientists and professionals use SQL databases to store large amounts of complex information. Managing and exploring these databases often requires technical and domain-specific expertise. Accessing this information can be made somewhat easier with Web forms, which range from simple forms such as the Google Search page to complex forms with numerous fields. These interfaces become increasingly complex for large biological databases with complex schemas. This project attempts to address this problem by providing the user with a user-friendly, natural-language-based interface, called EasyQuery, for exploring such complex databases. While existing systems that attempt to solve this problem require domain-specific knowledge embedded in the system, our system is domain-independent, since it uses domain-agnostic templates created by expert users to generate domain-specific queries. Thus, our system attempts to address the hard problem of natural language query understanding by using human experts to close the gap between natural language and an internal tokenized representation of queries. EasyQuery allows experts to create SQL queries and present them to the user in the form of model English sentence patterns. It also allows users to execute the queries created by experts, so that they can explore the database with ease using an auto-suggest interface called PowerKey. We tested the feasibility of the system using the biological database AntDB. All of our software is fully functional, and initial user feedback indicates that our system is easier to use than the native SQL interface.

Keywords: 

complex databases, natural language query, PowerKey, domain-agnostic templates, model English sentence patterns

Copyright: 

Amruta Kanetkar