| Graduate Seminar Colloquia |
| Date and Time: |
Wednesday, April 22nd, 2009 at 5:30 PM |
| Location: |
Thornton Hall 331 |
| Presenter: |
Byron Dom, Principal Research Scientist, Yahoo! |
| Subject: |
Automatic Question Categorization in Yahoo! Answers |
| Abstract: |
Yahoo! Answers addresses the problem of information retrieval by providing a forum in which users
can ask questions and other users provide answers. As these
questions and answers accumulate over time in the Yahoo! Answers
database, they also form a tremendous reservoir of information
that can be searched or browsed to find specific facts, advice and
so on. A second dimension of Yahoo! Answers is social
interaction.
In addition to providing a platform and infrastructure for
this activity, technology can be used to facilitate the
tasks that users perform in such a system. An example is the
categorization of questions by topic, location and type.
This talk will begin with a brief description of Yahoo!
Answers including descriptions of the problems on which
machine-learning techniques can be brought to bear.
Following that, the question-categorization application and
associated machine-learning techniques will be described in
more detail, focusing on the way in which the application of
machine learning significantly facilitates tasks faced by
users of Yahoo! Answers.
|
| |
Byron Dom is a Principal Research Scientist in
Yahoo's Natural Language Processing department. Prior to
taking this position, he was Director of Automated Content
Analysis in the Yahoo! Applied Research division, where he
and his team focused on applying machine learning to
problems in text mining such as document categorization,
clustering and information extraction. He led the
development of Yahoo's first production automatic
document-categorization system, which categorizes merchant
product offers for Yahoo! Shopping. Prior to joining Yahoo!
in April of 2003, he spent twenty years in IBM's Research
division (Watson Research Center in Yorktown Heights, New York
and Almaden Research Center in San Jose, California), first as a
physicist and later as a computer scientist, when his physics
research led him into the field of computer vision. His work
eventually led him into the areas of machine learning and
automated text analysis, where he focuses his work today. At IBM
he was a Research Staff Member and manager of Information Management
Principles, which focused on Web information retrieval and
text mining. He has served as an associate editor of the
IEEE Transactions on Pattern Analysis and Machine
Intelligence (PAMI). He received a PhD in physics from the
Catholic University of America. |