Several years ago we started exploring what you could do with text if there were no limit on the size of the corpus or the amount of computation you could apply to it. The result was a number of applications that help uncover interesting insights hidden in these corpora - everything from which TV shows 20-something women prefer to what the color of the web is.
Dr. Daniel Gruhl is a researcher at IBM's Almaden Research Center. He earned his PhD in Electrical Engineering from the Massachusetts Institute of Technology in 2000, with thesis work on distributed text analytics systems. His interests include steganography (visual, audio, text and database), machine understanding, user modeling and very large scale text analytics.
Dr. Gruhl works in the Healthcare Informatics group and was the chief architect for the WebFountain semantic supercomputer.