Henry Baird, Palo Alto Research Center
Internet services offered for human use are suffering abuse by programs (bots, spiders, scrapers, spammers, etc.). We mount a defense against such attacks with CAPTCHAs, "completely automated public Turing tests to tell computers and humans apart"; these are special cases of "human interactive proofs" (HIPs), a class of security protocols that allow people to identify themselves easily over networks as members of given groups. I will review the five years of evolution of HIP R&D, highlights of the first NSF HIP workshop, and applications of HIPs now in use and on the horizon. One of the best ways to construct a CAPTCHA is to exploit the gap in ability between humans and machines in reading images of text. I will describe two such reading-based CAPTCHAs, developed in collaborations between PARC and UC Berkeley:
PessimalPrint, motivated by studies of physics-based image degradations, uses images synthesized pseudo-randomly over certain ranges of words, typefaces, and image quality; and
BaffleText, motivated by the psychophysics of human reading, uses image-masking degradations that seem to require Gestalt perception skills.
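The pseudo-random synthesis described above can be suggested by a toy sketch. This is a minimal illustration, not the published PessimalPrint model: the function name `degrade`, the speckle and edge-jitter parameters, and their ranges are all assumptions chosen for clarity; a real system would render actual words in varied typefaces before degrading them.

```python
import random

def degrade(bitmap, seed, speckle_prob=0.05, jitter_prob=0.1):
    """Apply pseudo-random speckle noise and edge erosion to a binary
    bitmap (list of lists of 0/1), returning a new degraded bitmap.
    Parameter names and ranges are illustrative assumptions."""
    rng = random.Random(seed)  # seeded, so each challenge is reproducible
    out = [row[:] for row in bitmap]
    for r in range(len(bitmap)):
        for c in range(len(bitmap[0])):
            if rng.random() < speckle_prob:
                # salt-and-pepper speckle: flip the pixel
                out[r][c] = 1 - out[r][c]
            elif out[r][c] == 1 and rng.random() < jitter_prob:
                # erode a foreground pixel to roughen character edges
                out[r][c] = 0
    return out

# Toy 5x5 "glyph" standing in for a rendered word image.
glyph = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

challenge = degrade(glyph, seed=42)
```

Each choice of seed and parameter settings yields a different challenge image from the same source word, which is what lets such a CAPTCHA draw from a practically unbounded pool of tests.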
Both CAPTCHAs have been validated by experiments on human subjects and commercial OCR systems, and both have (so far) successfully resisted attack by advanced computer-vision techniques. I will offer proposals for an image-understanding research agenda to further advance the state of the art in web security.
[Joint work with Richard Fateman, Allison Coates, Kris Popat, Monica Chew, Tom Breuel, & Mark Luk.]