PRESENTED BY A.R.MOUNIKA
Abstract
We introduce captcha, an automated test that humans can pass but current computer programs cannot: any program that has high success over a captcha can be used to solve an unsolved Artificial Intelligence (AI) problem. We introduce two families of AI problems that can be used to construct captchas, and we show that solutions to such problems can be used for steganographic communication. Captchas based on these AI problem families, then, imply a win-win situation: either the problems remain unsolved and there is a way to differentiate humans from computers, or the problems are solved and there is a way to communicate covertly on some channels.
Contents
Introduction
Applications of the Captcha
Definition and Types of Captcha
Text based Captcha
Image based Captcha
Audio based Captcha
Conclusion
References
INTRODUCTION
CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. With an increasing number of free services on the internet, we find a pronounced need to protect these services from abuse. Automated programs (often referred to as bots) have been designed to attack a variety of services. For example, attacks are common on free email providers to acquire accounts.
Types of CAPTCHA
Generate CAPTCHA
Align CAPTCHA
Cut CAPTCHA
Transform CAPTCHA
Decode CAPTCHA
In the CAPTCHA we propose, we are careful not to provide the user with a small set of images to compare. Any similarity computation must be done against the entire set of possible images, without any a priori filtering clues. The success of our CAPTCHA rests on the fact that orienting an image is an AI-hard problem. In the next section, we will review the many systems that attempt to determine an image's upright orientation. Although a few systems achieve success, their success, when tested in realistic scenarios, is limited to a small subset of image types.
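As a rough illustration of the orientation idea described above, the following is a minimal sketch and not the proposed system itself. The function names, the dictionary layout, and the 15-degree tolerance are all assumptions introduced here for illustration; the text does not specify them.

```python
import random

def make_challenge(image_ids, rng):
    """Pick one image from the full pool and a random rotation in degrees.

    image_ids is a hypothetical list of identifiers for the image pool.
    """
    image_id = rng.choice(image_ids)
    angle = rng.randrange(0, 360)
    return {"image_id": image_id, "shown_rotation": angle}

def angular_error(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)

def verify(challenge, user_correction, tolerance=15):
    """Accept if the user's correction brings the image close to upright.

    The tolerance value is an assumption, not a figure from the text.
    """
    residual = (challenge["shown_rotation"] + user_correction) % 360
    return angular_error(residual, 0) <= tolerance
```

A caller would show the image rotated by `shown_rotation`, let the user rotate it, and pass the user's rotation to `verify`. Note that the challenge carries no candidate set of images to compare against, in line with the design constraint stated above.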
In early March 2008, concurrent with our work, the blog of Wintercore Labs claimed to have successfully broken the Google audio CAPTCHA. After reading their Web article and viewing the video of how they solve the CAPTCHAs, we are unconvinced that the process is entirely automatic, and it is unclear what their exact pass rate is. Because we are unable to find any formal technical analysis of this program, we can be sure of neither its accuracy nor the extent of its automation. This led to the following steps:
1. Creation of the training data
2. Classifier construction
3. Assessment of the current audio CAPTCHAs
4. Recommendations for a stronger CAPTCHA
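The creation of training data can be pictured with a small sketch. It assumes the audio is available as a plain sequence of samples and that word positions have been annotated by hand as `(start, end, word)` spans; both the annotation format and the midpoint labeling rule are assumptions made here, not details given in the text.

```python
def segment(samples, size):
    """Split a sequence of audio samples into consecutive fixed-size windows.

    A trailing partial window is dropped.
    """
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def label_segments(num_segments, size, spans):
    """Label each window with the word whose span covers its midpoint.

    spans is a hypothetical list of (start, end, word) annotations in
    sample indices; windows outside every span are labeled "noise".
    """
    labels = []
    for k in range(num_segments):
        mid = k * size + size // 2
        word = next((w for s, e, w in spans if s <= mid < e), "noise")
        labels.append(word)
    return labels
```

The labeled windows would then be turned into features and passed to a classifier in the second step.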
Conclusion
We have succeeded in breaking three different types of widely used audio CAPTCHAs, even though these were developed with the purpose of defeating attacks by machine learning techniques. We believe our results can be improved by selecting optimal segment sizes, but that is unnecessary given our already high success rate.
For our experiments, segment sizes were not chosen in a special way. This occasionally yielded results in which a segment contained only half of a word, causing our prediction to contain that particular word twice. We also believe that the AdaBoost results can be improved, particularly for the Digg audio CAPTCHAs, by ensuring that the number of negative training samples is closer to the number of positive training samples. We have shown that our approach is successful and can be used with many different audio CAPTCHAs that contain small finite vocabularies.
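One simple way to bring the number of negative samples closer to the number of positive samples is random undersampling. The sketch below is an assumption about how this could be done; the text does not say which balancing method was intended, and the function name is hypothetical.

```python
import random

def balance_by_undersampling(positives, negatives, rng):
    """Randomly cut the negative samples down to the positive count.

    If there are already no more negatives than positives, both lists
    are returned unchanged (as copies).
    """
    if len(negatives) <= len(positives):
        return list(positives), list(negatives)
    return list(positives), rng.sample(negatives, len(positives))
```

The balanced lists could then be used to train the classifier in place of the full, skewed set. Undersampling discards data, so it is only one option among several.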
QUERIES?