
CAPTCHA

PRESENTED BY A.R.MOUNIKA

Abstract
We introduce the CAPTCHA, an automated test that humans can pass but current computer programs cannot: any program that has high success over a CAPTCHA can be used to solve an unsolved Artificial Intelligence (AI) problem. We introduce two families of AI problems that can be used to construct CAPTCHAs, and we show that solutions to such problems can be used for steganographic communication. CAPTCHAs based on these AI problem families therefore imply a win-win situation: either the problems remain unsolved and there is a way to differentiate humans from computers, or the problems are solved and there is a way to communicate covertly on some channels.

Contents
Introduction
Applications of the CAPTCHA
Definition and Types of CAPTCHA
Text based CAPTCHA
Image based CAPTCHA
Audio based CAPTCHA
Conclusion
References

INTRODUCTION
CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart.

With an increasing number of free services on the internet, we find a pronounced need to protect these services from abuse. Automated programs (often referred to as bots) have been designed to attack a variety of services. For example, attacks on free email providers to acquire accounts are common.

APPLICATIONS WHERE THE CAPTCHA IS USED


Free email services
Online polls
Dictionary attacks
Newsgroups, blogs, etc.
Spam

Definition & Types of CAPTCHA


A CAPTCHA is a cryptographic protocol whose underlying hardness assumption is based on an AI problem.

TYPES OF CAPTCHA:
1. Text based CAPTCHA
2. Image based CAPTCHA
3. Audio based CAPTCHA

Types of CAPTCHA

TEXT BASED CAPTCHA


In a text based CAPTCHA, the test presents text in a distorted form, made complex enough that it cannot easily be recognized. The text is also overlaid with noise, such as a cluttered background or contrasting colors, that makes it harder to see clearly.
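The idea of adding noise to a text challenge can be sketched in a few lines. This is a minimal illustration only, assuming the rendered text is available as a binary bitmap (rows of 0/1); the function names `random_text` and `add_pixel_noise`, the alphabet, and the noise model (random pixel flips) are hypothetical choices, not taken from any particular CAPTCHA system.

```python
import random

# Hypothetical alphabet: easily confused characters (0/O, 1/I/L) are left out.
ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"

def random_text(rng, length=6):
    """Pick a random challenge string from the alphabet."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def add_pixel_noise(bitmap, flip_probability, rng):
    """Flip each pixel of a binary image (rows of 0/1) with the given probability."""
    return [
        [1 - px if rng.random() < flip_probability else px for px in row]
        for row in bitmap
    ]

rng = random.Random(42)
challenge = random_text(rng)
# Stand-in for a rendered glyph; a real system would rasterize `challenge`.
glyph = [[0, 1, 1, 0],
         [1, 0, 0, 1],
         [1, 1, 1, 1]]
noisy = add_pixel_noise(glyph, 0.1, rng)
```

A higher `flip_probability` makes the text harder for software to read, but also harder for people, so the value has to be tuned against both.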

Most text based CAPTCHAs have been broken by software.


OCR
Segmentation

1. Generate CAPTCHA
2. Align CAPTCHA
3. Cut CAPTCHA
4. Transform CAPTCHA
5. Decode CAPTCHA

Process of decoding the CAPTCHA
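The "cut" (segmentation) step of such a decoding process can be illustrated with a simple vertical projection. This is a sketch under strong assumptions: the image is already binarized into rows of 0/1, and characters are separated by at least one empty column. Real CAPTCHAs are designed to violate exactly these assumptions. The function name `cut_characters` is hypothetical.

```python
def cut_characters(bitmap):
    """Split a binary image (rows of 0/1) into per-character column ranges.

    A column with no ink is treated as a gap between characters.
    Returns a list of (start, end) column indices, end exclusive.
    """
    if not bitmap:
        return []
    width = len(bitmap[0])
    # Vertical projection: number of ink pixels in each column.
    ink_per_column = [sum(row[x] for row in bitmap) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(ink_per_column):
        if ink > 0 and start is None:
            start = x                      # entering a character
        elif ink == 0 and start is not None:
            segments.append((start, x))    # leaving a character
            start = None
    if start is not None:
        segments.append((start, width))    # character touches the right edge
    return segments


# Toy image with two blobs separated by one empty column.
toy = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1],
]
print(cut_characters(toy))  # [(0, 2), (3, 5)]
```

Each returned range would then be passed on to the transform and decode steps, for example to an OCR classifier.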

Image based CAPTCHA


To address this, numerous alternative CAPTCHAs (including image based ones) have been proposed. In designing a new CAPTCHA, the basic tenets should be kept in mind:
1. Easy for most people to solve
2. Difficult for automated bots to solve
3. Easy to generate and evaluate

In the CAPTCHA we propose, we are careful not to provide the user with a small set of images to compare. Any similarity computation must be done against the entire set of possible images, without any a priori filtering clues. The success of our CAPTCHA rests on the fact that orienting an image is an AI-hard problem. In the next section, we review the many systems that attempt to determine an image's upright orientation. Although a few systems achieve success, that success is limited to a small subset of image types when tested in realistic scenarios.
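How an orientation-based challenge might be generated and checked can be sketched as follows. This is an illustrative outline only: the function names, the 15-degree rotation step, and the 16-degree tolerance are hypothetical values chosen for the example, not parameters stated in the source.

```python
import random

def make_challenge(rng, step=15):
    """Pick a random non-zero rotation (in degrees) to apply to an upright image."""
    return rng.choice(list(range(step, 360, step)))

def angular_error(a, b):
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)

def is_upright(applied_rotation, user_correction, tolerance=16):
    """True if the user's correction undoes the applied rotation within tolerance."""
    residual = (applied_rotation + user_correction) % 360
    return angular_error(residual, 0) <= tolerance

rng = random.Random(7)
rotation = make_challenge(rng)
print(is_upright(rotation, -rotation))        # True: exact correction
print(is_upright(rotation, -rotation + 30))   # False: 30 degrees off
```

The tolerance trades usability against security: a wider tolerance is kinder to people but raises the chance that a random guess passes.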

1. Detecting the orientation:


The classes of images that are easily oriented by computers are explicitly handled in our system. A detailed examination of a recent machine learning approach is given below. It is incorporated in our system to ensure that the chosen images are difficult for computers to solve.

2. Learning image orientation:


Although the particular machine learning tools and features used make this orientation detection system distinct, the overall architecture is typical of many current systems. When the orientation detection system receives an image, it computes a number of simple transformations on the image, yielding 15 single-channel images:
1-3: Red, Green, Blue (R, G, B) channels.
4-6: Y, I, Q (transformation of R, G, B) channels.
7-9: Normalized versions of R, G, B (linearly scaled to span 0-255).
10-12: Normalized versions of Y, I, Q (linearly scaled to span 0-255).
13: Intensity (simple average of R, G, B).
14: Horizontal edge image computed from intensity.
15: Vertical edge image computed from intensity.
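The 15 single-channel images can be sketched in plain Python. Several details here are assumptions made for illustration: the image is a list of rows of (R, G, B) tuples in 0-255, the YIQ conversion is the one provided by the standard library's `colorsys` module, and the edge images use a simple neighbor difference because the source does not say which edge operator is used. The function names are hypothetical.

```python
import colorsys

def normalize(channel):
    """Linearly rescale a 2-D channel so its values span 0-255."""
    flat = [v for row in channel for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                       # flat channel: nothing to stretch
        return [[0.0 for _ in row] for row in channel]
    scale = 255.0 / (hi - lo)
    return [[(v - lo) * scale for v in row] for row in channel]

def edges(channel, dx, dy):
    """Absolute difference between each pixel and its neighbor at offset (dx, dy)."""
    h, w = len(channel), len(channel[0])
    return [
        [abs(channel[min(y + dy, h - 1)][min(x + dx, w - 1)] - channel[y][x])
         for x in range(w)]
        for y in range(h)
    ]

def fifteen_channels(image):
    """Return the 15 single-channel images for a list-of-rows RGB image."""
    r = [[p[0] for p in row] for row in image]
    g = [[p[1] for p in row] for row in image]
    b = [[p[2] for p in row] for row in image]
    # colorsys expects components in 0..1 and returns Y in 0..1.
    yiq = [[colorsys.rgb_to_yiq(p[0] / 255, p[1] / 255, p[2] / 255) for p in row]
           for row in image]
    y = [[t[0] for t in row] for row in yiq]
    i = [[t[1] for t in row] for row in yiq]
    q = [[t[2] for t in row] for row in yiq]
    intensity = [[(p[0] + p[1] + p[2]) / 3 for p in row] for row in image]
    horizontal = edges(intensity, dx=0, dy=1)  # change across rows marks horizontal edges
    vertical = edges(intensity, dx=1, dy=0)    # change across columns marks vertical edges
    return [r, g, b, y, i, q,
            normalize(r), normalize(g), normalize(b),
            normalize(y), normalize(i), normalize(q),
            intensity, horizontal, vertical]

image = [[(0, 0, 0), (255, 255, 255)],
         [(255, 0, 0), (0, 0, 255)]]
channels = fifteen_channels(image)
print(len(channels))  # 15
```

In a full system, features computed from these channels would be fed to the learned orientation classifier; that stage is not shown here.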

Audio based CAPTCHA


Because visually impaired users who surf the Web using screen-reading programs cannot see visual CAPTCHAs, audio CAPTCHAs were created. Typical audio CAPTCHAs consist of one or several speakers saying letters or digits at randomly spaced intervals. A user must correctly identify the digits or characters spoken in the audio file to pass the CAPTCHA. To make this test difficult for current computer systems, specifically automatic speech recognition (ASR) programs, background noise is injected into the audio files.
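The structure described above (digits at randomly spaced intervals, plus injected background noise) can be outlined as follows. This is only a sketch: it builds a timing schedule rather than real speech, and the function names, gap range, digit duration, and uniform noise model are all hypothetical choices for illustration.

```python
import random

def make_audio_challenge(rng, n_digits=5, min_gap=0.3, max_gap=1.5,
                         digit_duration=0.5):
    """Return (answer, schedule); schedule is a list of (start_seconds, digit)."""
    t = 0.0
    schedule = []
    for _ in range(n_digits):
        t += rng.uniform(min_gap, max_gap)   # random silence before each digit
        digit = rng.choice("0123456789")
        schedule.append((round(t, 3), digit))
        t += digit_duration                  # time taken to speak the digit
    answer = "".join(d for _, d in schedule)
    return answer, schedule

def add_background_noise(samples, rng, noise_level=0.2):
    """Mix uniform random noise into samples (floats in -1..1), clipping the result."""
    noisy = []
    for s in samples:
        v = s + rng.uniform(-noise_level, noise_level)
        noisy.append(max(-1.0, min(1.0, v)))
    return noisy

def check_answer(answer, response):
    """The user passes only if every digit is identified, in order."""
    return response.strip() == answer

rng = random.Random(0)
answer, schedule = make_audio_challenge(rng)
noisy = add_background_noise([0.0, 0.5, -0.5, 1.0], rng)
```

A real generator would place recorded speech clips at the scheduled start times before mixing in the noise.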

In early March 2008, concurrent to our work, the blog of Wintercore Labs claimed to have successfully broken the Google audio CAPTCHA. After reading their Web article and viewing the video of how they solve the CAPTCHAs, we are unconvinced that the process is entirely automatic, and it is unclear what their exact pass rate is. Because we are unable to find any formal technical analysis of this program, we can be sure of neither its accuracy nor the extent of its automation. This led to the following process:
1. Creation of the training data
2. Classifier construction
3. Assessment of the current audio CAPTCHAs
4. Recommendations for stronger CAPTCHAs

Conclusion
We have succeeded in breaking three different types of widely used audio CAPTCHAs, even though these were developed with the purpose of defeating attacks by machine learning techniques. We believe our results can be improved by selecting optimal segment sizes, but that is unnecessary given our already high success rate.

For our experiments, segment sizes were not chosen in a special way. This occasionally yielded a segment that contained only half of a word, causing our prediction to contain that word twice. We also believe that the AdaBoost results can be improved, particularly for the Digg audio CAPTCHAs, by ensuring that the number of negative training samples is closer to the number of positive training samples. We have shown that our approach is successful and can be used with many different audio CAPTCHAs that contain small finite vocabularies.
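One way to bring the number of negative training samples closer to the number of positive ones is random undersampling of the larger class. The sketch below illustrates that idea only; it is an assumption for illustration, not the procedure used in the experiments, and the function name `balance_by_undersampling` is hypothetical.

```python
import random

def balance_by_undersampling(samples, labels, rng):
    """Randomly drop examples from the larger class until both classes are equal."""
    positives = [s for s, y in zip(samples, labels) if y == 1]
    negatives = [s for s, y in zip(samples, labels) if y == 0]
    n = min(len(positives), len(negatives))
    balanced = ([(s, 1) for s in rng.sample(positives, n)] +
                [(s, 0) for s in rng.sample(negatives, n)])
    rng.shuffle(balanced)   # avoid presenting one class first during training
    return balanced

# Toy data: 3 positive segments and 9 negative segments.
samples = list(range(12))
labels = [1, 1, 1] + [0] * 9
balanced = balance_by_undersampling(samples, labels, random.Random(0))
print(len(balanced))  # 6
```

Undersampling discards data, so reweighting or oversampling the smaller class are alternatives when negative examples are scarce.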

QUERIES?
