
CAPTCHA TECHNOLOGY

What humans can do, but computers cannot.

By, Airpula Vikas. 08w91a0549

CAPTCHA, the Acronym


Completely Automated Public Turing Test to Tell Computers and Humans Apart

CAPTCHA literal meaning


Completely: whole
Automated: carried out by a machine
Public: universally known, which also makes it easier for hackers to study and break
Turing Test to Tell Computers and Humans Apart: based on the test proposed by Alan Turing

CAPTCHA Origins
1997: Andrei Broder at AltaVista wanted to prevent bots from automatically submitting sites for indexing.
He decided to add a test to the submission page.
He inverted the OCR optimization advice given for a Brother scanner, so that the text became hard for machines to read.
2000: Luis von Ahn, Manuel Blum and John Langford at CMU coined the term CAPTCHA.

CAPTCHA: Deciding Human or Bot?


A puzzle or problem that is easy for humans to solve and very difficult for computers.
If the puzzle is solved correctly, you are considered human and can continue.

Two basic types

Printed CAPTCHA
Handwritten CAPTCHA (H-CAPTCHA)

Printed CAPTCHA
Printed CAPTCHA is difficult to break.
Many algorithms are available to generate it.
Humans do not find these very easy to read either.
The two major types are BaffleText and Pessimal Print.

Baffle Text image


Developed by Monica Chew and Henry Baird.
Uses pronounceable character strings that do not appear in the English dictionary, rendered with masking.
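The "pronounceable but not in the dictionary" idea can be sketched roughly as follows. This is only an illustration: it uses a simple consonant-vowel alternation, which is an assumption of this sketch and not the generator used by the BaffleText authors, and `DICTIONARY` is a tiny hypothetical stand-in for a real word list. The masking step is left out.

```python
import random

VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnprstvw"

# Tiny stand-in for a real English dictionary (assumption for this sketch).
DICTIONARY = {"banana", "tomato", "robin", "tiger", "lemon"}


def pronounceable_nonword(length=6, rng=None, dictionary=DICTIONARY):
    """Return a consonant-vowel string that is not a dictionary word."""
    rng = rng or random.Random()
    while True:
        # Alternate consonant, vowel, consonant, ... so the result can be read aloud.
        word = "".join(
            rng.choice(CONSONANTS if i % 2 == 0 else VOWELS)
            for i in range(length)
        )
        if word not in dictionary:  # reject real words
            return word
```

Because the string is not a real word, a dictionary lookup does not help an attacker narrow down a partially recognized answer.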

Pessimal Print Image


Developed by Allison Coates, Henry Baird and Richard Fateman.
Uses a degradation model that simulates the physical defects caused by printing and scanning text.

Handwritten CAPTCHA
Less frequently used, because humans can identify handwriting more easily than distorted text images.
Transformations are applied by adding lines, arcs, circles, etc.

Example showing H-CAPTCHA

Types of Printed CAPTCHA


GIMPY
BONGO
PIX
KittenAuth
Face Recognition
Audio
Logic Puzzles

GIMPY
Randomly chooses 7 words from a dictionary.
Distorts the words using a variety of techniques.
The human must correctly type 3 of the words to pass the test.
In the real world, most applications only test for a single word (EZ-Gimpy).
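The selection and pass rule described above can be sketched as follows. The word list and function names are hypothetical, and the image distortion step is omitted; only the "choose 7, type 3 correctly" logic is shown.

```python
import random

# Hypothetical word list standing in for a real dictionary.
WORD_LIST = ["apple", "river", "stone", "cloud", "tiger", "plane",
             "chair", "green", "house", "music", "water", "light"]


def make_gimpy_challenge(rng=None, words=WORD_LIST, count=7):
    """Pick `count` distinct words; a real test would render them distorted."""
    rng = rng or random.Random()
    return rng.sample(words, count)


def passes_gimpy(challenge, typed_words, required=3):
    """Pass if at least `required` distinct shown words were typed correctly."""
    shown = {w.lower() for w in challenge}
    correct = {w.strip().lower() for w in typed_words} & shown
    return len(correct) >= required
```

Counting distinct matches means that typing the same correct word three times does not pass the test.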

GIMPY Examples

EZ-GIMPY

R-GIMPY

BONGO
A visual recognition problem.
Two sets of shapes are shown, each with a distinguishing characteristic.
The user must choose which set a given shape belongs to.

PIX
A database of labeled images of recognizable objects.
Randomly chooses an object and displays N pictures of it.
The user must correctly identify the object.
The pictures are distorted.

KittenAuth
The Cutest Human Test
A 3x3 matrix of cute animals is shown.
The user must choose the 3 kittens.
The strategy is to use animals that look similar to kittens.
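A minimal sketch of the grid logic, assuming each cell is just a label ("kitten" or "other") rather than an actual picture. The function names are made up for this illustration.

```python
import random


def make_kitten_grid(rng=None, kittens=3, size=9):
    """Return a shuffled list of 9 cell labels, 3 of which are kittens."""
    rng = rng or random.Random()
    cells = ["kitten"] * kittens + ["other"] * (size - kittens)
    rng.shuffle(cells)
    return cells


def check_selection(grid, selected):
    """Pass only if the selected cell indexes are exactly the kitten cells."""
    kitten_cells = {i for i, label in enumerate(grid) if label == "kitten"}
    return set(selected) == kitten_cells
```

With 3 kittens in 9 cells there are 84 possible selections, so a bot guessing blindly passes about 1 time in 84.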

Face Recognition CAPTCHA

Audio CAPTCHA
Pick a word or a sequence of numbers at random.
Render it into an audio clip using TTS software.
Distort the audio clip.
Ask the user to identify and type the word or numbers.

Logic Puzzles
Easy trivia questions.
Example: Which of the following is a bird? Elephant, Tiger or Robin

Cons
Difficult to create a big enough database of these questions.
Difficult for ESL users / international users.

Breaking CAPTCHA

Most text-based CAPTCHAs have been broken by software, using:

OCR
Segmentation

Other CAPTCHAs were broken by streaming the tests for unsuspecting users to solve.

Uses of CAPTCHA
Online polls
Free e-mail services
Search engine bots
Protection against worms and spam
Preventing dictionary attacks
etc.

Properties

A CAPTCHA should be automatically generated and graded.
The test can be taken quickly and easily by human users.
The test will accept virtually all human users and reject software agents.
The test will resist automatic attack for many years, despite advances in technology and prior knowledge of the algorithms.
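The first property, automatic generation and grading, can be sketched as follows. This is a minimal in-memory illustration with hypothetical names (`generate_challenge`, `grade`, `_pending`); a real deployment would render the answer as a distorted image and keep the pending answers in server-side session storage.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits

# challenge id -> expected answer (in-memory store, assumption for the sketch)
_pending = {}


def generate_challenge(length=6):
    """Create a random answer and remember it under a fresh challenge id."""
    answer = "".join(secrets.choice(ALPHABET) for _ in range(length))
    challenge_id = secrets.token_hex(8)
    _pending[challenge_id] = answer
    # The answer would be shown to the user only as a distorted image.
    return challenge_id, answer


def grade(challenge_id, response):
    """Grade a response; each challenge can be graded only once."""
    expected = _pending.pop(challenge_id, None)
    if expected is None:
        return False
    given = response.strip().upper()
    return secrets.compare_digest(expected.encode(), given.encode())
```

Removing the challenge on the first grading attempt keeps a bot from retrying guesses against the same puzzle.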

Free Email Registration


Hotmail Registration

Yahoo! Registration

Final Thoughts
They are crucial to preventing bot attacks.
Hopefully, they will become more user-friendly for people with disabilities (visual, mental).
CAPTCHAs are mainly implemented with AJAX and PHP.
Various generation algorithms exist.
Some implementations also use XML.

Different CAPTCHAs

PHP
PHP originally stood for Personal Home Page; it now stands for PHP: Hypertext Preprocessor.
It is a scripting language used to create dynamic web pages.
Its syntax borrows from C, Java, Perl, etc.
PHP code is embedded within HTML pages and executed on the server side.

OCR
(Optical Character Recognition) The machine recognition of printed characters. OCR systems can recognize many different OCR fonts, as well as typewriter and computer-printed characters. Advanced OCR systems can recognize hand printing. When a text document is scanned into the computer, it is turned into a bitmap, which is a picture of the text. OCR software analyzes the light and dark areas of the bitmap in order to identify each alphabetic letter and numeric digit. When it recognizes a character, it converts it into ASCII text. Hand printing is much more difficult to analyze than machine-printed characters. Old, worn and smudged documents are also difficult. Scanning documents and processing them with OCR is sometimes as much an art as it is a science.
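The idea of comparing light and dark areas of a bitmap against known characters can be illustrated with a toy template matcher. This is only a sketch under strong assumptions: the 3x5 glyph templates are made up for the example, and real OCR systems use far more robust methods.

```python
# Hypothetical 3x5 templates: "1" is a dark pixel, "0" is a light pixel.
TEMPLATES = {
    "I": ["111",
          "010",
          "010",
          "010",
          "111"],
    "L": ["100",
          "100",
          "100",
          "100",
          "111"],
    "T": ["111",
          "010",
          "010",
          "010",
          "010"],
}


def matching_pixels(a, b):
    """Count the positions where two equally sized bitmaps agree."""
    return sum(pa == pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def recognize(glyph):
    """Return the character whose template agrees with the glyph most."""
    return max(TEMPLATES, key=lambda ch: matching_pixels(glyph, TEMPLATES[ch]))
```

A glyph with one flipped pixel is usually still matched to the right template, which is why CAPTCHAs rely on heavier distortion than simple noise.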


Segmentation
Segmentation is an image processing step. Types:
Pixel-based segmentation
Model-based segmentation
Multi-scale segmentation
Semi-automatic segmentation
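As a rough illustration of pixel-based segmentation, the sketch below splits a binary bitmap into glyphs wherever a fully blank column separates them. The bitmap format (rows of "0"/"1" strings) and the function name are assumptions made for this example.

```python
def segment_columns(bitmap):
    """Split a binary bitmap into (start, end) column ranges, one per glyph."""
    width = len(bitmap[0])
    # A column contains ink if any row has a dark pixel in it.
    ink = [any(row[x] == "1" for row in bitmap) for x in range(width)]
    segments, start = [], None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a glyph begins
        elif not has_ink and start is not None:
            segments.append((start, x))    # a blank column ends the glyph
            start = None
    if start is not None:
        segments.append((start, width))    # glyph touching the right edge
    return segments
```

This simple split fails when characters touch or overlap, which is one reason CAPTCHA generators crowd or connect the letters.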

Validators
Types of validators:
1) Markup validator: checks web documents in formats such as HTML and XHTML
2) Link validator: checks hyperlinks; useful for finding broken links
3) CSS validator: checks stylesheets
4) RDF validator: checks RDF documents
5) Feed validator
6) P3P validator: checks documents for the P3P (Platform for Privacy Preferences) protocol
Etc.

Session Management
The process of keeping track of a user's activity across the sessions of interaction between the user and a computer system.
When a user opens a web page and does nothing on it, the session expires after a certain time. E.g.: a score watch on a web site.
When the user logs in to the page again, the previously expired session is restored.
E.g.: if a user opens a Yahoo account in two windows and later logs off in one window, the user cannot keep using the account from the other window; the session has expired and the user has to log in again.
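The inactivity timeout described above can be sketched as a small in-memory store. The class name, the 15-minute default and the injectable clock are assumptions made for this illustration; a web framework would normally provide its own session handling.

```python
import time


class SessionStore:
    """Tracks the last activity time of each session and expires idle ones."""

    def __init__(self, timeout_seconds=900, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock          # injectable so the sketch can be tested
        self._last_seen = {}

    def login(self, session_id):
        self._last_seen[session_id] = self.clock()

    def touch(self, session_id):
        """Record activity; return False if the session is missing or expired."""
        last = self._last_seen.get(session_id)
        if last is None or self.clock() - last > self.timeout:
            self._last_seen.pop(session_id, None)
            return False
        self._last_seen[session_id] = self.clock()
        return True

    def logout(self, session_id):
        """Logging off ends the session for every window that shares it."""
        self._last_seen.pop(session_id, None)
```

Because both browser windows share one session id, logging off in one window makes the next `touch` from the other window fail, which matches the two-window example.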

Session Management
There are two types:
1) Desktop session management
2) Browser session management

Mainly useful for web applications
