
Breaking Voice Based Captcha

Group Members :
Harshal R Joshi
Kushal Kamra
Mayur Gahiwad

Internal Guide: Prof. M. B. Jhade


Index
1. What is Captcha
2. Types of Captcha
3. Voice based Captcha
4. Comparison between existing Captcha and new tool
5. Architectural diagram
6. Voice Activity Detection
7. VAD algorithm
8. Speech to Text conversion
9. Overall digit recognition system
10. Dynamic Time Warping
11. Application and Future enhancement



What is Captcha?

• Completely Automated Public Turing Test to tell Computers and Humans Apart

• Tests that humans can pass but computers cannot

• Ensures that the response is not generated by a computer



Types of Captcha

• Text based…

• Image based…

• Voice based…



Voice based Captcha

• Provides a new dimension to the concept of Captcha

• An advancement over text and image captchas, invented in 2006

• Text and image captchas have already been broken

Aim: To design a testing tool that tries to break the voice based Captcha. Pre-recorded voice samples will be used for the purpose.



Comparison between existing Captcha and New Breaking Tool

Currently, text based and image based Captchas are available.

In a text based Captcha, the user has to identify the distorted text and retype it. In an image based Captcha, the user has to identify the common object in a collection of pictures.

In our project, we are building a breaking tool that recognizes the voice and performs speech to text conversion.



Architecture Diagram

[Flowchart: I/P Voice → Fast Fourier Transform → Voice Activity Detection → Calculate SNR → decision "SNR < 5 dB": Yes → MMSE; No → Spectral Subtraction → Clean Speech → Speech To Text Converter → Converted Text]
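
The SNR decision in the architecture can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how the SNR is computed, so the sketch assumes it is the ratio of mean power in speech samples to mean power in noise samples, and it assumes the "Yes" branch of the "SNR < 5 dB" test leads to MMSE. The function names are hypothetical, and the enhancement algorithms themselves are not implemented here.

```python
import math

def mean_power(samples):
    """Average squared amplitude of a list of samples."""
    return sum(x * x for x in samples) / len(samples)

def snr_db(speech_samples, noise_samples):
    """SNR in dB, assuming non-zero power in both inputs (an assumption)."""
    return 10.0 * math.log10(mean_power(speech_samples) / mean_power(noise_samples))

def choose_enhancer(snr, threshold_db=5.0):
    """Pick the noise reduction branch; the Yes/No mapping is assumed."""
    return "mmse" if snr < threshold_db else "spectral_subtraction"

if __name__ == "__main__":
    speech = [1.0, -1.0] * 50   # toy "speech" with power 1
    noise = [0.1, -0.1] * 50    # toy noise with power 0.01
    value = snr_db(speech, noise)
    print(round(value, 2), choose_enhancer(value))
```

Only the 5 dB threshold comes from the diagram; everything else is a placeholder for whatever estimate the real tool uses.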
Voice Activity Detection (VAD)
• The process of separating conversational speech from silence, music, noise or other non-speech signals.

• Primary Function: Provide an indication of the presence of speech in order to facilitate speech processing, as well as possibly providing delimiters for the beginning and end of a speech segment.



VAD Algorithm
1. A spectral distance voice activity detector is used.

2. A spectral distance threshold is chosen.

3. If the spectral distance of a segment is less than the threshold, the noise flag is set to 1, indicating a noise segment.

4. A noise counter is maintained to keep track of the immediately preceding noise frames.

5. If this counter is greater than some threshold (the hangover), the segment is treated as a silence segment; otherwise it is treated as a speech segment.
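
The five steps above can be sketched roughly as follows. This is a minimal sketch, not the project's implementation. It assumes each frame is already available as a magnitude spectrum, that a noise spectrum estimate exists, and that spectral distance means the average positive log-spectral difference in dB. The threshold values and function names are illustrative assumptions.

```python
import math

def spectral_distance_db(frame_mag, noise_mag, eps=1e-12):
    """Mean positive log-spectral difference (dB) between frame and noise.

    This particular distance measure is an assumption, not from the slides.
    """
    diffs = [
        max(20.0 * (math.log10(f + eps) - math.log10(n + eps)), 0.0)
        for f, n in zip(frame_mag, noise_mag)
    ]
    return sum(diffs) / len(diffs)

def vad(frames, noise_mag, dist_threshold=3.0, hangover=2):
    """Label each frame as 'speech' or 'silence' following steps 1 to 5."""
    labels = []
    noise_counter = 0  # number of consecutive noise frames up to this one
    for frame_mag in frames:
        # Step 3: a distance below the threshold sets the noise flag.
        distance = spectral_distance_db(frame_mag, noise_mag)
        noise_flag = 1 if distance < dist_threshold else 0
        # Step 4: count consecutive noise frames, reset on a non-noise frame.
        noise_counter = noise_counter + 1 if noise_flag else 0
        # Step 5: only a run longer than the hangover counts as silence.
        labels.append("silence" if noise_counter > hangover else "speech")
    return labels

if __name__ == "__main__":
    noise = [1.0] * 4
    quiet, loud = [1.0] * 4, [10.0] * 4
    print(vad([quiet] * 4 + [loud] * 2 + [quiet] * 4, noise))
```

Because the counter starts at zero in this sketch, the first few frames are labelled as speech until the hangover is exceeded. The hangover also keeps short pauses inside a spoken digit from being cut out as silence.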



Speech to Text Conversion
• The sound is sampled, or digitized, by taking precise measurements of the wave at frequent intervals.

• The system filters the digitized sound to remove unwanted noise, and separates it into different bands of frequency.

• Next the signal is divided into small segments as short as a few hundredths of a second.

• The program then matches these segments to known phonemes in the appropriate language.

• It runs the contextual phoneme plot through a complex statistical model and determines what the user was probably saying and outputs it as text.
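
The sampling and segmentation steps can be illustrated with a small sketch. It covers only those two steps. The 8000 Hz sample rate, the 20 ms segment length, the synthetic test tone and the function names are all assumptions made for illustration. Filtering, phoneme matching and the statistical model are not shown.

```python
import math

def sample_tone(freq_hz, duration_s, sample_rate=8000):
    """Digitize a sine wave by measuring it at regular intervals."""
    n = round(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]

def split_into_frames(samples, sample_rate=8000, frame_ms=20):
    """Cut the signal into non-overlapping segments of frame_ms milliseconds.

    A trailing partial segment, if any, is dropped.
    """
    frame_len = sample_rate * frame_ms // 1000
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]

if __name__ == "__main__":
    signal = sample_tone(440, 0.1)          # 0.1 s of a 440 Hz tone
    frames = split_into_frames(signal)      # 20 ms segments
    print(len(signal), len(frames), len(frames[0]))
```

A segment of 20 ms is two hundredths of a second, which matches the "few hundredths of a second" mentioned above.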

Overall Digit Recognition System



Dynamic Time Warping (DTW)

• DTW is an algorithm for measuring similarity between two sequences which may vary in time or speed.

• It allows a computer to find an optimal match between two given sequences (e.g. time series) with certain restrictions.

• The sequences are "warped" non-linearly in the time dimension to determine a measure of their similarity independent of certain non-linear variations in the time dimension.
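
A minimal sketch of the textbook DTW recurrence is given below. It assumes one-dimensional numeric sequences and an absolute-difference local cost. A digit recognizer would more likely compare per-frame feature vectors, which the slides do not specify. The function name and the toy sequences are illustrative.

```python
def dtw_distance(a, b):
    """Return the minimal cumulative alignment cost between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed moves: advance in a, advance in b, or advance in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

if __name__ == "__main__":
    fast = [0, 1, 2, 1, 0]
    slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]   # same shape at half the speed
    other = [2, 2, 0, 0, 2]
    print(dtw_distance(slow, fast), dtw_distance(fast, other))
```

The slow sequence is the fast one with every sample repeated, so the warping absorbs the speed difference and the cost is zero, while the differently shaped sequence yields a positive cost. The restrictions in this sketch are that the alignment starts and ends at the sequence ends and never moves backwards in time.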



Applications

• To test the vulnerability of a website in order to make a more robust Captcha.

Noise Reduction:
• To reduce noise in wireless communication

Speech to Text conversion:
• Security
• Voice Calculator
• To help disabled persons

Future Enhancement
• To build a word recognition system using a Markov model



References:

• Digital Speech Processing - L.R. Rabiner, R.W. Schafer

• What is Fast Fourier Transform - William T. Cochran, James W. Cooley

• Single channel Noise Reduction algorithm for Hands free Operation in Distorted Environments - Stefan Schmitt, Malte Sandrock

• Spectral Subtraction Basics - Steven F. Boll

• MMSE - Ephraim, Malah

• IEEE Papers



THANK YOU!!!
