
Automation Lab - II

Project File Report - Speech to text conversion
(using Raspberry Pi 2 Model B)

Submitted to:-
Ms. Saloni Bhatia
Submitted by :-
M. Lokesh (131200036)
Mayank Narayan Sharma (131200032)
Vaibhav Garg (131200057)
Vivek Banerjee (131200060)
ECE, IIIrd Year
Table of Contents :-
1. Overview of the project

2. Introduction to Raspberry Pi 2

3. Linux installation on Raspberry Pi 2

4. Network setup and installation (using Ethernet cable and PuTTY - ssh client)

5. Interfacing of webcam as an audio input device (checking and verification)

6. Installing Linux packages

7. Creating a Language Model (Developing database for recognizable words)

8. Running speech recognition locally on Raspberry Pi 2

9. Results
Overview of the Project

Our main aim was to develop a speech-to-text converter, which we achieved using a
Raspberry Pi 2 running Linux (the Raspbian Wheezy distribution). All of the Raspberry Pi
interfacing had to be done from scratch, which gave us a good opportunity to learn about
the board and the features it offers.

We connected the R-Pi to the Internet using an Ethernet cable and logged in to it
from the PC over SSH using the PuTTY client. PuTTY supports the SCP, SSH, Telnet and
rlogin protocols, as well as raw socket connections. To reach the Internet, we had to
configure the default gateway and a static IP address on the R-Pi.

The Linux packages were installed and updated on the R-Pi using basic Linux
commands in the terminal. A webcam was interfaced with the R-Pi, where it serves as the
sound recording (audio input) device.

For speech-to-text conversion, we also had to build a language model covering the
words to be recognized and displayed when the input device picks up our voice. A simple
text file containing these words was uploaded to an online converter, which produced a
'.dic' file (short for dictionary) and a language model file. These files were then
supplied to the speech recognizer running on the R-Pi.
Introduction to Raspberry Pi 2 Model B
The Raspberry Pi is a series of credit card-sized single-board computers developed
in the United Kingdom by the Raspberry Pi Foundation with the intent to promote the
teaching of basic computer science in schools and developing countries. The hardware is the
same across all manufacturers. The firmware is closed-source.

Several generations of the Raspberry Pi have been released. The first generation (Pi 1)
was released in February 2012 as the basic model A and the higher-specification model B. The
A+ and B+ models followed in 2014. The Raspberry Pi 2 model B was released in February 2015
and the Raspberry Pi 3 model B in February 2016.

All models feature a Broadcom system on a chip (SoC), which includes an ARM-compatible
CPU and an on-chip graphics processing unit (GPU, a VideoCore IV). CPU speeds range from
700 MHz to 1.2 GHz for the Pi 3, and on-board memory ranges from 256 MB to 1 GB of RAM.
Secure Digital (SD) cards, in either the SDHC or MicroSDHC size, are used to store the
operating system and program memory. Most boards have between one and four USB ports, HDMI
and composite video output, and a 3.5 mm phone jack for audio. Lower-level output is
provided by a number of GPIO pins which support common protocols like I2C. Some models have
an RJ45 Ethernet port, and the Pi 3 has on-board WiFi 802.11n and Bluetooth.

The Foundation provides Debian and Arch Linux ARM distributions for
download, and promotes Python as the main programming language, with support for BBC
BASIC, C, C++, Java, Perl, Ruby, and Squeak.

The Raspberry Pi hardware has evolved through several versions that feature
variations in memory capacity and peripheral-device support.

This block diagram depicts models A, B, A+, and B+. Models A, A+, and Zero lack
the Ethernet and USB hub components. The Ethernet adapter is connected to an additional
USB port. In models A and A+ the USB port is connected directly to the SoC. On
model B+ and later models the USB/Ethernet chip contains a five-port USB hub, of which
four ports are available, while model B only provides two. On the model Zero, the USB port
is also connected directly to the SoC, but it uses a micro USB (OTG) port.

Peripherals that can be attached to the Raspberry Pi 2 :


The Raspberry Pi may be operated with
1. A generic USB computer keyboard and
2. A generic USB computer mouse.
About Linux installation on Raspberry Pi 2

We installed Raspbian Wheezy (a free Linux distribution) on the Raspberry Pi 2 for our
project.

Features include:

A minimal Raspbian Wheezy installation (similar to a netinstall)
Hard-float binaries: floating-point operations are done in hardware instead of software emulation, which means higher performance
Incremental updates are disabled, which makes apt-get update much faster
Workaround for a kernel bug which hangs the Raspberry Pi under heavy network/disk loads
3.6.11+ hardfp kernel with the latest Raspberry Pi patches
Latest version of the firmware
Fits 1GB SD cards
A very small 118MB image: even on a 2GB SD card there is a lot of free space
ssh starts by default
The clock is automatically updated using ntp
IPv6 support
Just 14MB of RAM usage after boot

First, extract the image with p7zip: 7za x raspbian_wheezy_20130923.img.7z

Then flash the extracted image to a formatted SD card with dd, replacing /dev/sdX with the
device name of the SD card: dd bs=1M if=raspbian_wheezy_20130923.img of=/dev/sdX

Finally, if the SD card is larger than 1GB, grow the partition with gparted (first move the
swap partition to the end).

The root password is raspberry.

You will have to reconfigure your timezone after the first boot: dpkg-reconfigure tzdata

The keyboard layout: dpkg-reconfigure console-data

And the localization: dpkg-reconfigure locales


Network setup and installation
(using an Ethernet cable and an SSH connection via PuTTY)

PuTTY

It is a free and open-source terminal emulator, serial console and network file transfer
application. It supports several network protocols, including SCP, SSH, Telnet, rlogin,
and raw socket connection. It can also connect to a serial port.

PuTTY was originally written for Microsoft Windows, but it has been ported to various
other operating systems.

The network communication layer supports IPv6, and the SSH protocol supports the
delayed compression scheme. It can also be used with local serial port connections.

PuTTY comes bundled with command-line SCP and SFTP clients, called "pscp" and
"psftp" respectively.

If we connect to the Raspberry Pi frequently over SSH or a remote desktop application, WiFi
is one of the slowest and least reliable ways to connect. A direct Ethernet connection is
much faster and far more stable. By connecting to the Pi directly from the laptop or desktop
with an Ethernet cable, we bypass the local network and do not have to share bandwidth with
other computers on it. It also allows us to connect to the Pi when we are outside our home
network.

What we are going to do is assign a static IP address to the Ethernet port of the Pi. This
address depends on the IP address of the Ethernet adapter on the computer we will be
connecting to the Pi from.
Find your Ethernet Adapter's IP Address

First, we need to find the IP address of the Ethernet adapter on the computer we will be
accessing the Pi from.

Access the Network Connections window by right clicking on the Windows icon in the task
bar (Windows 8), or through the Control Panel in earlier versions of Windows. Then right
click on the Ethernet connection and select Properties:

Scroll down the list and select Internet Protocol Version 4 (TCP/IPv4), then click the
Properties button:

If "Use the following IP address" is selected, take note of the IP address. In this case it
is 10.0.0.6.
If "Obtain an IP address automatically" is selected, we will need to find the
autoconfiguration IPv4 address using ipconfig in the Windows Command Prompt:

Find your Autoconfiguration IPv4 Address

Skip this step if you found the IP address of your Ethernet adapter in the step above.

The autoconfiguration IPv4 address won't be displayed unless something is connected to the
Ethernet adapter of the laptop/desktop computer, so plug the Pi into the computer with an
Ethernet cable, and power it up.

Now we need to access the Windows command prompt. Either search for it in the Start
menu, find it in the Control Panel, or right click on the Windows icon in the task bar:

Once you get to the command prompt, enter ipconfig:

Scroll down to see the configuration settings of your Ethernet adapter, which should say
something like Ethernet adapter Ethernet:

Find your Default Gateway IP

The next step is to find the default gateway IP. This is the local IP address of the
network router. Computers on the network use it to communicate with the router and access
the Internet.

Power up and log in to the Raspberry Pi via WiFi or Ethernet, then enter route -ne at the
command prompt to see the network routing information:

Under the Gateway column, we can see the default gateway IP (10.0.0.1) for each
interface (Iface): Ethernet (eth0) and WiFi (wlan0). Write down the default gateway IP.
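Reading the gateway off the routing table can also be done programmatically. The following is a minimal Python sketch, not part of our original setup: the function name default_gateways and the SAMPLE text are illustrative assumptions, with SAMPLE written to resemble typical route -ne output using the 10.0.0.1 gateway from this report.

```python
def default_gateways(route_output):
    """Return {interface: gateway} for the default routes in `route -ne` output."""
    gateways = {}
    for line in route_output.splitlines():
        fields = line.split()
        # Data rows have 8 columns: Destination, Gateway, Genmask, Flags,
        # MSS, Window, irtt, Iface. A default route has destination 0.0.0.0
        # and the G (gateway) flag set.
        if len(fields) == 8 and fields[0] == "0.0.0.0" and "G" in fields[3]:
            gateways[fields[7]] = fields[1]
    return gateways


# Illustrative sample only (assumed layout, not captured from our Pi).
SAMPLE = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.0.0.1        0.0.0.0         UG        0 0          0 eth0
0.0.0.0         10.0.0.1        0.0.0.0         UG        0 0          0 wlan0
10.0.0.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0
"""

print(default_gateways(SAMPLE))
```

On the Pi itself, the text passed to the function would be the captured output of route -ne rather than the sample string.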

Find Your Static Domain Name Servers

Now we need to find out the IP addresses of the domain name servers the Pi uses to
find websites on the internet. We power up the Pi and log in to the command prompt, then
enter cat /etc/resolv.conf:

Copy these IP addresses to a text editor on the PC or write them down for later.
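As a rough illustration of what we are copying out of /etc/resolv.conf, the sketch below collects the nameserver entries and joins them with single spaces, the form needed later for the static domain_name_servers setting. The function name nameservers and the SAMPLE contents are assumptions for illustration; the addresses are the ones used in the configuration shown later in this report.

```python
def nameservers(resolv_conf_text):
    """Return the IP addresses listed on `nameserver` lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Lines look like "nameserver 75.75.75.75"; comments and other
        # directives are ignored.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Illustrative file contents (assumed).
SAMPLE = """\
# Generated by resolvconf
nameserver 75.75.75.75
nameserver 75.75.76.76
nameserver 2001:558:feed::1
nameserver 2001:558:feed::2
"""

print(" ".join(nameservers(SAMPLE)))
```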

Configuring the Static IP

Now we are ready to configure the network settings on the Pi and set up our static IP address.

Enter sudo nano /etc/dhcpcd.conf to edit the dhcpcd.conf file:

Now, add this code to the end of the /etc/dhcpcd.conf file, substituting your own IP addresses.

interface eth0
static ip_address=169.254.81.99
static routers=10.0.0.1
static domain_name_servers=75.75.75.75 75.75.76.76 2001:558:feed::1 2001:558:feed::2

The three settings are filled in as follows:

o static ip_address= This will be the static IP address we use to SSH or remotely
connect to our Pi. We take the IP address of our computer's Ethernet adapter (found
in the steps above), and change the last number to any other number between 1 and 254.
o static routers= This is the default gateway IP we found above.
o static domain_name_servers= These are the IPs we found in the /etc/resolv.conf file
above. Separate each IP with a single space.
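The three settings above can be assembled mechanically. The following Python sketch is only an illustration under stated assumptions: the helper names static_ip_from and dhcpcd_stanza are hypothetical, the example adapter address 10.0.0.6 and the last number 99 are taken from this report, and the sketch restricts the last number to 1-254 as a conservative assumption to stay clear of network and broadcast addresses.

```python
import ipaddress


def static_ip_from(adapter_ip, last_number):
    """Return adapter_ip with its last number replaced by last_number."""
    ipaddress.IPv4Address(adapter_ip)  # raises ValueError if malformed
    if not 1 <= last_number <= 254:
        raise ValueError("last number should be between 1 and 254")
    head = adapter_ip.rsplit(".", 1)[0]
    new_ip = f"{head}.{last_number}"
    if new_ip == adapter_ip:
        raise ValueError("the Pi needs a different address from the computer")
    return new_ip


def dhcpcd_stanza(interface, ip_address, router, dns_servers):
    """Build the lines to append to /etc/dhcpcd.conf for one interface."""
    return "\n".join([
        f"interface {interface}",
        f"static ip_address={ip_address}",
        f"static routers={router}",
        "static domain_name_servers=" + " ".join(dns_servers),
    ])


pi_ip = static_ip_from("10.0.0.6", 99)
print(dhcpcd_stanza("eth0", pi_ip, "10.0.0.1", ["75.75.75.75", "75.75.76.76"]))
```

The printed text is what would be pasted at the end of /etc/dhcpcd.conf; the sketch does not write to the file itself.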

The /etc/dhcpcd.conf file should look like this after we have put in our own IP addresses:

Note: We have also configured a static IP for our WiFi (wlan0) connection in the image
above.

After we have added the code and replaced the IP addresses, press Ctrl-X and then Y to exit
and save the /etc/dhcpcd.conf file.

Now reboot the Pi, and connect an Ethernet cable from the Pi directly to the laptop or desktop.
Open PuTTY (or another SSH client) and log in with the static ip_address we created
above:

Now we test that the Pi is able to access the Internet by pinging Google.
Enter sudo ping www.google.com at the command prompt:

If the connection is successful, we will see that packets have been sent and received. If the
connection is unsuccessful, we will get a "Network is unreachable" error:
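The success and failure cases above can be told apart by looking at the text ping prints. This is a small hedged sketch: the function names and both sample strings are assumptions for illustration, modelled on the summary line that the Linux ping utility usually prints, and are not captured from our setup.

```python
import re


def ping_summary(output):
    """Return (sent, received) from ping's summary line, or None if absent."""
    m = re.search(r"(\d+) packets transmitted, (\d+) (?:packets )?received", output)
    return (int(m.group(1)), int(m.group(2))) if m else None


def connection_ok(output):
    """True if at least one reply came back and no unreachable error appeared."""
    if "Network is unreachable" in output:
        return False
    summary = ping_summary(output)
    return summary is not None and summary[1] > 0


# Illustrative outputs (assumed).
SUCCESS = """\
--- www.google.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
"""
FAILURE = "connect: Network is unreachable\n"

print(connection_ok(SUCCESS), connection_ok(FAILURE))
```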
Interfacing of webcam as an audio input device
cat /proc/asound/cards : lists the sound cards detected by the system

After connecting the webcam, we can change the configuration of the input devices with the
following command :

alsamixer

In alsamixer we can select the input device and control the gain of the
devices.

Then we can start recording with the following command, which records 10 seconds of audio
from card 1, device 0 :

arecord -d 10 -D plughw:1,0 test.wav

Now we can verify the recorded audio by playing it back using the following command :

aplay test.wav
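The card number used in plughw:1,0 comes from the listing in /proc/asound/cards. The sketch below shows one way this lookup could be automated; it is an illustration only. The names list_cards and arecord_command are hypothetical, and SAMPLE is an assumed listing in the usual format of that file, with the webcam shown as card 1.

```python
import re


def list_cards(cards_text):
    """Return [(index, id, description)] parsed from /proc/asound/cards text."""
    cards = []
    for line in cards_text.splitlines():
        # Card lines look like " 1 [Webcam         ]: USB-Audio - USB Webcam";
        # the indented continuation lines do not start with a number.
        m = re.match(r"\s*(\d+)\s+\[(.+?)\s*\]:\s*(.+)", line)
        if m:
            cards.append((int(m.group(1)), m.group(2), m.group(3).strip()))
    return cards


def arecord_command(card, device=0, seconds=10, outfile="test.wav"):
    """Build the arecord command line used above for a given card/device."""
    return ["arecord", "-d", str(seconds), "-D", f"plughw:{card},{device}", outfile]


# Illustrative listing (assumed).
SAMPLE = """\
 0 [ALSA           ]: bcm2835 - bcm2835 ALSA
                      bcm2835 ALSA
 1 [Webcam         ]: USB-Audio - USB Webcam
                      Example USB Webcam
"""

usb_cards = [c for c in list_cards(SAMPLE) if "USB-Audio" in c[2]]
print(" ".join(arecord_command(usb_cards[0][0])))
```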
Installation of Linux Packages

Linux commands

sudo apt-get update

will refresh the list of available packages and their versions

sudo apt-get upgrade

will upgrade the installed packages to their latest available versions


sudo apt-get install bison

Bison is a general-purpose parser generator that converts an annotated
context-free grammar into a deterministic LR or generalized LR (GLR)
parser employing LALR(1) parser tables. As an experimental feature, Bison
can also generate IELR(1) or canonical LR(1) parser tables. Once you are
proficient with Bison, you can use it to develop a wide range of language
parsers, from those used in simple desk calculators to complex
programming languages.

Bison is upward compatible with Yacc: all properly-written Yacc grammars
ought to work with Bison with no change. Anyone familiar with Yacc should
be able to use Bison with little trouble. You need to be fluent in C or C++
programming in order to use Bison. Java is also supported as an
experimental feature.

sudo apt-get install libasound2-dev

Shared library for ALSA applications (development files).

This package contains files required for developing software that makes
use of libasound2, the ALSA library.

ALSA is the Advanced Linux Sound Architecture.

sudo apt-get install swig


SWIG is a software development tool that connects programs written in C
and C++ with a variety of high-level programming languages. SWIG is
used with different types of target languages including common scripting
languages such as Javascript, Perl, PHP, Python, Tcl and Ruby. The list
of supported languages also includes non-scripting languages such as C#,
Common Lisp (CLISP, Allegro CL, CFFI, UFFI), D, Go language, Java
including Android, Lua, Modula-3, OCAML, Octave, Scilab and R. Also
several interpreted and compiled Scheme implementations (Guile,
MzScheme/Racket, Chicken) are supported. SWIG is most commonly used
to create high-level interpreted or compiled programming environments,
user interfaces, and as a tool for testing and prototyping C/C++ software.
SWIG is typically used to parse C/C++ interfaces and generate the 'glue
code' required for the above target languages to call into the C/C++ code.
SWIG can also export its parse tree in the form of XML and Lisp
s-expressions. SWIG is free software and the code that SWIG generates is
compatible with both commercial and non-commercial projects.

sudo apt-get install python-dev

will install the header files and a static library for Python (default version)

sudo apt-get install mplayer

will install the MPlayer media player and related packages

sudo reboot

we have to restart the machine to apply the changes
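After the reboot it can be useful to confirm that the command-line tools are actually available. The following is a minimal sketch and not part of our original procedure: missing_tools is a hypothetical helper built on Python's shutil.which, and it only covers executables (bison, swig, mplayer, arecord, aplay), not the development libraries libasound2-dev and python-dev.

```python
import shutil

# Command-line tools used in this report; libraries are not checked here.
REQUIRED_TOOLS = ["bison", "swig", "mplayer", "arecord", "aplay"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```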


Creating a Language Model

Create a text file containing the list of words/sentences we want to be recognized.

Upload the text file here: http://www.speech.cs.cmu.edu/tools/lmtool-new.html
and then download the generated Pronunciation Dictionary and Language Model.

Download the .dic (Pronunciation Dictionary) and .lm (Language Model) files
onto the Raspberry Pi using the wget command.
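Before running the recognizer, it may help to check that every word in our text file actually received an entry in the downloaded dictionary. The sketch below assumes the usual layout of a .dic file (one word per line followed by its phonemes, with alternate pronunciations marked like WORD(2)); the function names, the sample corpus and the sample dictionary are all illustrative assumptions, not our actual files.

```python
import re


def corpus_words(corpus_text):
    """Return the set of distinct words in the corpus, upper-cased."""
    return {word.upper() for word in re.findall(r"[A-Za-z']+", corpus_text)}


def dictionary_words(dic_text):
    """Return the set of words that have an entry in a .dic file."""
    words = set()
    for line in dic_text.splitlines():
        fields = line.split()
        if fields:
            # Strip the "(2)" style suffix used for alternate pronunciations.
            words.add(re.sub(r"\(\d+\)$", "", fields[0]).upper())
    return words


def missing_from_dictionary(corpus_text, dic_text):
    """Return corpus words that have no pronunciation entry, sorted."""
    return sorted(corpus_words(corpus_text) - dictionary_words(dic_text))


# Illustrative inputs (assumed).
SAMPLE_CORPUS = "hello pi\nturn on the light\n"
SAMPLE_DIC = """\
HELLO\tHH AH L OW
HELLO(2)\tHH EH L OW
ON\tAA N
PI\tP AY
THE\tDH AH
TURN\tT ER N
"""

print(missing_from_dictionary(SAMPLE_CORPUS, SAMPLE_DIC))
```

An empty list would mean that every word in the corpus has at least one pronunciation entry.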

VERIFICATION
Now update library paths for installed libraries and packages

cd ~/
export LD_LIBRARY_PATH=/usr/local/lib
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig

Now run the following command to start the project

pocketsphinx_continuous -hmm /usr/local/share/pocketsphinx/model/en-us/en-us -lm 3199.lm -dict 3199.dic -samprate 16000/8000/48000 -inmic yes

Change the language model and dictionary file names (3199.lm and 3199.dic above) and choose
one of the sample rates (16000, 8000 or 48000) to match your own configuration.
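Since the file names and the sample rate change from one setup to another, the command line can be assembled from parameters. This Python sketch only builds and prints the command; it uses the same flags as the command above (-hmm, -lm, -dict, -samprate, -inmic), while the function name pocketsphinx_command and the default of 16000 are assumptions for illustration.

```python
import shlex

DEFAULT_HMM = "/usr/local/share/pocketsphinx/model/en-us/en-us"


def pocketsphinx_command(lm_file, dict_file, sample_rate=16000, hmm_dir=DEFAULT_HMM):
    """Build the pocketsphinx_continuous command line as a list of arguments."""
    if sample_rate not in (8000, 16000, 48000):
        raise ValueError("sample rate should be 8000, 16000 or 48000")
    return [
        "pocketsphinx_continuous",
        "-hmm", hmm_dir,
        "-lm", lm_file,
        "-dict", dict_file,
        "-samprate", str(sample_rate),
        "-inmic", "yes",
    ]


command = pocketsphinx_command("3199.lm", "3199.dic", sample_rate=16000)
print(shlex.join(command))  # shlex.join requires Python 3.8 or later
```

The resulting list could be passed to subprocess.run on the Pi once the library paths above have been exported.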
Speech Recognition Toolkit
CMU Sphinx a.k.a. PocketSphinx
At the time of writing, PocketSphinx 5 pre-alpha (2015-02-15) is the most recent
version. However, there are a few prerequisites that need to be installed
first.

Building Sphinxbase

cd ~/

go to the home folder

wget http://sourceforge.net/projects/cmusphinx/files/sphinxbase/5prealpha/sphinxbase-5prealpha.tar.gz

download the sphinxbase tarball

tar -zxvf ./sphinxbase-5prealpha.tar.gz

extract the downloaded file

cd ./sphinxbase-5prealpha

change directory to sphinxbase-5prealpha

./configure --enable-fixed
make clean all
make check
sudo make install

will configure, compile, test and install the downloaded package

Building PocketSphinx

cd ~/

go to the home folder

wget http://sourceforge.net/projects/cmusphinx/files/pocketsphinx/5prealpha/pocketsphinx-5prealpha.tar.gz

download the pocketsphinx tarball

tar -zxvf pocketsphinx-5prealpha.tar.gz

extract the downloaded file

cd ./pocketsphinx-5prealpha

change directory to pocketsphinx-5prealpha

./configure
make clean all
make check
sudo make install

will configure, compile, test and install the downloaded libraries
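The two builds follow the same pattern and differ only in the package name and the configure flags. As a hedged summary, the sketch below generates the command sequence for each package; build_steps is a hypothetical helper, the URLs are composed from the download links given above, and the sketch only prints the commands rather than running them.

```python
BASE_URL = "http://sourceforge.net/projects/cmusphinx/files"


def build_steps(package, version="5prealpha", configure_flags=()):
    """Return (working_directory, command) pairs for building one package."""
    name = f"{package}-{version}"
    tarball = f"{name}.tar.gz"
    source_dir = f"~/{name}"
    return [
        ("~", ["wget", f"{BASE_URL}/{package}/{version}/{tarball}"]),
        ("~", ["tar", "-zxvf", tarball]),
        (source_dir, ["./configure", *configure_flags]),
        (source_dir, ["make", "clean", "all"]),
        (source_dir, ["make", "check"]),
        (source_dir, ["sudo", "make", "install"]),
    ]


# Sphinxbase is listed first, matching the order of the sections above.
for package, flags in [("sphinxbase", ["--enable-fixed"]), ("pocketsphinx", [])]:
    for cwd, argv in build_steps(package, configure_flags=flags):
        print(f"(cd {cwd} && {' '.join(argv)})")
```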

Results
Basic Setup of the Webcam and Raspberry Pi

Device starts listening
