Speech recognition on Raspberry Pi with Sphinx, Racket and Arduino

In this post I put together a number of things to control two LEDs from a Raspberry Pi with voice recognition (via Sphinx), Firmata and Arduino. Before you start, you may want to have a look at this other post on how to connect a Raspberry Pi and an Arduino board using Firmata and Racket: http://jura.mdx.ac.uk/mdxracket/.

First of all, we need to install PocketSphinx on the Raspberry Pi to do speech recognition. I am using a standard USB camera with a microphone (supported by Raspberry Pi) and I’m following the instructions available here: https://sites.google.com/site/observing/Home/speech-recognition-with-the-raspberry-pi. In essence, this is what I’ve done (as root on the Raspberry Pi); please see the link above for additional details:

apt-get install rpi-update
apt-get install git-core
rpi-update

-> Connect your USB microphone (or camera+mic) and
-> reboot the RPi at this point

vi /etc/modprobe.d/alsa-base.conf 

# change as follows:
# Comment this line
# options snd-usb-audio index=-2
# and add the following:
options snd-usb-audio index=0

-> close the file and reload alsa:

alsa force-reload

wget http://sourceforge.net/projects/cmusphinx/files/sphinxbase/\
0.8/sphinxbase-0.8.tar.gz/download
mv download sphinxbase-0.8.tar.gz
wget http://sourceforge.net/projects/cmusphinx/files/\
pocketsphinx/0.8/pocketsphinx-0.8.tar.gz/download
mv download pocketsphinx-0.8.tar.gz
tar -xzvf sphinxbase-0.8.tar.gz
tar -xzvf pocketsphinx-0.8.tar.gz

apt-get install bison
apt-get install libasound2-dev

cd sphinxbase-0.8
./configure --enable-fixed
make
make install

cd ../pocketsphinx-0.8/
./configure
make
sudo make install

Et voilà, you are now ready to test your PocketSphinx installation. Go to pocketsphinx-0.8/src/programs and run:

./pocketsphinx_continuous

If you are lucky, you should get some text back… I don’t have the space (or the time) here to go into the details of PocketSphinx. Try some simple words and see if they are recognised. I ended up building a very simple language model using the online tool at this link: http://www.speech.cs.cmu.edu/tools/lmtool-new.html. I use just “green”, “red” and “off”, and they seem to work fine in spite of my Italian accent.

In the next step we need to connect the output of PocketSphinx with Racket. I do this in a very primitive way: I modify the source code of pocketsphinx_continuous to output just the word that is recognised. This is very simple: just modify continuous.c under pocketsphinx-0.8/src/programs, comment out all the printf statements and print only the recognised word (drop me an email if you don’t know how to do this). I prepend a “RACKET: ” string to each printed word to make sure that the line is something I have generated. Since standard output is block-buffered when it is redirected to a file, flush it after each word so that the word reaches the file immediately. You can then run pocketsphinx and redirect the output to a file with something like:

./pocketsphinx_continuous -lm /home/pi/sphinx/simple/4867.lm \
   -dict /home/pi/sphinx/simple/4867.dic > /tmp/capture.txt

(notice that I’m using the language model + dictionary generated on-line)
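To make the “RACKET: ” convention concrete, here is a minimal shell sketch of what the reader of the file has to do: keep only the lines that start with the prefix and drop its 8 characters. The function name extract_words and the sample lines are assumptions made for illustration; the real decoder output will look different.

```shell
#!/bin/sh
# Keep only the lines we generated and drop the 8-character "RACKET: " prefix.
# extract_words is a hypothetical helper, not part of PocketSphinx.
extract_words() {
    grep '^RACKET: ' | cut -c9-
}

# Simulated decoder output: one unrelated log line mixed with our own lines.
printf 'INFO: ready\nRACKET: green\nRACKET: off\n' | extract_words
```

Running the same filter over /tmp/capture.txt is a quick way to check what the decoder has written so far.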

Time now to go back to Racket. I assume you know how to connect a Raspberry Pi to an Arduino board and talk to it in Racket using Firmata (if this is not the case, please have a look at the instructions available at http://jura.mdx.ac.uk/mdxracket/index.php/Raspberry_Pi,_Arduino_and_Firmata). In the following Racket file I simply read the file /tmp/capture.txt and send commands to the board according to the words read. If a command is not recognised, I print a message on screen. The code for this is the following:

#lang racket
(require "firmata.rkt")
 
(define green 12)
(define red 13)
 
(define in (open-input-file "/tmp/capture.txt"))
 
;; Act on one line of decoder output.
(define (process-input line)
  (printf "processing input ~a\n" line)
  ;; Ignore anything that does not carry the "RACKET: " prefix (8 characters);
  ;; calling substring on a shorter line would raise an exception.
  (when (string-prefix? line "RACKET: ")
    (let ([word (string-upcase (string-trim (substring line 8)))])
      (cond [(string=? word "RED")
             (printf "I'm setting red\n")
             (set-arduino-pin! red)]
            [(string=? word "GREEN")
             (printf "I'm setting green\n")
             (set-arduino-pin! green)]
            [(string=? word "OFF")
             (printf "I'm clearing the PINs\n")
             (clear-arduino-pin! red)
             (clear-arduino-pin! green)]
            [else
             (printf "Sorry I cannot understand: ~a\n" word)]))
    (flush-output)))
 
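;; Poll the capture file forever. At end-of-file, wait briefly for new
;; output instead of spinning at full CPU.
(define (read-loop)
  (define str (read-line in))
  (if (eof-object? str)
      (sleep 0.1)
      (process-input str))
  (read-loop))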
 
(define (start-everything)
  (open-firmata "/dev/ttyACM0")
  (set-pin-mode! green OUTPUT_MODE)
  (set-pin-mode! red OUTPUT_MODE)
  (read-loop)
  )
 
(start-everything)
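If you want to try the Racket side without a microphone, you can append lines to the capture file by hand while the program is polling it. The sketch below assumes the line format described above; CAPTURE_FILE and send_word are hypothetical names introduced only for this example.

```shell
#!/bin/sh
# Sketch: feed test words to the Racket program without a microphone.
# CAPTURE_FILE and send_word are hypothetical names used for illustration.
CAPTURE_FILE="${CAPTURE_FILE:-/tmp/capture.txt}"

send_word() {
    # Same line format that the modified pocketsphinx_continuous prints.
    printf 'RACKET: %s\n' "$1" >> "$CAPTURE_FILE"
}

send_word red
send_word off
```

Each call appends one line, so the red LED should switch on and then both LEDs should switch off, provided the board is wired as described.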

Job done. Now:

  • Check that your Arduino board is connected and wired up appropriately
  • Check that the modified version of pocketsphinx_continuous is running and redirecting its output to /tmp/capture.txt
  • Launch the file above with something like racket sphinx-arduino.rkt

and you should get something like this:

http://youtu.be/XGYNRHWY4Ag

Future work:

  • I don’t think there is a need to write to a file… maybe pocketsphinx can redirect to a port and Racket can listen to it?
  • Improve the language model for a domain of your choice
  • Add a couple of speakers to the Raspberry Pi so that Racket can tell you what it is doing, if something has not been recognised, etc.
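
As a first step towards dropping the intermediate file, one option is a named pipe (FIFO). This is only a sketch and I have not tried it with the setup above: fake_decoder is a hypothetical stand-in for the modified pocketsphinx_continuous, and cat stands in for the Racket program, which should in principle be able to open the same path with open-input-file.

```shell
#!/bin/sh
# Sketch: pass recognised words through a named pipe instead of a regular file.
# fake_decoder is a hypothetical stand-in for the modified pocketsphinx_continuous.
fake_decoder() {
    printf 'RACKET: red\nRACKET: off\n'
}

relay_through_fifo() {
    dir=$(mktemp -d) || return 1
    fifo="$dir/capture.fifo"
    mkfifo "$fifo" || return 1
    fake_decoder > "$fifo" &   # the writer blocks until a reader opens the pipe
    cat "$fifo"                # the reader; stands in for the Racket program
    wait                       # let the background writer finish
    rm -rf "$dir"
}

relay_through_fifo
```

With a pipe nothing accumulates on disk, and the reader sees a real end-of-file once the writer exits, so the reading loop would need to stop at that point rather than keep polling.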