How to record with PyAudio

Hi there,

After overcoming some troubles I’ve finally got the mic tests working and arecord works fine (although there’s background noise). However I couldn’t figure out how to get PyAudio to work properly.

My record devices:
pi@raspberrypi:~ $ python3 -m sounddevice
0 bcm2835 ALSA: - (hw:0,0), ALSA (0 in, 2 out)
1 bcm2835 ALSA: IEC958/HDMI (hw:0,1), ALSA (0 in, 2 out)
2 Dummy: PCM (hw:1,0), ALSA (2 in, 2 out)
3 sysdefault, ALSA (128 in, 128 out)
4 pulse, ALSA (32 in, 32 out)
5 sc, ALSA (2 in, 2 out)
6 mic_channel0, ALSA (2 in, 2 out)
7 mic_channel1, ALSA (2 in, 2 out)
8 mic_channel2, ALSA (2 in, 2 out)
9 mic_channel3, ALSA (2 in, 2 out)
10 mic_channel4, ALSA (2 in, 2 out)
11 mic_channel5, ALSA (2 in, 2 out)
12 mic_channel6, ALSA (2 in, 2 out)
13 mic_channel7, ALSA (2 in, 2 out)
14 mic_channel8, ALSA (2 in, 2 out)
15 dmix, ALSA (0 in, 2 out)
* 16 default, ALSA (32 in, 32 out)

My python code:

import pyaudio
CHUNK = 1024
p = pyaudio.PyAudio()
stream = p.open(format = pyaudio.paInt16,
            channels = 1,
            rate = 16000,
            input = True,
            input_device_index = 6,
            frames_per_buffer=CHUNK)
while True:
    data = stream.read(CHUNK)
    print(len(data))

It always crashes after reading a few frames. Output’s like:

pi@raspberrypi:~ $ python3 pyaudiotest.py
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.front
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround40
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround41
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround50
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround51
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround71
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.iec958
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
2048
2048
ALSA lib pcm_file.c:358:(snd_pcm_file_write_bytes) write failed: Bad file descriptor
ALSA lib pcm_file.c:358:(snd_pcm_file_write_bytes) write failed: Bad file descriptor
2048
ALSA lib pcm_file.c:358:(snd_pcm_file_write_bytes) write failed: Bad file descriptor
ALSA lib pcm_file.c:358:(snd_pcm_file_write_bytes) write failed: Bad file descriptor
python3: pcm_file.c:397: snd_pcm_file_add_frames: Assertion `file->wbuf_used_bytes < file->wbuf_size_bytes’ failed.
Aborted

Any help would be greatly appreciated. Thanks!

Still stuck ;-). Not sure if this is an ALSA or MC issue, but are the MC devs going to provide a working Python sample that uses the mic array?

Same here. I have been struggling for a couple of days now and can’t get PyAudio working properly with the Matrix Creator. Would love some Python samples and docs.

I’ve finally dropped the dependency on PyAudio (and thus PortAudio, I guess) and switched to spawning an arecord process and consuming frames from its piped stdout. It’s awkward, but it works. I also got the snowboy hotword detector working the same way.
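
A minimal sketch of that approach, assuming arecord is on the PATH and the default capture device already points at the mic array. The names read_frames and record are illustrative and not taken from any code posted in this thread.

```python
import subprocess

RATE = 16000          # samples per second, matching the PyAudio attempt above
CHUNK = 1024          # frames per read
BYTES_PER_FRAME = 2   # S16_LE mono: one 16-bit sample per frame


def read_frames(pipe, chunk_bytes):
    """Yield blocks of up to chunk_bytes from a binary file-like object until EOF."""
    while True:
        data = pipe.read(chunk_bytes)
        if not data:
            break
        yield data


def record():
    """Spawn arecord and print the size of each block read from its stdout."""
    # -t raw writes bare PCM without a WAV header; -q suppresses status messages.
    cmd = ["arecord", "-q", "-t", "raw", "-f", "S16_LE", "-c", "1", "-r", str(RATE)]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        for data in read_frames(proc.stdout, CHUNK * BYTES_PER_FRAME):
            print(len(data))
    finally:
        # Stop the recorder when the loop ends or is interrupted (e.g. Ctrl+C).
        proc.terminate()
        proc.wait()
```

Calling record() should print 2048 repeatedly, like the first lines of the PyAudio output above, until it is interrupted. Instead of printing, each block can be passed to whatever consumes the audio.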

Can you please post your code for snowboy hotword detection?
I am working with snowboy hotword detection and the Python speech_recognition package. Both rely on PyAudio, and I had trouble configuring them.
I would like to see how I could get these two working with the arecord process and then maybe stream sound from the mic to google cloud speech.
Thanks

I promised snowboy’s owner that I’d commit a Python sample there. I hope I’ll get time to do it today. I’ll share a link here then ;-).

That sounds great! Looking forward to it :)

Hi, I’ve sent the PR for the arecord based python example. You can take a look:
https://github.com/Kitt-AI/snowboy/pull/120


Thanks! Will definitely give it a try.

Hi @duoduo999 thanks so much for the github code.
I tested your pull request on a Matrix Creator.

I renamed your file to snowboydecoder.py so I could test using demo.py and demo2.py in the same directory.

Upon running the demos I get:

Listening... Press Ctrl+C to exit
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/pi/snowboy/examples/Python/snowboydecoder.py", line 115, in record_proc
    wav = wave.open(process.stdout, 'rb')
  File "/usr/lib/python2.7/wave.py", line 509, in open
    return Wave_read(f)
  File "/usr/lib/python2.7/wave.py", line 164, in __init__
    self.initfp(f)
  File "/usr/lib/python2.7/wave.py", line 129, in initfp
    self._file = Chunk(file, bigendian = 0)
  File "/usr/lib/python2.7/chunk.py", line 63, in __init__
    raise EOFError
EOFError

I am sure I am doing something wrong. How did you set up and test?

To make sure it wasn’t something with my base system, I verified that I am able to record audio successfully from the command line with arecord --device=mic_channel1 -q -r 16000 -f S16_LE test.wav.

I notice that on line 113 of your PR the “arecord” command has no “--device” specified. Should there be one? My command-line tests don’t run without it. Or are you using a config file or patch that somehow uses the Matrix microphone array? Using the whole array would be ideal.

I didn’t want to comment directly in github without these clarifications first, since some of these issues may be matrix.one specific, and in order to not slow down the PR if so.

Thanks and please advise,
Marc

Hi Marc,

I didn’t include the “--device” parameter in the PR because I didn’t want it to be tied to the Matrix Creator. In my local environment I’ve set mic_channel8 as the default recording device, so I didn’t need to pass it in. You can try adding that switch in the Python code and see whether that resolves the problem.

Btw, I’ve only tested with python3, although I think python2 should work too. Let me know if you still run into problems.
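
A hedged sketch of adding that switch. The helper names build_arecord_cmd and start_arecord are hypothetical, and this is not the code from the PR. If arecord cannot open the capture device it exits right away and writes nothing to stdout, which is one plausible cause of the EOFError raised by wave.open above.

```python
import subprocess


def build_arecord_cmd(device=None, rate=16000, channels=1):
    """Build an arecord command line; device=None keeps the ALSA default."""
    cmd = ["arecord", "-q", "-t", "wav", "-f", "S16_LE",
           "-c", str(channels), "-r", str(rate)]
    if device:
        # Same switch as in the command-line test, e.g. --device=mic_channel1
        cmd.append("--device=" + device)
    return cmd


def start_arecord(device=None):
    """Start arecord with stdout piped so Python can consume the audio."""
    return subprocess.Popen(build_arecord_cmd(device), stdout=subprocess.PIPE)
```

With this, start_arecord("mic_channel8") records from that channel explicitly, while start_arecord() relies on whatever default device ALSA is configured with.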

Hi everyone,

Just wanted to let you know that PortAudio support is on our roadmap.

Thanks!
Sean

Thanks! Can’t wait to see it.

Has there been any update on this?

Is this roadmap public somehow? I am struggling too with my initial setup using PyAudio.

I declared mic8 to be the default one for recording, and arecord works well.

How would I go about replacing the well-known PyAudio code snippets with your recording method? Thank you.

@duoduo999: Hello,

Can you please share the PortAudio / PyAudio replacement code you mentioned above? I am not interested in the snowboy part yet.

Thank you

Okay, seems to be part of your Snowboy pull request here https://github.com/Kitt-AI/snowboy/pull/120/commits/9549d8412043973ccf61ecc2687ef9c38716faf0

Is this code suitable for detecting complete sentences and the corresponding silence, or does it focus on recording single words?

It only changed the recording approach but didn’t modify how snowboy works… I guess snowboy just listens on a continuous stream of frames and does its detection.

I hope Matrix Labs will support PyAudio soon, as it is the quasi-standard in Python.

We have no feedback on when this might happen, so I am currently searching the internet for software (and hardware) alternatives. Maybe gstreamer can do the job; I don’t know. Maybe we can use the Node.js micarray code from within Python via subprocess and threading, as you did.
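
A rough sketch of the subprocess-plus-threading idea, under the assumption that the external recorder (arecord, or a Node.js script) writes raw audio bytes to its stdout. The names pump and start_reader are illustrative, and how to launch the Node.js micarray code is not covered here.

```python
import queue
import threading


def pump(pipe, out_queue, chunk_bytes=2048):
    """Copy blocks from pipe into out_queue, then put None to signal EOF."""
    while True:
        data = pipe.read(chunk_bytes)
        if not data:
            break
        out_queue.put(data)
    out_queue.put(None)


def start_reader(pipe, chunk_bytes=2048):
    """Run pump in a background thread; return the queue and the thread."""
    q = queue.Queue()
    t = threading.Thread(target=pump, args=(pipe, q, chunk_bytes), daemon=True)
    t.start()
    return q, t
```

Here pipe would be the stdout of a subprocess.Popen object created with stdout=subprocess.PIPE. The main thread can then call q.get() to receive blocks and stop when it gets None, so recording continues while earlier audio is being processed.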
