Python or alternatives: Voice Activity Detection

Since PyAudio / PortAudio is not available, can anyone help me work out how to record continuously from the mic array and recognize when someone starts and stops speaking? The resulting audio should be written to a WAV file.

I could imagine running a subprocess outside of Python, written in JavaScript or C, but I need Python Malos running in parallel to handle the LEDs.

Thank you.

@Sphinx - It seems we are looking for a similar solution, though I’m simply trying to capture audio WITHOUT having to use subprocess (via alsaaudio, PyAudio, or similar). Has no one had success with this yet?


You can use Python’s alsaaudio module. I have run it successfully with a custom version of the Jasper Project. Take a look at their AlsaAudio engine plugin to see how they do it. There is a bug in their code, though: the plugin always uses the ‘default’ device, even when you set a custom one in profile.yaml.
This is the code they use to create the audio stream from the device:

        import alsaaudio

        # Open a blocking capture stream on the chosen ALSA device
        pcm_type = alsaaudio.PCM_CAPTURE
        stream = alsaaudio.PCM(type=pcm_type,
                               mode=alsaaudio.PCM_NORMAL,
                               device='mic_channel8')

You just need to change the "device=" argument to one of the mic_channel* devices, as I have there, instead of ‘default’ as it is in their code.
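For the start/stop detection asked about at the top, here is a minimal sketch of a simple energy-threshold approach using only the standard library. It is not Jasper's method, just one common way to do it. The function names, the threshold of 500, and the 10-frame silence limit are all my own assumptions and would need tuning for a real microphone. It expects an iterable of raw 16-bit little-endian mono frames, wherever they come from.

```python
import math
import struct
import wave


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of 16-bit little-endian mono samples."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, frame[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def segment_speech(frames, threshold=500.0, max_silent_frames=10):
    """Yield one bytes object of PCM data per detected utterance.

    An utterance starts at the first frame whose RMS exceeds `threshold`
    and ends after `max_silent_frames` consecutive quieter frames.
    Both values are illustrative assumptions, not tuned settings.
    """
    voiced = []      # frames collected for the current utterance
    silent_run = 0   # consecutive quiet frames seen inside an utterance
    for frame in frames:
        loud = rms(frame) > threshold
        if not voiced:
            # Still waiting for speech to start.
            if loud:
                voiced.append(frame)
            continue
        voiced.append(frame)
        silent_run = 0 if loud else silent_run + 1
        if silent_run >= max_silent_frames:
            # Speech has stopped: emit the utterance without trailing silence.
            yield b"".join(voiced[:len(voiced) - silent_run])
            voiced = []
            silent_run = 0
    if voiced:
        # Input ended mid-utterance: emit what was collected.
        yield b"".join(voiced[:len(voiced) - silent_run])


def write_wav(path, pcm, rate=16000):
    """Write raw 16-bit mono PCM bytes to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(pcm)
```

Each chunk yielded by segment_speech() can be passed straight to write_wav(), one file per utterance. A fixed threshold is crude; it will misfire in noisy rooms, so treat this as a starting point only.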

Edit: I should also mention that I no longer use alsaaudio. I opted for the subprocess/arecord method instead, as I had better results with it, although I never compared actual audio samples to see if there was a difference.
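For anyone wanting to try the subprocess/arecord route, this is roughly what it can look like. It is a sketch, not my exact code: the helper names are made up for illustration, and the 16 kHz mono S16_LE settings and the mic_channel8 device are assumptions, so check what your device actually supports.

```python
import subprocess


def arecord_command(device="mic_channel8", rate=16000):
    """Build an arecord command that streams raw 16-bit mono PCM to stdout."""
    return ["arecord", "-q", "-D", device, "-t", "raw",
            "-f", "S16_LE", "-r", str(rate), "-c", "1"]


def read_frames(stream, frame_bytes=640):
    """Yield fixed-size frames from a binary stream until it runs dry."""
    while True:
        frame = stream.read(frame_bytes)
        if len(frame) < frame_bytes:
            return  # EOF or a short final read: stop
        yield frame


def capture_frames(device="mic_channel8", rate=16000, frame_ms=20):
    """Run arecord and yield frames of `frame_ms` milliseconds each."""
    frame_bytes = rate * frame_ms // 1000 * 2  # 2 bytes per 16-bit sample
    proc = subprocess.Popen(arecord_command(device, rate),
                            stdout=subprocess.PIPE)
    try:
        yield from read_frames(proc.stdout, frame_bytes)
    finally:
        # Stop the recorder when the consumer finishes or is closed.
        proc.terminate()
        proc.wait()
```

capture_frames() is a generator, so the recording runs only while something is iterating over it, and the arecord process is terminated once the loop ends.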


… audio support for Python is close, guys :)

Thanks for the support.
