Low Energy on Array Microphones

Is there a way to increase the sensitivity of the array microphones? Unless I am literally inches away from the mics, the signal levels I am seeing in captured speech are very low. I can multiply the signal I receive by a scaling factor, but if possible it would be better to capture a signal at higher sensitivity.

I’m working with a modified version of MicrophoneArray with the beam forming removed, so I have direct access to the wishbone bus for communication.

Same problem here! Any solution?

The most important thing I’ve found so far is to make sure that only one program using the HAL is running at a time. Running malos and a program that uses the HAL at the same time introduces enough noise into the audio signals to make them unusable.

I’ve gotten a faint, but usable speech signal from about 5 feet away by multiplying the data coming off the pins by 128. There is noise, but a lot of it is low frequency.
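For anyone scaling in software the same way, here is a minimal sketch of that multiplication, assuming the samples arrive as signed 16-bit values. `ApplyGainSaturated` is a hypothetical helper, not a HAL function. It clamps loud samples to the 16-bit limits instead of letting them wrap around, which would otherwise add harsh distortion on top of the noise.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper (not part of the MATRIX HAL): multiply 16-bit samples
// by an integer gain and clamp the result to the int16_t range.
std::vector<int16_t> ApplyGainSaturated(const std::vector<int16_t>& in,
                                        int32_t gain) {
  assert(gain >= 0);
  const int64_t lo = std::numeric_limits<int16_t>::min();
  const int64_t hi = std::numeric_limits<int16_t>::max();
  std::vector<int16_t> out;
  out.reserve(in.size());
  for (int16_t s : in) {
    // Widen before multiplying so the product cannot overflow.
    int64_t scaled = static_cast<int64_t>(s) * gain;
    scaled = std::max(lo, std::min(hi, scaled));
    out.push_back(static_cast<int16_t>(scaled));
  }
  return out;
}
```

With a gain of 128, any input sample whose magnitude exceeds 255 already hits the clamp, so this only helps while the raw signal stays very quiet.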

I’d still really like to know if this hardware has a gain that can be set. I’d also like to know whether it can sample at rates other than 16 kHz.

Hi @MJMetzger,

I am happy to say that yes, we are working on raising the sample rate to at least 44 kHz. I don’t have a date yet, but we will announce it here in the community. That said, we think 16 kHz is a fair option for voice applications.
Regarding gain, it is already implemented in the software layers. But from what I read, you are using our software at the lowest level (HAL) and want to increase the sensitivity (related to bit depth, I assume). Right now the HAL implementation receives only 16-bit samples from the FPGA, but we are working on a 24-bit solution.

I hope this answers your questions.
Please give us feedback about your audio tests with the Matrix Creator so we can help you.

Yoel

Hi, I have the same problem as MJMetzger. Is there a way to increase the analog gain of the microphones?

Hi @f.daniele,

How are you getting the audio? Are you using a lower layer (HAL or MALOS) like @MJMetzger? If so, you can use the implemented gain.

In HAL:

`void SetGain(int16_t gain) { gain_ = gain; }`

found in: https://github.com/matrix-io/matrix-creator-hal/blob/master/cpp/driver/microphone_array.h#L41

From MALOS:

// setup gain for all microphones
micarray_cfg.set_gain(8)

here: https://github.com/matrix-io/matrix-creator-malos/blob/master/src/js_test/test_micarray.js

Yoel

@yoelrc88,

Thanks for the response. I agree that 16 kHz is going to be sufficient for most voice applications, but having access to higher frequency data would give some additional options in terms of trying to remove some of the noise from the signal. The 24 bit data could also be helpful, depending on how you’re converting from 24 to 16 bit now.
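To illustrate why the conversion matters, here is a rough sketch. It assumes a signed 24-bit sample held in an `int32_t`. How the FPGA actually reduces its data to 16 bits is not stated in this thread, so `Window24To16` and its `shift` parameter are hypothetical. Dropping 8 low bits keeps the top 16 bits and can never clip. Dropping fewer bits acts as extra gain (up to 256x at `shift = 0`) but clips anything outside the 16-bit range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: pick a 16-bit window out of a signed 24-bit sample.
// shift = 8 keeps the most significant 16 bits; smaller shifts keep lower
// bits, which amplifies quiet signals but saturates loud ones.
int16_t Window24To16(int32_t sample24, int shift) {
  assert(shift >= 0 && shift <= 8);
  // Divide rather than right-shift so negative values behave portably
  // (the result truncates toward zero).
  int32_t v = sample24 / (1 << shift);
  v = std::max<int32_t>(-32768, std::min<int32_t>(32767, v));
  return static_cast<int16_t>(v);
}
```

A quiet sample of 1000 comes out as 3 with `shift = 8` but as 1000 with `shift = 0`, which is the kind of difference that would show up as "low energy" in the captured audio.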

I’ve gotten some more experience with mics like these from other sources since I first asked the question. The low energy and the need to amplify the signal are typical. The other solutions I’ve seen have had much less noise, so they could accept more amplification before performance started to break down.

While it is fine to run a SINGLE microphone at 16 kHz to capture speech, if you intend to do any processing of the microphone ARRAY, then a higher rate will yield dramatically improved array functionality.

For that reason, I recommend running the array at a 48 kHz sampling rate.

I’m also assuming that all microphone array processing would be done on the FPGA: The RasPi would only need to handle a single stream filtered to the voice band.

There are other reasons to sample faster than 16 kHz, mostly to detect and reject both noise and echoes. Without external acoustic filtering to isolate the speech bands, the Nyquist-Shannon sampling theorem implies that frequencies above half the sample rate will fold back into the sampled band (aliasing). The higher the sample rate, the easier it will be to remove non-speech sounds (music, noise, etc.).
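As a quick illustration of the folding, the sketch below computes where a pure tone would appear after sampling, assuming no anti-aliasing filter ahead of the sampler. `AliasedFrequency` is a name made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Illustrative helper: apparent frequency of a tone at f_hz after ideal
// sampling at fs_hz with no anti-aliasing filter.
double AliasedFrequency(double f_hz, double fs_hz) {
  assert(fs_hz > 0.0 && f_hz >= 0.0);
  // Position of the tone within one sample-rate period.
  double r = std::fmod(f_hz, fs_hz);
  // Anything above the Nyquist frequency (fs/2) folds back below it.
  return (r <= fs_hz / 2.0) ? r : fs_hz - r;
}
```

A 10 kHz tone sampled at 16 kHz lands at 6 kHz, inside the speech band, whereas at 48 kHz it stays at 10 kHz and can be filtered out digitally.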

The best approach is to sample fast, perform direction finding on the raw data (after optional echo suppression), de-correlate to get isolated beams, then filter to yield the desired voice band, which would then be fed to a local wake-phrase recognizer and/or a cloud or local continuous speech recognizer.

There is no need for the RasPi to deal with anything more than a single 16 kHz sound channel coming out of the MATRIX Creator FPGA (unless the user wants to, of course), but the MATRIX Creator FPGA will internally need a 48 kHz channel from each microphone to do decent microphone array processing.

Another way to look at it is by inspecting the relative physical dimensions of the array and a sound sample. The array is about 10 cm in diameter. At 48 kHz a sound sample is about 0.7 cm long (speed of sound / sample rate). That’s about 1/14 of the diameter, very useful for angular determination and inverse beamforming.

But at 16 kHz a sample would be 3 times longer, about 2.1 cm or roughly 1/5 of the diameter, making beam processing very much sloppier. The extracted beams would contain sound from a much wider cone, leading to more speech recognition failures.
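The arithmetic can be checked with a short sketch, assuming a speed of sound of 343 m/s (dry air at about 20 °C) and the approximate 10 cm array diameter mentioned above. The function names are made up for this example.

```cpp
#include <cassert>

// Distance sound travels during one sample period, in centimeters.
// 343 m/s is an assumed room-temperature speed of sound.
double SampleLengthCm(double sample_rate_hz,
                      double speed_of_sound_m_s = 343.0) {
  assert(sample_rate_hz > 0.0);
  return speed_of_sound_m_s / sample_rate_hz * 100.0;
}

// How many sample lengths fit across an array of the given diameter.
double SamplesAcrossArray(double diameter_cm, double sample_rate_hz) {
  return diameter_cm / SampleLengthCm(sample_rate_hz);
}
```

This gives about 0.71 cm per sample at 48 kHz (about 14 samples across the array) and about 2.14 cm at 16 kHz (under 5 samples across), so the time-delay resolution available for beam steering is three times coarser at the lower rate.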

@yoelrc88,

Thanks for the reply. I am in fact using the HAL, and I set the gain the way you suggested, but that also amplifies the noise. The SetGain method simply multiplies the raw signals by a constant:

from: https://github.com/matrix-io/matrix-creator-hal/blob/master/cpp/driver/microphone_array.cpp#L72

delayed_data_[s * kMicrophoneChannels + c] =
            fifos_[c].PushPop(raw_data_[s * kMicrophoneChannels + c]) * gain_;

So, is there an alternative way to record an acoustic source placed at some distance from the Matrix that exploits the full dynamic range of a 16-bit audio file?