DOA, Slow down voice, voice to text


#1

Hi

I am a native English speaker living in Montreal, struggling to learn French. I can read it and understand what I am reading pretty well, and when it is spoken clearly and slowly I can also understand a fair amount. Since everyone in Montreal speaks very quickly, and since I often need to go to work meetings where people speak French (quickly), I have been thinking about buying a Matrix Voice and programming it to pick out the dominant speaker at any given time (using direction of arrival, DOA), slow the voice down, and send it via Bluetooth to my Android smartphone so I can listen to the slowed-down speech on a headset, and optionally use voice-to-text to display the text on the phone screen. FYI, I have been programming for almost 50 years and do know some C++, so I am guessing that it would be best for me to use the C++ programming capability.

As mentioned above, I would like to use the Matrix Voice. I would like it to be battery powered, and I would prefer that (once programming is done) it not require a Raspberry Pi, because the battery drain of a Raspberry Pi is substantial.

Besides wanting suggestions on how to proceed (assuming this project is feasible, which I would also like feedback on), I have some specific questions:

  • How do I do DOA on the Matrix Voice without being connected to a Raspberry Pi? (I saw a reference to running ODAS on the Raspberry Pi, but I am presuming that reference will not be helpful for running without the Raspberry Pi.)

  • How do I then use the DOA to filter the digital output of the sound picked up by the Matrix Voice down to the direction found by the DOA?

  • How much internal memory does the ESP32 have for storing sound so that I can delay sending out the slowed down sound?

  • Does anyone know of any algorithms for effectively slowing down sound? I am guessing that if I sample at a high enough rate I can either slow down the playback of the sampled sound (which would affect the frequency) or repeat small sets of samples of sound (this seems a lot trickier).

  • What is the best way to send the digitized sound from the Matrix Voice via Bluetooth to my Android phone?

  • How do I get the Matrix Voice to connect (pair) via Bluetooth to my Android phone?

  • How would I then get my Android phone to use the Bluetooth input to feed it to my headset?

  • What is the battery drain of the Matrix Voice when using Bluetooth (which requires the microcontroller version of the Matrix Voice)?

Any help/collaboration on this project would be very much appreciated.

Thanks . . .

Phil


#2

@philtroy,

Wow, this is a cool application idea!

Regarding your questions:

How do I do DOA on the Matrix Voice without being connected to a Raspberry Pi?

  • It can’t be done currently. We don’t provide a way to do it, and it seems pretty difficult to migrate ODAS (our current Pi example) to the ESP32.

How do I then use the DOA to filter the digital output of the sound picked up by the Matrix Voice to the direction found by the DOA?

  • Refer to the ODAS documentation.
  • If you are referring to separating different speakers using the DOA from ODAS, ODAS should provide this functionality. We haven’t used it, though.
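
To make the idea of "filtering to a direction" concrete, here is a minimal sketch of delay-and-sum beamforming, the simplest way to steer a microphone array toward a known angle. It is not what ODAS does internally and not a MATRIX API. The `MicPos` struct, the function names, and the microphone coordinates are all hypothetical, and delays are rounded to whole samples for simplicity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2-D microphone position in metres (NOT the real MATRIX Voice layout).
struct MicPos { double x; double y; };

// Per-microphone delay, in whole samples, that lines up a far-field source
// arriving from azimuthRad. Delays are shifted so the smallest one is zero.
std::vector<int> steeringDelays(const std::vector<MicPos>& mics, double azimuthRad,
                                double sampleRateHz, double speedOfSound = 343.0) {
    std::vector<double> raw;
    double minDelay = 0.0;
    for (std::size_t i = 0; i < mics.size(); ++i) {
        // A mic further along the source direction hears the wavefront earlier,
        // so it needs more delay to line up with the other mics.
        const double proj = mics[i].x * std::cos(azimuthRad) + mics[i].y * std::sin(azimuthRad);
        const double d = proj / speedOfSound * sampleRateHz;
        raw.push_back(d);
        if (i == 0 || d < minDelay) minDelay = d;
    }
    std::vector<int> delays;
    for (double d : raw) delays.push_back(static_cast<int>(std::lround(d - minDelay)));
    return delays;
}

// Delay each channel by its steering delay, then average the channels.
// Sound from the steered direction adds coherently; other directions partly cancel.
// All channels are assumed to have the same length.
std::vector<float> delayAndSum(const std::vector<std::vector<float>>& channels,
                               const std::vector<int>& delays) {
    assert(!channels.empty() && channels.size() == delays.size());
    const std::size_t n = channels[0].size();
    std::vector<float> out(n, 0.0f);
    for (std::size_t c = 0; c < channels.size(); ++c) {
        const std::size_t d = static_cast<std::size_t>(delays[c]);
        for (std::size_t t = d; t < n; ++t) {
            out[t] += channels[c][t - d];
        }
    }
    for (float& s : out) s /= static_cast<float>(channels.size());
    return out;
}
```

In use, the azimuth would come from whatever DOA estimate is available, and `steeringDelays` would be recomputed whenever the dominant speaker changes. Whole-sample delays give coarse steering at low sample rates; fractional delays would be a natural refinement.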

How much internal memory does the ESP32 have for storing sound so that I can delay sending out the slowed down sound?

  • See the datasheet for info on ESP32. Memory section states:
    • 448 KB ROM
    • 520 KB SRAM
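
As a rough feel for what 520 KB means for buffering delayed audio, the arithmetic can be sketched as below. The 16 kHz, 16-bit mono format is an assumption, not something stated above, and the function names are made up for illustration. The result is an upper bound: the application cannot use all of the SRAM, since the stack, the firmware, and the radio stacks need some of it too.

```cpp
#include <cstddef>

// Seconds of mono PCM audio that fit in `bytes` of memory.
constexpr double secondsOfAudio(std::size_t bytes, unsigned sampleRateHz,
                                unsigned bytesPerSample) {
    return static_cast<double>(bytes) /
           (static_cast<double>(sampleRateHz) * bytesPerSample);
}

// When audio is played back at a fraction of real-time speed (0 < playbackSpeed < 1),
// the backlog of unplayed input grows by (1 - playbackSpeed) seconds per second.
// This returns how long input can keep arriving before the buffer is full.
constexpr double secondsUntilFull(double bufferSeconds, double playbackSpeed) {
    return bufferSeconds / (1.0 - playbackSpeed);
}
```

For example, `secondsOfAudio(520 * 1024, 16000, 2)` is about 16.6 seconds, and at 75% playback speed `secondsUntilFull` gives roughly 66 seconds of continuous speech before something has to be dropped. In practice the buffer would need to catch up during pauses between speakers.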

Does anyone know of any algorithms for effectively slowing down sound?

  • Some googling should help; this is a solved problem. Searching for "time-scale modification" or "audio time stretching" is a good place to start.
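
As one starting point, here is a minimal sketch of plain overlap-add (OLA) time stretching, which slows audio down without resampling it and so does not lower the pitch the way slowed playback does. It is the "repeat small sets of samples" idea from the question, with windowing to smooth the joins. The function name and default frame length are assumptions. Plain OLA can sound rough on voiced speech; WSOLA and the phase vocoder are the usual refinements, and nothing here has been tested on an ESP32.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Stretch `in` in time by a factor of 1/speed (speed = 0.5 gives about twice the length).
// Frames are read every synHop * speed samples and written every synHop samples.
std::vector<float> stretchOla(const std::vector<float>& in, double speed,
                              std::size_t frameLen = 512) {
    assert(speed > 0.0 && frameLen >= 4 && in.size() >= frameLen);
    const double kPi = 3.14159265358979323846;
    const std::size_t synHop = frameLen / 2;   // output hop (50% overlap)
    const double anaHop = synHop * speed;      // input hop, smaller when slowing down
    const std::size_t lastStart = in.size() - frameLen;
    const std::size_t frames = static_cast<std::size_t>(lastStart / anaHop) + 1;
    const std::size_t outLen = (frames - 1) * synHop + frameLen;

    std::vector<float> out(outLen, 0.0f);
    std::vector<float> norm(outLen, 0.0f);     // summed window weights per output sample
    for (std::size_t f = 0; f < frames; ++f) {
        const std::size_t src = std::min(static_cast<std::size_t>(f * anaHop), lastStart);
        const std::size_t dst = f * synHop;
        for (std::size_t i = 0; i < frameLen; ++i) {
            // Hann window sampled at half-integer points, so it is never exactly zero.
            const float w = static_cast<float>(
                0.5 - 0.5 * std::cos(2.0 * kPi * (i + 0.5) / frameLen));
            out[dst + i] += w * in[src + i];
            norm[dst + i] += w;
        }
    }
    // Divide by the summed window weights so the overall level is preserved.
    for (std::size_t i = 0; i < outLen; ++i) {
        if (norm[i] > 1e-6f) out[i] /= norm[i];
    }
    return out;
}
```

At 16 kHz, a frame length of 512 samples is 32 ms, which is in the range commonly used for speech. A streaming version would process one frame at a time from a ring buffer rather than a whole vector.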

What is the best way to send the digitized sound from the Matrix Voice via Bluetooth to my Android phone?

How would I then get my Android phone to use the Bluetooth input to feed it to my headset?

What is the battery drain of the Matrix Voice when using Bluetooth (which requires the microcontroller version of the Matrix Voice)?

  • We haven’t measured this yet; it is a very specific question about ESP32 performance. You can probably find some information about the Bluetooth radio's energy consumption in the datasheet.

Hope this helps you get started!

Best,
MATRIX Team