Doing on-board word/phrase detection


#1

Was wondering if it would be possible to use the Matrix Creator to build an always-listening device that responds to a wake-up word or phrase. If so, what parts of the Matrix Creator could I take advantage of?


#2

I have been looking into this as well. I am having trouble finding an open-source hardware or software solution for a low-power voice trigger for battery-powered IoT devices/bots/toys.

The Sensory software looks great because it can be loaded onto many ICs and DSPs and has a ton of features, but it's proprietary.

I've tried searching GitHub but can't seem to find anything close to the Sensory software.

I would really like to figure this out also… anyone?

Sensory Software


ICs Using Sensory
http://www.bdti.com/InsideDSP/2013/04/11/Sensory *** good info here ***
http://iotdesign.embedded-computing.com/2783-voice-powered-internet-of-things/
http://ip.cadence.com/news/406/330/Tensilica-Introduces-the-Smallest-Lowest-Power-DSP-IP-Core-For-Always-Listening-Voice-Trigger-and-Voice-Recognition

Quantimetrica
http://www.quantimetrica.com/page_about1.html
http://quantimetrica.com/files/97177664.pdf
http://www.quantimetrica.com/files/brochure.pdf?

DSP Group

BelaSigna R281

Rubidium
http://www.rubidium.com/always-on-voice-trigger/


#3

I found a pretty neat library that uses deep neural networks, so there is some training involved, but it's on GitHub and free to try.

https://snowboy.kitt.ai


#4

Thanks for the link. That’s exactly what I was looking for.


#5

Great work. Let us know what you come up with! - Team MATRIX


#6

@ben @sysctl Not sure if you're looking to stream audio out and get a response back, versus doing localized word/phrase detection. For a cloud-based solution you can also go with the following to handle the "always listening" portion. These respond with intents, which can then be parsed for keywords (see the sketch after the list).

Wit.ai Speech API
Google Speech API
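
For example, here is a rough sketch of posting recorded audio to the Wit.ai Speech API with Python's requests library. The token and file name are placeholders, and you should check the Wit.ai docs for current endpoint parameters; this is only meant to show the general shape of the request/response flow.

import requests

WIT_TOKEN = "YOUR_WIT_AI_SERVER_TOKEN"  # placeholder, use your own app token

# Send a short recorded command; Wit.ai replies with text plus parsed intents/entities.
with open("command.wav", "rb") as audio:
    resp = requests.post(
        "https://api.wit.ai/speech",
        headers={
            "Authorization": "Bearer " + WIT_TOKEN,
            "Content-Type": "audio/wav",
        },
        data=audio,
    )

print(resp.json())  # parse this JSON for the keywords/intents you care about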

For localized keyword detection, it looks like snowboy.kitt.ai has a universal model to start with, so you don't need to train your own. Based on the documentation, the personalized model is geared more toward a user-tailored setup: if the recording device is also the listening device, you'll get better results with a personalized model.

Notes from snowboy.kitt.ai:

Models with suffix pmdl are personal models, and they are supposed to only work well for the person who provides the audio samples. If you are looking for a model that works well for everyone, please use the universal model (with suffix umdl).
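
For reference, a minimal hotword-loop sketch based on the Snowboy Python demo. This assumes you have built the snowboydecoder module for your platform; the model path, sensitivity, and callback are just example values, and it works the same with a .umdl universal model or a .pmdl personal one.

import signal
import snowboydecoder  # built from the snowboy repo for your platform

interrupted = False

def signal_handler(sig, frame):
    global interrupted
    interrupted = True

def on_hotword():
    print("Hotword detected!")  # trigger your streaming/intent logic here

signal.signal(signal.SIGINT, signal_handler)

# Example model path; swap in your own .umdl or .pmdl file.
detector = snowboydecoder.HotwordDetector("resources/snowboy.umdl", sensitivity=0.5)
detector.start(detected_callback=on_hotword,
               interrupt_check=lambda: interrupted,
               sleep_time=0.03)
detector.terminate()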

Hope this helps a bit more. Please let us know how it goes!


#7

Hi,

I have just received my Matrix Creator.
I work on a similar project.
I believed the mics would be recognized by the Pi 3 as an audio recording/input device in alsamixer, so I would be able to use them with any program.

You can have a look at this project, which has already implemented Snowboy: https://github.com/alexylem/jarvis/wiki
It's in French, but the author speaks English and the interface and settings have been built in English. It handles Google, Bing, Snowboy, eSpeak, Pico, etc. It works pretty well but needs good microphone support on the Pi to handle ambient sound (TV, news channels, …), so it can tell when the user has finished saying a command.
For the moment I have just tried the micarray_recorder demo to check the sound quality, which is pretty good.
I compared it with the Alexa demo for the Matrix Creator, so I suppose it's possible to combine the 8 mics into one WAV file instead of 8 separate raw files?
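
For what it's worth, here is a rough sketch of how the separate raw channel dumps could be interleaved into a single multi-channel WAV with Python. It assumes 16 kHz, signed 16-bit little-endian output, and the file names are placeholders for whatever the demo actually writes.

import wave
import numpy as np

RATE = 16000      # assumed sample rate of the micarray_recorder output
CHANNELS = 8

# Placeholder names; replace with the raw files the demo actually produces.
raw_files = ["mic_channel_{}.raw".format(i) for i in range(CHANNELS)]

# Each file is assumed to be mono, signed 16-bit little-endian PCM.
channels = [np.fromfile(f, dtype="<i2") for f in raw_files]
length = min(len(c) for c in channels)

# Interleave sample-by-sample so each WAV frame holds one sample per mic.
interleaved = np.stack([c[:length] for c in channels], axis=1).ravel()

out = wave.open("mics_combined.wav", "wb")
out.setnchannels(CHANNELS)
out.setsampwidth(2)       # 2 bytes = 16-bit samples
out.setframerate(RATE)
out.writeframes(interleaved.tobytes())
out.close()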

@ben and @sysctl, can we combine our efforts to make it work?
An always-listening system to catch the hotword and then take a command for home automation, questions, etc.

Looking forward to playing with the mics through JavaScript instead of C++… for the moment I haven't seen any examples for accessing the mic layer through Python or JS… are there any?

Thanks a lot… I hope I can contribute to the Matrix Creator.


#8

Snowboy is a good place to start for “wake word” detection


#9

I've been trying to think of how Snowboy can be used here. Since Snowboy uses PortAudio, we would most probably require an ALSA driver for the 8 mics. Does anyone have advice on how we can go about doing this? How can we make the SPI interface work with ALSA? Is there a better way?


#10

Fully agree. It should be natively recognized by ALSA. It seems that the ReSpeaker project will be able to handle that.
Hope it will be the same for the Matrix Creator.


#11

Hi @LengZai,

We are testing some options for on-board word/phrase detection. Here is an example using Pocketsphinx. To test it, you can go to https://github.com/matrix-io/matrix-creator-hal/tree/av/pocketsphinx_demo/demos.
Check out the example working here: https://youtu.be/30jzyISQrNE
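
If you just want to experiment with keyword spotting from Python, separately from the HAL demo above, something along these lines should work with the pocketsphinx-python package. It assumes a working default ALSA capture device, and the key phrase and threshold are just example values to tune.

from pocketsphinx import LiveSpeech

# Keyword-spotting mode: listen continuously for a single key phrase.
speech = LiveSpeech(lm=False, keyphrase="hey matrix", kws_threshold=1e-20)

for phrase in speech:
    print("Keyphrase detected:", phrase.segments(detailed=True))
    # Hand off to full recognition / intent parsing here.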

Give us your thoughts about this and share your projects too!

Yoel


#12

I was hoping for something like this as well. Snowboy seems pretty perfect for on-device keyword recognition.

My hope is to make it mainly voice controlled, so ideally I'd use Snowboy to activate infrared sensor input and output and to open other things, like Alexa; hopefully the hands-free version of Alexa will get support for this device soon as well. The Pocketsphinx demo that was just linked also seems to support using the sensors, for example to give a warning if the humidity in the room is high or low.


#13

I have been trying to get the Pocketsphinx demo to work, but for some reason cmake and make fail to compile the executables. Well, only the Pocketsphinx demo executable, but that makes the whole branch change a bit pointless, with everything else being out of date.


#14

@Caldor I know this is an old topic and I'm not sure if you got this working already, but just in case anybody else runs into this error:

sudo apt-get install liblapack-dev liblapack3 libopenblas-base libopenblas-dev

Also make sure that LD_LIBRARY_PATH is set

echo $LD_LIBRARY_PATH

if it’s empty, set it like this:

export LD_LIBRARY_PATH=/usr/local/lib/

This way I got pocketsphinx to start and it actually works. Now I need to figure out how to actually do something with it apart from the demo.


#15

Thanks :slight_smile: Yes it is still a problem. I will try this later.


#16

Hey everyone!

I’m new to the forums.

Just to be upfront, I work for Mycroft AI. I’ve been eyeing Matrix for a while and just haven’t gotten one yet.

Someone asked on our forums whether or not Matrix could work with Mycroft running on the Pi, since we have a Picroft image for download. I don’t see why not.

If anyone gets Mycroft working with the Matrix, let me know. I’m still waiting to get my hands on one.

I DID, however, preorder a Matrix Voice and plan on getting that working with Mycroft, so I'll let you guys know how well it works out of the box.


#17

I'm told that the Matrix also works with ALSA, and therefore with Sonus (offline hot-word detection and streaming speech recognition).

I'm waiting on my Matrix Voice, but if anyone on this thread has gotten the Matrix Creator to work with ALSA, I'd be really interested in what you have to specify as your audio input device so I can add it to the documentation for Sonus :slight_smile:


#18

Hey Evan, I'm not using Sonus, but for my Creator I need to use this (using node-microphone).
FYI: mic_channel8 is the beamforming channel, available via ALSA.

const Microphone = require('node-microphone');

// mic_channel8 is the MATRIX beamformed output exposed through ALSA
const options = {
  device: 'mic_channel8'
};
const mic = new Microphone(options);

// startRecording() returns a readable stream of raw audio (per the node-microphone docs);
// piping it to a file is just an example sink.
const stream = mic.startRecording();
stream.pipe(require('fs').createWriteStream('beamformed.raw'));


#19

No, it doesn't work; you will hit the same Python root cause. Search for "Bad file descriptor" on the forum.


#20

Hey, I'm the CEO/CTO of Brighten.AI, and we have some tech you could use on the Matrix: not just wake word detection, but full speech recognition that you can run locally on your board. Reply here if interested. We also run on Linux / cloud / iOS / Android / Windows / macOS.