Rhasspy voice assistant

Hey,

I wonder if anybody has managed to set up a Matrix Voice (on a RPi) together with Rhasspy. Maybe you could share your setup, in particular:

Have you got some tips for beginners, some further resources or tutorials, that you have used?

Thank you :smiley:

I have not, but please do share any experience, I’m currently working on this as well.

What I’m thinking so far is:

  • Wake word: Porcupine
    • Might change though; I can’t tell from their website whether you can only use the optimizer 5 times in total, or whether you get 5 different wake words that you can re-train as many times as you want.
  • Command listener: webrtcvad
  • Speech-to-text: Kaldi
    • Might move this to the server eventually.
  • Intent recognition: fsticuffs
    • Speed, baby!
  • Text-to-speech: PicoTTS
    • Would have used MaryTTS, but couldn’t find a docker image that works on armhf (RPi)

I’ll let you know if I get to something usable. Please do the same, since I’m also pretty much just poking stuff with a stick and seeing what happens at this point.
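For reference, the stack above could be written down as a profile fragment. This is only a sketch: the key names (`wake`, `command`, `speech_to_text`, `intent`, `text_to_speech`) and the lowercase system names are my assumption of how Rhasspy's profile JSON is laid out, so please check them against the official docs before copying anything.

```python
import json

# Assumed layout of a Rhasspy profile. The key and system names below are
# assumptions and should be verified against the Rhasspy documentation.
profile = {
    "wake": {"system": "porcupine"},
    "command": {"system": "webrtcvad"},
    "speech_to_text": {"system": "kaldi"},
    "intent": {"system": "fsticuffs"},
    "text_to_speech": {"system": "picotts"},
}


def profile_json(settings):
    """Serialize the settings as pretty-printed JSON text."""
    return json.dumps(settings, indent=2)


print(profile_json(profile))
```

Keeping it as a plain dict makes it easy to swap one component (for example the speech-to-text system) without touching the rest.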

At the moment I’m struggling to get my mic and especially my speakers set up. The strange thing is that if I select the Matrix Voice device in the config and press save, it restarts. After the restart, however, the default device is selected (and I don’t know which one that is).
An even bigger problem is that my external sound card isn’t even listed in “Audio”. :frowning:
Have you got any experience with that?

From what I can see, the microphone will always say Default Device after you save, but the changes should still be made. For the mics, I selected:

Use arecord directly (ALSA)

  • Input Device: hw:CARD=MATRIXIOSOUND,DEV=0: Direct hardware device without any conversions
  • Click test

Speaker worked for me when I plugged it into the audio jack.
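If you want to check the microphone outside of Rhasspy, a short test recording with arecord can help. Below is a minimal sketch, assuming a 16 kHz, 16-bit, mono recording; those values and the output file name are my assumptions, and since this is a direct hardware device without any conversions, it may reject parameters it does not support natively.

```python
import subprocess

DEVICE = "hw:CARD=MATRIXIOSOUND,DEV=0"  # device name from the list above


def arecord_command(out_path, seconds=3, device=DEVICE, rate=16000):
    """Build an arecord command line for a short mono 16-bit test recording."""
    return [
        "arecord", "-q",      # quiet: no progress output
        "-D", device,         # ALSA capture device
        "-r", str(rate),      # sample rate in Hz (assumed)
        "-f", "S16_LE",       # 16-bit little-endian samples (assumed)
        "-c", "1",            # one channel (assumed)
        "-d", str(seconds),   # stop after this many seconds
        out_path,
    ]


def record_test(out_path="test.wav"):
    """Run the recording; raises CalledProcessError if arecord fails."""
    subprocess.run(arecord_command(out_path), check=True)
```

Calling `record_test()` on the Pi should leave a `test.wav` that you can play back with `aplay test.wav` to confirm both microphone and speaker.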

Yes, that seems to work. Microphone and speaker (mini-jack) work, and Rhasspy sends MQTT to Home Assistant.

Now though, the RPi3B+ becomes unresponsive after a short while. The web interface works OK for a while, but testing the voice commands sort of freezes. And after a short while, even the SSH connection breaks.

@Carlos, do you have it (every component) running on a Raspberry Pi? And is it functional? If so, could you share which components you are using?

Hey @Aephir,
I haven’t tested much but my assistant was still running after a few hours of being on. Here’s a bullet list of what I have:

Maybe that’s it, I’m trying with an older Pi3B+. I have a 4GB RPi4 lying around; if offloading the processing doesn’t work, I might try that. Thanks!

@Mondmonarch I haven’t really used any tutorials, mainly used the official docs. Are you still having issues? If so, I can try to find what I’ve used?

@Aephir at the moment I’m short on time, but with the Christmas vacation coming up, I’ll try to get it working :slight_smile:
But I’m also using a Pi3B+, is that an issue?

No clue. I just know that my RPi3B+ seems to freeze, but I can’t say for certain whether this is because Matrix/Rhasspy is too much or because of something else. I’ll do some testing once I have a bit more time.

But if it is a limitation of the Pi3, there is the option to send audio for processing somewhere else (remote HTTP server) for the heavy lifting.
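As a rough idea of what that offloading looks like, here is a sketch that posts a recorded WAV file to another machine running Rhasspy and reads back the transcription. The hostname `rhasspy-server.local` is made up, and the port 12101 and the `/api/speech-to-text` path are how I understand Rhasspy's HTTP API, so treat them as assumptions to verify.

```python
import urllib.request

# Hypothetical server address; replace it with the machine doing the heavy lifting.
DEFAULT_BASE_URL = "http://rhasspy-server.local:12101"


def build_stt_request(wav_bytes, base_url=DEFAULT_BASE_URL):
    """Create (but do not send) a POST request carrying WAV audio."""
    url = base_url.rstrip("/") + "/api/speech-to-text"  # assumed endpoint path
    return urllib.request.Request(
        url,
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )


def transcribe(wav_path, base_url=DEFAULT_BASE_URL):
    """Send a WAV file to the remote server and return the response text."""
    with open(wav_path, "rb") as wav_file:
        request = build_stt_request(wav_file.read(), base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

With something like this, the Pi only records audio while the server does the decoding.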

Tried Rhasspy on Stretch on a Pi 3B+ and was having issues getting it to work. On the same Pi 3B+ with Buster though, the assistant worked with the ALSA microphone of MATRIXIO-SOUND card 2, device 0 selected in “Settings”.

We’re still not sure about the reliability but we’re working on a base guide for Rhasspy to be released soon which will hopefully help people get set up with it.

Best,
Samreen

This would be great. I think as a consequence of the snips.ai shutdown, lots of people are thinking about moving to Rhasspy.
Thank you very much in advance :smiley:

Hello!

Here is the basic set up guide for Rhasspy. We need to test more and play around with it but it should hopefully help you get it up and running!

Best,
Team MATRIX

Thanks, @Samreen, I’ll give it a try. Still no (full) support for Buster then, I suppose? Any idea of the timeline? (I know you guys are working on it, and I’m not trying to be pushy, just trying to figure out how much effort I should put into Stretch at this point :slight_smile: )

And btw, you might want to link to this site (or this one for lite version) in the documentation, since your current link directs to a Buster download, while telling you to download Stretch.

@Aephir,

Thank you for pointing that out! We’ve updated the docs to say latest Raspbian image.

We do support Buster fully! The only things we are aware of that have issues on Buster are the MATRIX HAL microphone examples, which we are working on. The MATRIX kernel modules and all other functionality work fully on Buster.

For Rhasspy, Raspbian Buster seems to work the best so that is the OS I would recommend for the guide.

Best,
Samreen

I’m having trouble getting my sentences detected. Currently I’m using a German assistant.
I’ve recorded the command for turning on the light and this was the outcome:

speech


This is how I defined the sentence:


sentences

[Schalte das ] Licht an
[Set the] light on (roughly translated)

I have also added the words:

words

And it seems like Pocketsphinx is understanding nonsense:

[DEBUG:91671068] PocketsphinxDecoder: aus das das an

log

Full log of the failed recordings:

[INFO:91671083] quart.serving: 192.168.178.60:59508 POST /api/stop-recording 1.1 200 112 919000
[DEBUG:91671076] __main__: {"text": "", "intent": {"name": "", "confidence": 0}, "entities": [], "speech_confidence": 0.01237707784759329, "slots": {}}
[ERROR:91671073] FsticuffsRecognizer: in_loaded
Traceback (most recent call last):
  File "/usr/share/rhasspy/rhasspy/intent.py", line 208, in in_loaded
    assert recognitions, "No intent recognized"
AssertionError: No intent recognized
[DEBUG:91671070] __main__: aus das an an
[DEBUG:91671068] PocketsphinxDecoder: aus das an an
[DEBUG:91671067] PocketsphinxDecoder: Transcription confidence: 0.01237707784759329
[DEBUG:91671064] PocketsphinxDecoder: Decoded WAV in 0.8818035125732422 second(s)
[DEBUG:91670179] PocketsphinxDecoder: rate=16000, width=2, channels=1.
[DEBUG:91670174] __main__: Recorded 87404 byte(s) of audio data
[INFO:91667482] quart.serving: 192.168.178.60:59508 POST /api/start-recording 1.1 200 2 11119
[INFO:91666965] quart.serving: 192.168.178.60:59508 POST /api/stop-recording 1.1 200 94 47835
[DEBUG:91666954] __main__: {"text": "", "intent": {"name": "", "confidence": 0}, "entities": [], "speech_confidence": 0, "slots": {}}
[ERROR:91666946] FsticuffsRecognizer: in_loaded
Traceback (most recent call last):
  File "/usr/share/rhasspy/rhasspy/intent.py", line 208, in in_loaded
    assert recognitions, "No intent recognized"
AssertionError: No intent recognized
[DEBUG:91666937] __main__: 
[DEBUG:91666932] PocketsphinxDecoder: Decoded WAV in 0.0035889148712158203 second(s)
[DEBUG:91666926] PocketsphinxDecoder: rate=16000, width=2, channels=1.
[DEBUG:91666922] __main__: Recorded 1964 byte(s) of audio data
[INFO:91666847] quart.serving: 192.168.178.60:59508 POST /api/start-recording 1.1 200 2 12449
[INFO:91601529] quart.serving: 192.168.178.60:59489 POST /api/stop-recording 1.1 200 113 1030755
[DEBUG:91601524] __main__: {"text": "", "intent": {"name": "", "confidence": 0}, "entities": [], "speech_confidence": 0.013586233066797477, "slots": {}}
[ERROR:91601519] FsticuffsRecognizer: in_loaded
Traceback (most recent call last):
  File "/usr/share/rhasspy/rhasspy/intent.py", line 208, in in_loaded
    assert recognitions, "No intent recognized"
AssertionError: No intent recognized
[DEBUG:91601516] __main__: aus das das an
[DEBUG:91601514] PocketsphinxDecoder: aus das das an
[DEBUG:91601513] PocketsphinxDecoder: Transcription confidence: 0.013586233066797477
[DEBUG:91601510] PocketsphinxDecoder: Decoded WAV in 0.9977343082427979 second(s)
[DEBUG:91600510] PocketsphinxDecoder: rate=16000, width=2, channels=1.
[DEBUG:91600506] __main__: Recorded 115244 byte(s) of audio data
[INFO:91596902] quart.serving: 192.168.178.60:59489 POST /api/start-recording 1.1 200 2 13208
[INFO:91521663] quart.serving: 192.168.178.60:59452 GET /api/problems 1.1 200 295 202153
[INFO:91521661] quart.serving: 192.168.178.60:59451 GET /api/speakers 1.1 200 2495 287153
[INFO:91521564] quart.serving: 192.168.178.60:59450 GET /api/microphones 1.1 200 1754 189302
[INFO:91521465] quart.serving: 192.168.178.60:59447 GET /api/unknown-words 1.1 200 2 13288
[INFO:91521454] quart.serving: 192.168.178.60:59448 GET /api/profile 1.1 200 6423 21662
[INFO:91521449] quart.serving: 192.168.178.60:59453 GET /api/profiles 1.1 200 144 74587
[INFO:91521447] quart.serving: 192.168.178.60:59452 GET /api/profile 1.1 200 8600 72817
[INFO:91521378] quart.serving: 192.168.178.60:59449 GET /api/events/log 1.1 101 - 90339
[INFO:91521368] quart.serving: 192.168.178.60:59448 GET /api/phonemes 1.1 200 2602 14296
[INFO:91521365] quart.serving: 192.168.178.60:59447 GET /api/slots 1.1 200 2 16624
[DEBUG:91521358] __main__: Loading phoneme examples from /usr/share/rhasspy/profiles/de/phoneme_examples.txt

Is there anything i have missed during the setup?
Thank you for your support :smiley:
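One quick sanity check is to turn the "Recorded N byte(s)" lines into durations, using the rate=16000, width=2, channels=1 values the decoder prints. The sketch below does that for the three recordings in the log; subtracting a 44-byte WAV header is my assumption about what the byte counts include.

```python
import re

RECORDED = re.compile(r"Recorded (\d+) byte\(s\) of audio data")


def audio_seconds(num_bytes, rate=16000, width=2, channels=1, header_bytes=44):
    """Approximate audio length from a byte count (header size is assumed)."""
    payload = max(num_bytes - header_bytes, 0)
    return payload / (rate * width * channels)


def recording_durations(log_text):
    """Return the duration in seconds of every recording found in the log."""
    return [audio_seconds(int(m.group(1))) for m in RECORDED.finditer(log_text)]


# The three "Recorded" lines copied from the log above.
sample_log = """\
[DEBUG:91670174] __main__: Recorded 87404 byte(s) of audio data
[DEBUG:91666922] __main__: Recorded 1964 byte(s) of audio data
[DEBUG:91600506] __main__: Recorded 115244 byte(s) of audio data
"""

for seconds in recording_durations(sample_log):
    print(f"{seconds:.2f} s")
```

By this estimate the second recording is only about 0.06 s long, so it is effectively empty, which fits its empty transcription. The other two are a few seconds long but come back with a transcription confidence of roughly 0.01.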

Hi @Mondmonarch,

Are you using the wakeword/is it working?

It seems like it can’t hear you.

Did you make sure to select “arecord directly” and the MATRIXIOSOUND direct hardware device for audio?

The setup is a bit sensitive right now so little differences in steps might affect the outcome. If you are using the wakeword, I’d recommend looking out for a line with ['arecord' '-q' '-r' '16000'...] in the logs as that line tells you what microphones are being selected in the -D parameter. They would have to be hw:CARD=MATRIXIOSOUND,DEV=0.
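If you copy that log line out, a small helper can pull the -D value from it. This is just a sketch; the sample line in it is made up for illustration (the real line in your log will have its own arguments), and the function name is hypothetical.

```python
import re

EXPECTED_DEVICE = "hw:CARD=MATRIXIOSOUND,DEV=0"


def arecord_device(log_line):
    """Return the value following '-D' in a quoted arecord argument list, or None."""
    tokens = re.findall(r"'([^']*)'", log_line)  # every single-quoted argument
    if "arecord" not in tokens or "-D" not in tokens:
        return None
    index = tokens.index("-D")
    return tokens[index + 1] if index + 1 < len(tokens) else None


# Made-up sample line, only to show the shape of the input.
sample_line = "['arecord', '-q', '-r', '16000', '-D', 'hw:CARD=MATRIXIOSOUND,DEV=0']"

device = arecord_device(sample_line)
print(device, "OK" if device == EXPECTED_DEVICE else "unexpected device")
```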

Let me know how it goes!

Best,
Samreen

Hi @Samreen

Yes, I had the right settings selected. I was quite unhappy with the German version, so I switched to the English version. Everything is working fine now. I was even able to control my Home Assistant through Node-RED (WebSocket events).

At first I tried to use the Home Assistant integration, but I didn’t get it working, especially the connection between Rhasspy and Home Assistant and the handling on the Home Assistant side. A guide on this would be nice to have, but it is not that urgent.

Visual feedback for the wake word through the LEDs is the next thing on my to-do list. Do you have any tips for me on how to achieve this?

I’m also having problems changing the wake word; I already failed with Snowboy. Maybe the Jarvis model on Porcupine? :smiley:
How can I place a custom wake word file in the corresponding directory? If anybody has an idea, I’m happy for any replies.

Thank you very much and merry christmas :evergreen_tree:

Hey @Mondmonarch,

I haven’t tried other wake word engines aside from Porcupine, but I know it has a few custom words to pick from here. Once downloaded, you can add the file inside the porcupine folder located in ~/.config/rhasspy

I forgot the exact path, but I think it should be inside the profile you’re using (en, es, etc…). You can then select the wake word file from Rhasspy’s web GUI.

If you haven’t already, you should take a look at Rhasspy’s wake word docs.

For listening to wake word events, I submitted a feature request on GitHub that the author of Rhasspy is considering. However, they posted a workaround in the meantime.

Happy Holidays! :snowman:

Hey @Carlos

My Matrix Voice now turns blue when the wake word is detected. I have set up an MQTT broker on my Raspberry Pi and linked the Docker container with Rhasspy (which runs on the same Pi) to this broker.
Furthermore, I wrote a Python program that connects to the MQTT broker as well and subscribes to the rhasspy/en/transition/SnowboyWakeListener topic, as I’m using Snowboy to detect the wake word. As I’ve only learned the Python basics, this program could probably be improved :grin:
To start the program on boot, I’ve created a systemd unit file.
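A minimal sketch of such a subscriber, assuming the paho-mqtt package and a broker on localhost:1883, could look like the code below. The state names in `STATE_COLORS` are my guess at what gets published on the topic (print the payloads first to see what actually arrives), and `set_leds` is a hypothetical placeholder for the real MATRIX LED call.

```python
TOPIC = "rhasspy/en/transition/SnowboyWakeListener"  # topic named above

# Hypothetical mapping from the published state name to an LED colour.
STATE_COLORS = {"loaded": "blue", "listening": "black"}


def color_for_payload(payload, default="black"):
    """Decode an MQTT payload and look up the LED colour for that state."""
    state = payload.decode("utf-8", errors="replace").strip()
    return STATE_COLORS.get(state, default)


def set_leds(color):
    """Placeholder: swap in the real MATRIX LED call here."""
    print("LEDs ->", color)


def on_connect(client, userdata, *args):
    # *args absorbs the extra arguments, which differ between paho-mqtt versions.
    client.subscribe(TOPIC)


def on_message(client, userdata, message):
    set_leds(color_for_payload(message.payload))


def main(host="localhost", port=1883):
    import paho.mqtt.client as mqtt  # third-party: pip install paho-mqtt

    # paho-mqtt 2.x wants an explicit callback API version; 1.x has no such enum.
    api = getattr(mqtt, "CallbackAPIVersion", None)
    client = mqtt.Client(api.VERSION2) if api else mqtt.Client()
    client.on_connect = on_connect
    client.on_message = on_message
    client.connect(host, port)
    client.loop_forever()  # blocks and dispatches incoming messages
```

Calling `main()` starts the listener. Subscribing inside `on_connect` means the subscription is restored if the connection to the broker drops and comes back.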

  • I’m also planning to make the LED circle more beautiful. (For example, instead of turning on all LEDs at the same time, turn them on like a ring that closes: starting from the top LED, turn on one LED on each side (left and right) and repeat this step until the circle is complete :smiley: )

Maybe someone has some advice on how to achieve this.
I can also share my code, if anyone is interested.

Happy holidays :santa:

Nice!

I think you can do something like this for the LED sequence you want. I don’t have a Pi near me so you’re gonna have to make some minor edits to render it on your MATRIX device.

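from time import sleep

ledLength = 18  # number of LEDs on the MATRIX Voice ring
everloopImage = ["black"] * ledLength  # start with every LED off

# Work inwards from both ends of the ring, one LED at a time.
for i in range(ledLength // 2):
    everloopImage[i] = "blue"  # next LED on one side
    # matrix.led.set(everloopImage)  # uncomment and adapt to render on the device
    sleep(1)

    everloopImage[ledLength - 1 - i] = "blue"  # mirrored LED on the other side
    # matrix.led.set(everloopImage)
    sleep(1)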
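If you would rather light both sides in the same step, starting from a single LED, a variation is to generate the frames first and render them afterwards. This is an untested sketch: `render` is a hypothetical callback standing in for whatever call draws a list of colours on your device, and I'm assuming LED 0 is the one you consider the top.

```python
from time import sleep


def ring_closing_frames(led_count, on="blue", off="black"):
    """Yield one list of colours per step, closing the ring outwards from LED 0."""
    if led_count <= 0:
        return
    image = [off] * led_count
    for step in range(led_count // 2 + 1):
        image[step] = on                             # one side of the ring
        image[(led_count - step) % led_count] = on   # mirrored LED on the other side
        yield list(image)                            # copy, so frames stay independent


def play(render, led_count=18, delay=0.05):
    """Send each frame to the render callback with a short pause in between."""
    for frame in ring_closing_frames(led_count):
        render(frame)
        sleep(delay)


# Example: print the frames instead of driving real LEDs.
play(lambda frame: print("".join("#" if c == "blue" else "." for c in frame)), delay=0)
```

Separating frame generation from rendering also makes it easy to check the pattern on a laptop before moving to the Pi.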