Use speaker connection from Matrix Voice with Snips

Hey Thomas,

I haven’t been able to try out MaryTTS yet, maybe this week, to see if the MATRIX Voice audio jack will work with it.

There are steps in the Rhasspy docs here as well, if you have time.

Best,
Samreen

Hey Samreen,

thank you for your answer. I’m wondering whether going via MaryTTS is the best way. Of course, it could be one of several possible workarounds. I’ve investigated a bit and found that the problem of playing WAV files at sample rates other than 44100 Hz is much older than my request.

@Romkabouter has already been in touch about this issue with @yoelrc88l:

Unfortunately I cannot see whether a solution was found. However, my issue seems to be a common request. So if a working solution is available, I would prefer it to a workaround. Do you agree or do you have a different view?

Best
Thomas

I have set up Rhasspy with the audio output through the speaker connectors.

I am using these speakers:

4 Ohm 3W (maybe that should only be 1.5W per channel however?)

I can hear the .wav-files being played, but they sound quite bad with a “crack” etc. On my PC they sound nice.

Is there a problem with the WAV file? Or output format? Or is it the speaker?

Please note that using the volume command
amixer set PCM 100%

does not really change the volume at all. Any advice on this?

Thank you!

Yes and no. I resample everything other than 44100 Hz stereo.

Hi Thomas,

What you are saying makes sense.

We had a simple way for the MATRIX Voice speaker output to work with all sampling rates while connected to the Raspberry Pi and that involved using plughw in the default speaker section in /etc/asound.conf as shown below.

pcm.speaker {
  type plug
  slave {
    pcm "plughw:2,1"
    rate 44100
  }
}

plughw in ALSA takes care of all the conversions needed to play at 44100 Hz for the MATRIX Voice audio jack output.

I tried to get the above method working with Rhasspy, but for some reason using plughw with Rhasspy resulted in no sound output at all. We suspect this may be related to how we pass the sound device /dev/snd directly into the Rhasspy docker container, which perhaps bypasses the plughw feature in ALSA. This may also be the reason why dsnoop does not seem to be working with Rhasspy.

Since I was unsuccessful with the above attempts, I figured MaryTTS might be a good next workaround for the Rhasspy system as plughw does not seem to be working with it. It might help to post this general issue in the Rhasspy community as well to see if anyone knows of a solution.

Best,
Samreen

@SpaceGlider,

It could be the sampling rate of the WAV file as we were discussing in this thread. I doubt it’s your speaker, though you can quickly test that with a stock wav file from your terminal. EDIT: this should work since multiple apps should be able to play audio through the same speakers.

wget https://goo.gl/CDF6sf -O ./audio-sample.wav
aplay -D "hw:2,1" ./audio-sample.wav
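
If downloading a sample is not convenient, a test file in the format the MATRIX Voice output expects (44100 Hz, 16-bit, stereo) can also be generated locally. The following is only a minimal sketch, assuming Python 3 is available on the Pi; the function name `write_test_tone` and the file name used in the note below are made up for illustration.

```python
import math
import wave


def write_test_tone(path, seconds=1.0, freq_hz=440.0, rate=44100, amplitude=0.3):
    """Write a 16-bit stereo sine tone to `path` and return the frame count."""
    n_frames = int(seconds * rate)
    frames = bytearray()
    for i in range(n_frames):
        sample = int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / rate))
        # One little-endian signed 16-bit value, written to left and right.
        encoded = sample.to_bytes(2, "little", signed=True)
        frames += encoded + encoded
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)      # stereo
        wav.setsampwidth(2)      # 2 bytes = 16 bit
        wav.setframerate(rate)   # 44100 Hz by default
        wav.writeframes(bytes(frames))
    return n_frames
```

Calling `write_test_tone("test-tone.wav")` and then playing the result with the same `aplay -D "hw:2,1"` command should give a clean one-second tone if the speaker and the output path are fine. The amplitude is kept low on purpose so the test is not too loud.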

It is possible that the volume command is not working due to it being run outside the docker container. You could try to run the bash command inside the container following the steps here.

Best,
Samreen

@Romkabouter

Hey Samreen and Romkabouter,

thank you for your answers. I will do some tests on my own and will post this issue in the Rhasspy community now. I am aware that the Rhasspy team is currently focussed on the new 2.5 release. Maybe they can put my request on the roadmap, if a solution has not already been found.

Thank you again. I will keep you informed.

Best
Thomas

Thank you for the answer!

Interestingly, when I put the wav you suggested into the pi and play it, it sounds very nice. But when I download the Rhasspy “beeps” from here:

and put them in the same folder and play them, they do not sound very nice (crackling), although both files have the same codec (according to VLC). The files sound nice on my PC.

Could it be a problem with the power supply? I have the Matrix Voice on a RPi 4 with the USB-C power supply. I could add another line to the Matrix Micro-USB (if it is allowed to have both?).

Best
David

I was able to lower the volume like this (also for Rhasspy) and the crackling went away.

amixer -c 2 set Playback_Volume 40%

@Samreen I am still interested in whether powering the Matrix Voice through both the RasPi 4B and the micro-USB port on the Matrix Voice board is possible. Maybe this would help?

@SpaceGlider,

I would not recommend double-powering the board. I don’t think it would help.

Glad the crackling went away!

Best,
Samreen

Hello, on Rhasspy 2.5-pre everything works perfectly with Rhasspy and the Matrix Voice,
with arecord on hw:CARD=MATRIXIOSOUND,DEV=0
and aplay on hw:CARD=MATRIXIOSOUND,DEV=1.
But when I want to use PicoTTS or another system for TTS, I get this error:

AudioServerException: Command '['aplay', '-q', '-t', 'wav', '-D', 'hw:CARD=MATRIXIOSOUND,DEV=1']' returned non-zero exit status 1.

Do you have any idea?

@nicoxygen,

This is likely due to the sampling rate of the text-to-speech WAV files. The MATRIX Voice audio jack is compatible with 44100 Hz 16-bit stereo sound and the TTS WAV files need to be of that format to work.
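
To see whether a given WAV file meets this format requirement, its header can be inspected. The snippet below is a rough sketch using only Python's standard `wave` module; the helper names `wav_format` and `is_matrix_compatible` are hypothetical and not part of Rhasspy or the MATRIX software.

```python
import wave


def wav_format(path):
    """Return (sample_rate_hz, bits_per_sample, channels) of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def is_matrix_compatible(path):
    """True if the file is 44100 Hz, 16-bit, stereo (the format named above)."""
    return wav_format(path) == (44100, 16, 2)
```

Running `wav_format` on a beep file and on a TTS response saved to disk would show directly where the two differ, which is more precise than comparing codecs in a media player.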

There is a workaround for general RPi audio applications for this using plughw as I discussed above but I have not been able to get that working with Rhasspy inside the docker container.

Best,
Samreen

Hey Romkabouter,

I still have not found a solution for speaker output of 16 kHz files on the Matrix Voice, as they are generated by Rhasspy TTS. The speakers are working fine with the beep files for wakeword, recognition and error. However, some of my scripts create responses that can only be heard on earphones connected to my Raspberry Pi, but not on the speakers connected to the Matrix Voice.

How did you manage to resample?
Are you doing so using your Matrix Voice ESP32 MQTT Audio Streamer?

Since I am only interested in the streaming functionality, but not in the ESP32 features, I have installed this application although I do not own the ESP32 version and got the mqtt broker up and running.

Is it now possible to use it for sound conversion? If not, how did you do the conversion?

PS: I am using MaryTTS.

Best
Thomas

That was a real challenge indeed, but I came across speex, which gave me a starting point:
https://www.speex.org/

A lot of resampling approaches were just too heavy for incoming streaming audio, but after some struggles I managed to get it working.
If you check the PlatformIO folder of my repo here https://github.com/Romkabouter/Matrix-Voice-ESP32-MQTT-Audio-Streamer/tree/master/PlatformIO
there is a task AudioPlayTask, which might point you in the direction you need.
I use the function speex_resampler_process_interleaved_int, but with a low quality setting in order to process streams.
This is however a limitation of the ESP32; a faster CPU will give better results.

That is a bit strange. I would expect the speakers connected to a Pi either to work or not to work at all,
rather than working for some WAV files but not for the responses. How do you play the responses?
The audio play setting is the same for wakeword and audio play, right?

Thank you very much for your answers.

This looks strange but is not. Samreen gave this answer, which seems quite logical to me:

The responses Rhasspy creates by MaryTTS contain a 16000 Hz signal that is not compatible with the Matrix voice. That’s why I need to convert the format.

My responses are created by JavaScript scripts as strings (e.g. “It is 5 o’clock pm”) and then sent to Rhasspy’s HTTP api/text-to-speech. (BTW I am testing the same with MQTT on hermes/tts/say. But this is not relevant in this context.)
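
For readers who want to reproduce this step, here is a rough Python equivalent of posting a response string to the text-to-speech endpoint. It is only a sketch: the base URL with port 12101 is assumed to be Rhasspy's default web port and must be adjusted to the actual setup, and `build_tts_request` and `say` are hypothetical helper names.

```python
import urllib.request


def build_tts_request(text, base_url="http://localhost:12101"):
    """Build a POST request that carries the plain text to be spoken."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/text-to-speech",
        data=text.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )


def say(text, base_url="http://localhost:12101"):
    """Send the text to Rhasspy and return the HTTP status code."""
    request = build_tts_request(text, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Building the request is kept separate from sending it so that the URL and body can be checked without a running Rhasspy instance.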

Depending on the audio play setting, which is always the same for all kinds of files, I can hear beep files and responses on the Pi’s audio jack. When I change the audio play setting to the Matrix Voice speakers I can only hear beep files but not the responses.

That’s why I am looking for a way to convert the stream. From my perspective there are two possible ways:

a) cause MaryTTS not to generate a 16 kHz file but a 44.1 kHz file instead (that was @Samreen's suggestion), or
b) convert the stream on the Matrix Voice using your audio streamer

I tried to follow your hint and link, but could not find the mentioned task there:

Could it have been moved?

Before we (I mean you and me) spend more time on this issue, I would like to ask you whether your software also runs on a normal Matrix Voice without ESP32 module (WiFi and LED control are provided by my Raspberry Pi). If so, can your software distinguish between the beep files that only need to be passed through and the responses that need to be converted?

If your software can’t do so, can you also help with configuring MaryTTS or should I ask someone else in the Rhasspy forum?

Sorry for all these questions. But I am trying not to mislead you and waste your time.

Best
Thomas

The beep files are not 44100 Hz either; their sample rates vary, most likely 22050 Hz. So that makes it strange to me.

I am not sure what you mean by “moved”?

No, sorry about that.
The good news is that if you have the Matrix Voice attached to a Pi, you have many more options to convert incoming audio, for example sox.
Also, you can change the output sample rate via .asoundrc and set Rhasspy output to aplay.
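
To illustrate what such a conversion does, here is a minimal sketch in pure Python that turns a 16-bit mono or stereo WAV file into 44100 Hz stereo using simple linear interpolation. This is not how sox or ALSA resample (they use proper filters and will sound better); the function name `to_44100_stereo` is made up, and the loop is only practical for short clips such as TTS responses.

```python
import array
import sys
import wave


def to_44100_stereo(src_path, dst_path, target_rate=44100):
    """Convert a 16-bit PCM WAV (mono or stereo) to 16-bit stereo at target_rate.

    Returns the number of frames written.
    """
    with wave.open(src_path, "rb") as src:
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM input")
        channels = src.getnchannels()
        if channels not in (1, 2):
            raise ValueError("this sketch only handles mono or stereo input")
        rate = src.getframerate()
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Split into channels; a mono source is duplicated onto both outputs.
    left = samples[0::channels]
    right = samples[1::channels] if channels == 2 else left

    n_in = len(left)
    n_out = n_in * target_rate // rate
    out = array.array("h")
    for i in range(n_out):
        pos = i * rate / target_rate   # position in the input, in frames
        j = int(pos)
        frac = pos - j
        k = min(j + 1, n_in - 1)       # clamp at the last input frame
        for chan in (left, right):
            out.append(int(round(chan[j] * (1 - frac) + chan[k] * frac)))

    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(2)
        dst.setsampwidth(2)
        dst.setframerate(target_rate)
        dst.writeframes(out.tobytes())
    return n_out
```

A converted file can then be played directly on the hw device of the Matrix Voice. For anything beyond a quick experiment, sox or an ALSA plug device is likely the better choice.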

No worries about wasting my time :)


@tobetobe what are your settings to get the Matrix Voice working with 2.5? I’m trying to upgrade from 2.4 and can’t get it to work. I’m on a regular, non-ESP model attached to a Pi 3B+ running Stretch (because of the kernel issue that is going around for Buster). My settings are working and the microphone works for 2.4, but it gets killed when upgrading to 2.5.

Good morning, unfortunately I am short on time currently. So I cannot go into details but would like to give you a quick answer at least. First I am running a master/satellite system with Raspbian GNU/Linux 10 on my two satellites (one is a Pi3, the other one a Pi4). Rhasspy 2.5 is running in docker. In order to also use and control the LEDs I installed HermesLEDcontrol, which you can find in the Rhasspy forum or on GitHub. HermesLEDcontrol also installs all Matrix drivers and the Matrix kernel. So it is not necessary to install them separately.

Then: It’s my and other users’ experience that it is best to install an external MQTT broker (e.g. sudo apt install mosquitto) on your Rhasspy master, which allows communication with external applications (e.g. Hass.io or others) via MQTT.

And these are my settings for my satellite:

{
  "command": {
    "webrtcvad": {
      "before_sec": "0.5",
      "min_sec": "1",
      "silence_sec": "0.5",
      "skip_sec": "0",
      "speech_sec": "0.3",
      "vad_mode": "1"
    }
  },
  "dialogue": {
    "system": "rhasspy"
  },
  "intent": {
    "fuzzywuzzy": {
      "min_confidence": "0"
    },
    "system": "hermes"
  },
  "microphone": {
    "arecord": {
      "device": "default:CARD=MATRIXIOSOUND",
      "siteId": "Satellite1",
      "udp_audio_port": "12202"
    },
    "system": "arecord"
  },
  "mqtt": {
    "enabled": "true",
    "host": "192.168.13.155",
    "port": "1883",
    "site_id": "Satellite1"
  },
  "sounds": {
    "aplay": {
      "device": "hw:CARD=MATRIXIOSOUND,DEV=1"
    },
    "system": "aplay"
  },
  "speech_to_text": {
    "system": "hermes"
  },
  "text_to_speech": {
    "satellite_site_ids": "Satellite1",
    "system": "hermes"
  },
  "wake": {
    "pocketsphinx": {
      "threshold": 9.999999999999999e-33,
      "udp_audio": "12202"
    },
    "porcupine": {
      "keyword_path": "americano.ppn",
      "sensitivity": "0.9",
      "udp_audio": "12202"
    },
    "satellite_site_ids": "Satellite1",
    "snowboy": {
      "model": "computer.umdl",
      "sensitivity": "0.45",
      "udp_audio": "12202"
    },
    "system": "snowboy"
  }
}
… and for my master:

{
  "command": {
    "webrtcvad": {
      "min_sec": "1",
      "silence_sec": "1.5",
      "speech_sec": "0.5"
    }
  },
  "dialogue": {
    "satellite_site_ids": "Satellite1, Satellite2",
    "system": "hermes"
  },
  "intent": {
    "satellite_site_ids": "Satellite1, Satellite2",
    "system": "fsticuffs"
  },
  "mqtt": {
    "enabled": "true",
    "host": "192.168.13.155",
    "site_id": "Master"
  },
  "sounds": {
    "error": "${RHASSPY_PROFILE_DIR}/wav/beep.wav",
    "recorded": "${RHASSPY_PROFILE_DIR}/wav/answer.wav",
    "wake": "${RHASSPY_PROFILE_DIR}/wav/question.wav"
  },
  "speech_to_text": {
    "kaldi": {
      "mix_weight": "0.2"
    },
    "satellite_site_ids": "Satellite1, Satellite2",
    "system": "kaldi"
  },
  "text_to_speech": {
    "marytts": {
      "language": "de-DE",
      "locale": "de",
      "url": "http://192.168.13.155:59125/process",
      "voice": "bits1-hsmm"
    },
    "satellite_site_ids": "Satellite1, Satellite2",
    "system": "marytts"
  }
}

With these settings I can receive commands via the mics, trigger my devices, and get beeps on the Matrix speaker outputs. Still, I don’t get spoken responses that are created by TTS. These are only available on the Raspberry Pi’s headphone jack.

Hope this helps a bit. If you have further questions, I will respond on Sunday.

Best
Thomas

Thanks so much for the quick reply. I was hopeful that your settings would get it working, but alas no joy. I adjusted the device settings in Rhasspy for the microphone, and it says that it’s working, but there is no output/input from the Matrix Voice when I test it. In addition, I get the following messages when I look at the output during docker-compose up:

arecord: main:828: audio open error: Connection refused
arecord: main:828: audio open error: Invalid argument
arecord: main:828: audio open error: No such file or directory