Romkabouter/ESP32-Rhasspy-Satellite

Attempt to get it to work with Voco

flatsiedatsie opened this issue · 81 comments

This is a continuation of discussion here.

I've managed to get Snips to recognise the wake-word, but only right after booting the Atom Echo.

Snips does recognise that there is audio input.

While in idle mode, Snips Watch indicates that audio is being heard.

[14:55:18] [VoiceActivity] Down on site atomecho
[14:55:21] [VoiceActivity] Up on site atomecho
[14:55:22] [VoiceActivity] Down on site atomecho
[14:55:35] [VoiceActivity] Up on site atomecho
[14:55:53] [VoiceActivity] Down on site atomecho
[14:55:55] [VoiceActivity] Up on site atomecho
[14:55:56] [VoiceActivity] Down on site atomecho
[14:56:00] [VoiceActivity] Up on site atomecho

If I press the button to start a session, a session is created, and the dialogue manager listens to the stream from the Atom Echo. But the voice input is not recognised as a voice command:

[14:56:10] [Dialogue] was asked to start a session on site atomecho
[14:56:10] [Asr] was asked to stop listening on site atomecho
[14:56:10] [Hotword] was asked to toggle itself 'off' on site atomecho
[14:56:10] [Dialogue] session with id 'fe685174-5053-4651-8756-8cb3b066003e' was started on site atomecho
[14:56:10] [Asr] was asked to listen on site atomecho
[14:56:11] [VoiceActivity] Down on site atomecho
[14:56:12] [VoiceActivity] Up on site atomecho
[14:56:15] [VoiceActivity] Down on site atomecho
[14:56:16] [VoiceActivity] Up on site atomecho
[14:56:18] [VoiceActivity] Down on site atomecho
[14:56:19] [VoiceActivity] Up on site atomecho
[14:56:26] [Dialogue] session with id 'fe685174-5053-4651-8756-8cb3b066003e' was ended on site atomecho. The session was ended because one of the component didn't respond in a timely manner
[14:56:26] [Asr] was asked to stop listening on site atomecho
[14:56:26] [Hotword] was asked to toggle itself 'on' on site atomecho

If I don't speak into the Raspberry Pi version, then things look a bit different.

[15:07:45] [VoiceActivity] Up on site azrxidia
[15:07:46] [Hotword] detected on site azrxidia, for model hey_snips
[15:07:46] [Asr] was asked to stop listening on site azrxidia
[15:07:46] [Hotword] was asked to toggle itself 'off' on site azrxidia
[15:07:46] [Dialogue] session with id '16427483-191a-4c39-9f0b-199dd4cb0e7e' was started on site azrxidia
[15:07:46] [Asr] was asked to listen on site azrxidia
[15:07:46] [VoiceActivity] Up on site atomecho
[15:07:47] [VoiceActivity] Down on site azrxidia
[15:07:50] [Asr] captured text "" in 4.0s
[15:07:50] [Asr] was asked to stop listening on site azrxidia
[15:07:50] [Dialogue] session with id '16427483-191a-4c39-9f0b-199dd4cb0e7e' was ended on site azrxidia. The session was ended because the platform didn't understand the user
[15:07:50] [Asr] was asked to stop listening on site azrxidia

So that would support the idea that some MQTT message is missing.

As an aside, I also noticed that the wave header is slightly different:

ESP32

RIFF,WAVEfmt ?>}data[binary audio data omitted]

USB microphone on Raspberry Pi:

RIFF4WAVEfmt ?>}tim??wdata[binary audio data omitted]

I also checked the output of the various commands in Mosquitto to find out exactly which message was missing.

---SNIPS----


hermes/hotword/azrxidia/detected {"siteId":"azrxidia","modelId":"hey_snips","modelVersion":"workflow-hey_snips_subww_feedback_10seeds-2018_12_04T12_13_05_evaluated_model_0002","modelType":"universal","currentSensitivity":0.5,"detectionSignalMs":1614003432719,"endSignalMs":1614003432719}

hermes/asr/stopListening {"siteId":"azrxidia","sessionId":"f0b42455-fc95-4873-84ba-c6136b1dec3e"}
hermes/hotword/toggleOff {"siteId":"azrxidia","sessionId":"f0b42455-fc95-4873-84ba-c6136b1dec3e"}

hermes/dialogueManager/sessionStarted {"sessionId":"f0b42455-fc95-4873-84ba-c6136b1dec3e","customData":null,"siteId":"azrxidia","reactivatedFromSessionId":null}

hermes/asr/startListening {"siteId":"azrxidia","sessionId":"f0b42455-fc95-4873-84ba-c6136b1dec3e","startSignalMs":1614003432719}

hermes/audioServer/azrxidia/replayRequest {"requestId":"azrxidia-1614003432719","startAtMs":1614003432719,"siteId":"azrxidia"}
hermes/audioServer/azrxidia/replayResponse RIFF^WAVEfmt ?>}tim??wrpidazrxidia-1614003432719rprf
hermes/audioServer/azrxidia/replayResponse
hermes/audioServer/azrxidia/replayResponse
hermes/audioServer/azrxidia/replayResponse

hermes/asr/textCaptured {"text":"what time is it","likelihood":1.0,"tokens":[{"value":"what","confidence":1.0,"rangeStart":0,"rangeEnd":4,"time":{"start":0.0,"end":1.05}},{"value":"time","confidence":1.0,"rangeStart":5,"rangeEnd":9,"time":{"start":1.05,"end":1.17}},{"value":"is","confidence":1.0,"rangeStart":10,"rangeEnd":12,"time":{"start":1.17,"end":1.3199999}},{"value":"it","confidence":1.0,"rangeStart":13,"rangeEnd":15,"time":{"start":1.3199999,"end":2.1}}],"seconds":2.0,"siteId":"azrxidia","sessionId":"f0b42455-fc95-4873-84ba-c6136b1dec3e"}
hermes/asr/stopListening {"siteId":"azrxidia","sessionId":"f0b42455-fc95-4873-84ba-c6136b1dec3e"}
hermes/nlu/query {"input":"what time is it","asrTokens":[{"value":"what","confidence":1.0,"rangeStart":0,"rangeEnd":4,"time":{"start":0.0,"end":1.05}},{"value":"time","confidence":1.0,"rangeStart":5,"rangeEnd":9,"time":{"start":1.05,"end":1.17}},{"value":"is","confidence":1.0,"rangeStart":10,"rangeEnd":12,"time":{"start":1.17,"end":1.3199999}},{"value":"it","confidence":1.0,"rangeStart":13,"rangeEnd":15,"time":{"start":1.3199999,"end":2.1}}],"intentFilter":["createcandle:get_time","createcandle:set_value","createcandle:stop_timer","createcandle:set_timer","createcandle:get_value","createcandle:set_state","createcandle:get_boolean","createcandle:list_timers","createcandle:get_timer_count"],"id":"1eee4474-c274-4a09-a7a5-7a65229839fa","sessionId":"f0b42455-fc95-4873-84ba-c6136b1dec3e"}
hermes/nlu/intentParsed {"id":"1eee4474-c274-4a09-a7a5-7a65229839fa","input":"what time is it","intent":{"intentName":"createcandle:get_time","confidenceScore":1.0},"slots":[],"sessionId":"f0b42455-fc95-4873-84ba-c6136b1dec3e","alternatives":[{"intentName":"createcandle:get_value","confidenceScore":0.06613055,"slots":[]},{"intentName":"createcandle:list_timers","confidenceScore":0.048560027,"slots":[]}]}
hermes/intent/createcandle:get_time {"sessionId":"f0b42455-fc95-4873-84ba-c6136b1dec3e","customData":null,"siteId":"azrxidia","input":"what time is it","asrTokens":[[{"value":"what","confidence":1.0,"rangeStart":0,"rangeEnd":4,"time":{"start":0.0,"end":1.05}},{"value":"time","confidence":1.0,"rangeStart":5,"rangeEnd":9,"time":{"start":1.05,"end":1.17}},{"value":"is","confidence":1.0,"rangeStart":10,"rangeEnd":12,"time":{"start":1.17,"end":1.3199999}},{"value":"it","confidence":1.0,"rangeStart":13,"rangeEnd":15,"time":{"start":1.3199999,"end":2.1}}]],"asrConfidence":1.0,"intent":{"intentName":"createcandle:get_time","confidenceScore":1.0},"slots":[],"alternatives":[{"intentName":"createcandle:get_value","confidenceScore":0.06613055,"slots":[]},{"intentName":"createcandle:list_timers","confidenceScore":0.048560027,"slots":[]}]}

also:
hermes/voiceActivity/azrxidia/vadDown {"siteId":"azrxidia","signalMs":1614003434199}
hermes/voiceActivity/azrxidia/vadUp {"siteId":"azrxidia","signalMs":1614003432077}

----Atom Echo button----

hermes/dialogueManager/startSession {"init":{"type":"action","canBeEnqueued": false},"siteId":"atomecho"}

hermes/asr/stopListening {"siteId":"atomecho","sessionId":"cd6118bf-a971-4921-a6b5-59aeb7967a3d"}
hermes/hotword/toggleOff {"siteId":"atomecho","sessionId":"cd6118bf-a971-4921-a6b5-59aeb7967a3d"}

hermes/dialogueManager/sessionStarted {"sessionId":"cd6118bf-a971-4921-a6b5-59aeb7967a3d","customData":null,"siteId":"atomecho","reactivatedFromSessionId":null}

hermes/asr/startListening {"siteId":"atomecho","sessionId":"cd6118bf-a971-4921-a6b5-59aeb7967a3d","startSignalMs":null}

hermes/voco/atomecho/mute {"mute": true}
...

hermes/voiceActivity/atomecho/vadDown {"siteId":"atomecho","signalMs":-386}

----Atom Echo hotword detected----

hermes/hotword/azrxidia/detected {"siteId":"atomecho","modelId":"hey_snips","modelVersion":"workflow-hey_snips_subww_feedback_10seeds-2018_12_04T12_13_05_evaluated_model_0002","modelType":"universal","currentSensitivity":0.5,"detectionSignalMs":-70,"endSignalMs":-70}

hermes/voco/atomecho/play {"sound_file": "start_of_input"}

hermes/audioServer/atomecho/audioFrame RIFF,WAVEfmt ?>}data[binary audio data omitted]
hermes/asr/stopListening {"siteId":"atomecho","sessionId":"a9e2c85d-cdfd-40ce-9ad6-ad30c9ed2868"}
hermes/audioServer/atomecho/audioFrame RIFF,WAVEfmt ?>}data[binary audio data omitted]
hermes/hotword/toggleOff {"siteId":"atomecho","sessionId":"a9e2c85d-cdfd-40ce-9ad6-ad30c9ed2868"}

hermes/dialogueManager/sessionStarted {"sessionId":"a9e2c85d-cdfd-40ce-9ad6-ad30c9ed2868","customData":null,"siteId":"atomecho","reactivatedFromSessionId":null}

hermes/asr/startListening {"siteId":"atomecho","sessionId":"a9e2c85d-cdfd-40ce-9ad6-ad30c9ed2868","startSignalMs":-70}

hermes/voco/atomecho/mute {"mute": true}

hermes/voiceActivity/atomecho/vadDown {"siteId":"atomecho","signalMs":-386}
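
To make the "which message is missing" comparison less error-prone, the two captures above can be diffed by topic. The following is only a sketch: the topic lists are copied from the captures in this post, the helper names are made up for illustration, and the Atom Echo capture may simply have been cut off before later messages arrived.

```python
# Topics from the working Raspberry Pi capture above, in order of appearance.
REFERENCE = [
    "hermes/hotword/azrxidia/detected",
    "hermes/asr/stopListening",
    "hermes/hotword/toggleOff",
    "hermes/dialogueManager/sessionStarted",
    "hermes/asr/startListening",
    "hermes/audioServer/azrxidia/replayRequest",
    "hermes/audioServer/azrxidia/replayResponse",
    "hermes/asr/textCaptured",
    "hermes/nlu/query",
    "hermes/nlu/intentParsed",
    "hermes/intent/createcandle:get_time",
]

# Topics from the "Atom Echo hotword detected" capture above.
CANDIDATE = [
    "hermes/hotword/azrxidia/detected",
    "hermes/voco/atomecho/play",
    "hermes/audioServer/atomecho/audioFrame",
    "hermes/asr/stopListening",
    "hermes/hotword/toggleOff",
    "hermes/dialogueManager/sessionStarted",
    "hermes/asr/startListening",
    "hermes/voco/atomecho/mute",
    "hermes/voiceActivity/atomecho/vadDown",
]

SITE_IDS = ("azrxidia", "atomecho")


def normalise(topic):
    """Replace known site IDs with a placeholder so both captures compare."""
    parts = ["<site>" if part in SITE_IDS else part for part in topic.split("/")]
    return "/".join(parts)


def missing_topics(reference, candidate):
    """Return reference topics (normalised, in order) absent from candidate."""
    seen = {normalise(t) for t in candidate}
    missing = []
    for topic in reference:
        norm = normalise(topic)
        if norm not in seen and norm not in missing:
            missing.append(norm)
    return missing


for topic in missing_topics(REFERENCE, CANDIDATE):
    print(topic)
```

With these lists, the first topic reported is the replayRequest, followed by everything from textCaptured onwards.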


What is the samplerate of voco?
This software outputs 16000 Hz, 16-bit mono audio. Each MQTT message contains 512 bytes of wave data and a 44-byte header, for a total of 556 bytes per message.
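
As a rough illustration of those numbers, here is a sketch that builds a canonical 44-byte PCM WAV header in Python and prepends it to 512 bytes of payload. It is not the streamer's actual code (that uses its own fixed header, whose ChunkSize field may differ from the canonical value used here); the function name is hypothetical. It also suggests that the printable ">" and "}" visible after "fmt " in the dumps are consistent with a 16000 Hz sample rate (0x3E80) and a 32000 byte rate (0x7D00).

```python
import struct

SAMPLE_RATE = 16000      # Hz, as stated above
BITS_PER_SAMPLE = 16
CHANNELS = 1
PAYLOAD_BYTES = 512      # wave data per MQTT message


def build_wav_header(data_len: int) -> bytes:
    """Return a canonical 44-byte PCM WAV header for data_len payload bytes."""
    block_align = CHANNELS * BITS_PER_SAMPLE // 8
    byte_rate = SAMPLE_RATE * block_align
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + data_len, b"WAVE",           # RIFF descriptor
        b"fmt ", 16, 1, CHANNELS, SAMPLE_RATE,     # fmt chunk, 1 = PCM
        byte_rate, block_align, BITS_PER_SAMPLE,
        b"data", data_len,                         # data chunk header
    )


header = build_wav_header(PAYLOAD_BYTES)
frame = header + bytes(PAYLOAD_BYTES)              # silence as stand-in payload
samples = PAYLOAD_BYTES // (BITS_PER_SAMPLE // 8)

print(len(header), len(frame))                     # 44 556
print(1000 * samples / SAMPLE_RATE, "ms of audio per message")
```

So each 556-byte message carries 256 samples, which is 16 ms of audio.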

And this software does not support snips anymore, which version of snips does voco use?

I can see if I can get it to work with voco, but I won't put that in master due to the fact that snips is not supported.
I will create a new branch for it

That would rock! I was looking into the header to see if I had to change something there.

As far as I know, Voco uses the latest version of Snips.

Does Rhasspy use different audio headers from that last version of Snips?

The mystery is: why does it sometimes actually work?

Where can I find these "VoiceActivity" logs and such?

The audio produced is just a lot of small wave files. Rhasspy does not differ from snips with regards to that.
But the chunk size from Voco is different from what is sent by the streamer.

This is the wave format: http://soundfile.sapp.org/doc/WaveFormat/
From Voco: RIFF4WAVEfmt
From streamer: RIFF,WAVEfmt

The 4 and , are not a 4 and a , but represent the 4 bytes of the ChunkSize.
So if the chunk size is not the same then, depending on how Voco is implemented, it might handle the audio in unexpected ways.
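
A small sketch of how the ChunkSize can be read back from a frame. The "," and "4" are the low bytes 0x2C and 0x34 of a little-endian 32-bit integer. The remaining three bytes are unprintable and not visible in the dumps, so the values 0x02 0x00 0x00 used below are an assumption, not something taken from the captures.

```python
import struct


def riff_chunk_size(frame: bytes) -> int:
    """Read the little-endian ChunkSize field that follows the RIFF tag."""
    if frame[:4] != b"RIFF":
        raise ValueError("not a RIFF container")
    (size,) = struct.unpack_from("<I", frame, 4)
    return size


# ASSUMPTION: the three bytes after ',' and '4' are 0x02 0x00 0x00.
streamer_start = b"RIFF" + b",\x02\x00\x00" + b"WAVE"
voco_start = b"RIFF" + b"4\x02\x00\x00" + b"WAVE"

print(riff_chunk_size(streamer_start))   # 556
print(riff_chunk_size(voco_start))       # 564
```

If that guess is right, the streamer writes 556 and the Voco side writes 564. By the WAV format description linked above, ChunkSize is the total size minus 8, so 564 would correspond to a 572-byte message, while a 556-byte message would conventionally carry 548.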

The 4 and , are not a 4 and a , but represent the 4 bytes of the ChunkSize.

Ah, interesting. I figured it might matter. I also wonder if the tim part was some kind of time index.

To debug, I use these commands:

Snips Watch: ~/.webthings/addons/voco/snips/snips-watch -vvvv

Looking at ALL MQTT traffic: mosquitto_sub -v -h 192.168.2.167 -t 'hermes/#'
(mosquitto_sub isn't installed by default, so sudo apt-get install mosquitto-clients will fix that)

And if you need to quickly restart the gateway for some reason: sudo systemctl restart webthings-gateway.service
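
For working with saved `mosquitto_sub -v` output, a minimal sketch like the one below splits each line into topic and payload and tries to decode the payload as JSON. The function name is made up for illustration, and binary payloads (audio frames) are simply reported as None here.

```python
import json


def parse_line(line):
    """Split a 'topic payload' line and decode the payload as JSON if possible."""
    topic, _, payload = line.rstrip("\n").partition(" ")
    try:
        data = json.loads(payload)
    except ValueError:          # empty or non-JSON payload, e.g. audio data
        data = None
    return topic, data


lines = [
    'hermes/voco/atomecho/mute {"mute": true}',
    'hermes/voiceActivity/atomecho/vadDown {"siteId":"atomecho","signalMs":-386}',
    "hermes/audioServer/azrxidia/replayResponse",
]

for line in lines:
    topic, data = parse_line(line)
    print(topic, "->", data)
```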

Ah, interesting. I figured it might matter. I also wonder if the tim part was some kind of time index.

Yes, when I see this:
hermes/audioServer/azrxidia/replayResponse RIFF^WAVEfmt ?>}tim??wrpidazrxidia-1614003432719rprf
that does not really look like a good waveheader to me, it should contain the plain letters "data" somewhere as well.
The link I posted shows what a wave format header should look like. I do not know how the replayResponse is generated.

In the streamer I use a fixed header, because every wav audio message sent has the same length and format :)
I will ignore the replayResponse for now and will dive into voco a bit to find out what voco expects.
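
One way to see what a given frame actually contains is to walk its RIFF chunks and list their IDs and sizes. The sketch below does that on a synthetic frame. The extra 8-byte "time" chunk is hypothetical: it is only a guess at what the "tim" bytes in the Snips frames might be, chosen because it happens to add exactly 16 bytes.

```python
import struct


def list_chunks(frame: bytes):
    """Return (chunk_id, size) for every chunk after the RIFF/WAVE header."""
    if frame[:4] != b"RIFF" or frame[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE frame")
    chunks = []
    pos = 12
    while pos + 8 <= len(frame):
        chunk_id, size = struct.unpack_from("<4sI", frame, pos)
        chunks.append((chunk_id.decode("ascii", "replace"), size))
        pos += 8 + size + (size & 1)   # chunks are padded to even length
    return chunks


# Synthetic frame: fmt chunk, a HYPOTHETICAL 8-byte "time" chunk, 512 data bytes.
fmt = struct.pack("<4sIHHIIHH", b"fmt ", 16, 1, 1, 16000, 32000, 2, 16)
extra = struct.pack("<4sIQ", b"time", 8, 1614003432719)
data = struct.pack("<4sI", b"data", 512) + bytes(512)
body = b"WAVE" + fmt + extra + data
frame = b"RIFF" + struct.pack("<I", len(body)) + body

print(list_chunks(frame))   # [('fmt ', 16), ('time', 8), ('data', 512)]
print(len(frame))           # 572
```

Running the same function over a real captured payload would show whether the frames from Voco carry an extra chunk between "fmt " and "data".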

Yes I've only seen that replayRequest command once. It's not documented either, all I could find were two references to Snips code on Github. I'd just ignore it.

that does not really look like a good waveheader to me

0_0 :-)

I have tried a bare Snips install with the demo assistant and a mic attached to the Pi, and I get timeouts:

image

I think Snips is broken and this is actually causing the timeouts, I will try this setup with voco as well.

Ouch, that doesn't sound good... Is that with a different wav encoding? Are you using the 6.0 version?

No, this is Snips installed on a Pi with an attached mic. Nothing to do with this code.
I wanted to see which messages go back and forth to Snips, to find out what is needed for a correct Dialogue session.

That is why I installed the demo assistant.
I followed https://docs.snips.ai/articles/raspberrypi/manual-setup, which is old but still the latest version :)

As you can see, the hotword is detected ok, I also tested the mic outside snips. All works.
But I still get a timeout, it should detect the weather intent

ok edit, no ASR is running ;)

Keeping my fingers crossed over here :-)

I followed this step and now snips is working :)

snipsco/snips-issues#161

progress :)

image

Hey wow!! Great work!

What's still left to do? It looks to me like a 100% success?

This was a test with a Pi with a built-in mic.
To verify that snips still works.

The next step for me is to check and adjust the code for the streamer to get Snips going (again).
When that works, I can move on to Voco :)
First with a built-in mic and, if that works, with the streamer. So there is a road ahead :)

Ah I see. Are you sure the effort is worth it? We could just wait for Voco to move to Rhasspy. That has to happen at some point anyway.

Well, it would be nice to find what the issue is, but that is just me :D

I see RIFF4 in the messages, as you also found. The number of bytes sent per message is also 572 instead of the 556 that the streamer sends.
I believe this is the root cause of the bad performance, the wave audio does not match.

That's what I figured as well.

Sorry about my lack of commitment, my sd card died so I had to start over which I did not have time for.
This is still on my to-do list however :)

No worries. Pretty busy over here as well :-)

I have the code working with snips, however there are some header changes which I can't get right atm.
When hey snips is spoken, it works if you speak it a bit slower, so it is most likely a wave format thing.
I have recorded an audio file which triggers the hotword 100% of the time with the code. I will attach that to the branch :)

That's great news! I'm going to check this out asap!

This is the result of the audio file
image

After the hotword it picks up the audio as expected
image

The timeout occurs because I have no intent handler.

Where can I find the code? I had a look at the voco branch, but that doesn't seem to be it?

Just pushed it. I have also included a record.py which records a stream for a couple of seconds.
I was experimenting with the header but found, strangely enough, that I did not have to make a change.

The branch I just pushed works with the Atom Echo, but the recording level seems to be very low. I assume this will also be the case with the master branch.

Cool, I will try it now!

It took a bit of work to be able to upload it via the Arduino IDE again. I had to strip out the LED parts.

Now that it uploads, I get this error. Nothing to worry about, I just have to look into it.

22:17:17.029 -> Creating I2Stask
22:17:17.029 -> Enter WifiDisconnected
22:17:17.029 -> Total heap: 293476
22:17:17.029 -> Free heap: 231364
22:17:19.401 -> Enter WifiConnected
22:17:19.401 -> Connected to Wifi with IP: 192.168.2.137, SSID: mywifiname, BSSID: B0:95:75:9F:FE:CF, RSSI: -49
22:17:19.438 -> Enter MQTTDisconnected
22:17:19.438 -> Connecting MQTT: 192.168.2.166, 1883
22:17:19.475 -> end of setupEnter MQTTConnected
22:17:19.586 -> Connected as atomecho
22:17:19.586 -> Enter Idle
22:17:19.586 -> Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.
22:17:19.586 -> Core 0 register dump:
22:17:19.586 -> PC      : 0x4015618e  PS      : 0x00060b30  A0      : 0x800d11ec  A1      : 0x3ffde570  
22:17:19.586 -> A2      : 0x00000000  A3      : 0x3ffde5e0  A4      : 0x00000200  A5      : 0x3ffde5ac  
22:17:19.623 -> A6      : 0x00000064  A7      : 0x00000000  A8      : 0x00000001  A9      : 0x0000000b  
22:17:19.623 -> A10     : 0x3ffb96ac  A11     : 0x3ffde58f  A12     : 0x00000000  A13     : 0x00000008  
22:17:19.623 -> A14     : 0x00060b23  A15     : 0x00000000  SAR     : 0x00000000  EXCCAUSE: 0x0000001c  
22:17:19.623 -> EXCVADDR: 0x00000014  LBEG    : 0x00000000  LEND    : 0x00000000  LCOUNT  : 0x00000000  
22:17:19.660 -> 
22:17:19.660 -> ELF file SHA256: 0000000000000000
22:17:19.660 -> 
22:17:19.660 -> Backtrace: 0x4015618e:0x3ffde570 0x400d11e9:0x3ffde5a0 0x400d2174:0x3ffde5d0 0x40089bce:0x3ffdea50
22:17:19.660 -> 
22:17:19.660 -> Rebooting...
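
For reading a dump like this one, a short sketch that pulls the registers out of the serial log and flags the common pattern of a load from a very small address. On the ESP32, EXCCAUSE 0x1c is LoadProhibited, and an EXCVADDR of 0x14 usually points at a field being read through a null pointer. The 0x1000 threshold and the function names are illustrative choices, not something defined by the firmware.

```python
import re

DUMP = """\
22:17:19.586 -> PC      : 0x4015618e  PS      : 0x00060b30  A0      : 0x800d11ec  A1      : 0x3ffde570
22:17:19.586 -> A2      : 0x00000000  A3      : 0x3ffde5e0  A4      : 0x00000200  A5      : 0x3ffde5ac
22:17:19.623 -> A14     : 0x00060b23  A15     : 0x00000000  SAR     : 0x00000000  EXCCAUSE: 0x0000001c
22:17:19.623 -> EXCVADDR: 0x00000014  LBEG    : 0x00000000  LEND    : 0x00000000  LCOUNT  : 0x00000000
22:17:19.660 -> Backtrace: 0x4015618e:0x3ffde570 0x400d11e9:0x3ffde5a0
"""

LOAD_PROHIBITED = 0x1C   # Xtensa exception cause 28


def parse_registers(dump: str) -> dict:
    """Map upper-case register names to integer values; skips the backtrace."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]*)\s*:\s*(0x[0-9a-fA-F]{8})", dump)
    return {name: int(value, 16) for name, value in pairs}


def looks_like_null_deref(regs: dict, limit: int = 0x1000) -> bool:
    """True for a prohibited load from an address close to zero."""
    return regs.get("EXCCAUSE") == LOAD_PROHIBITED and regs.get("EXCVADDR", limit) < limit


regs = parse_registers(DUMP)
print(hex(regs["PC"]), hex(regs["EXCVADDR"]), looks_like_null_deref(regs))
```

The PC and backtrace addresses still need to be resolved against the ELF file to find the offending line.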

Got a bit further.

Should I change some of the settings to get it to continuously stream audio? I'm assuming I should not use hotword detection.

The Connected as null is a bit strange. I tried adding siteid to the config file, but that didn't seem to solve it.

22:41:39.027 -> Rebooting...
22:41:39.027 -> ets Jun  8 2016 00:22:57
22:41:39.027 -> 
22:41:39.027 -> rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
22:41:39.027 -> configsip: 188777542, SPIWP:0xee
22:41:39.027 -> clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
22:41:39.027 -> mode:DIO, clock div:1
22:41:39.027 -> load:0x3fff0018,len:4
22:41:39.061 -> load:0x3fff001c,len:1044
22:41:39.061 -> load:0x40078000,len:10124
22:41:39.061 -> load:0x40080400,len:5856
22:41:39.061 -> entry 0x400806a8
22:41:39.386 -> Boo⸮M5Atom initializing...Loading configuration
22:41:39.574 -> {
22:41:39.574 ->   "mqtt_host": "192.168.2.166",
22:41:39.574 ->   "mqtt_port": 1883,
22:41:39.574 ->   "mqtt_user": "",
22:41:39.574 ->   "mqtt_pass": "",
22:41:39.574 ->   "sideid": "atomecho",
22:41:39.574 ->   "mqtt_valid": true,
22:41:39.574 ->   "mute_input": false,
22:41:39.574 ->   "mute_output": false,
22:41:39.574 ->   "amp_output": 0,
22:41:39.574 ->   "brightness": 30,
22:41:39.574 ->   "hotword_brightness": 100,
22:41:39.574 ->   "hotword_detection": 0,
22:41:39.574 ->   "volume": 100,
22:41:39.611 ->   "gain": 5
22:41:39.611 -> }
22:41:39.611 -> Creating I2Stask
22:41:39.611 -> Enter WifiDisconnected
22:41:39.611 -> Total heap: 293084
22:41:39.611 -> Free heap: 230916
22:41:43.910 -> Enter WifiConnected
22:41:43.910 -> Connected to Wifi with IP: 192.168.2.137, SSID: sterrenkijker_nomap, BSSID: B0:95:75:9F:FE:CF, RSSI: -50
22:41:43.945 -> Enter MQTTDisconnected
22:41:43.945 -> Connecting MQTT: 192.168.2.166, 1883
22:41:43.945 -> end of setupConnect failed, retry
22:41:48.932 -> Audio connected: 1, Async connected: 0
22:41:48.932 -> Enter MQTTDisconnected
22:41:48.932 -> Connecting MQTT: 192.168.2.166, 1883
22:42:20.105 -> Connect failed, retry
22:42:20.105 -> Audio connected: 0, Async connected: 0
22:42:20.105 -> Enter MQTTDisconnected
22:42:20.105 -> Connecting MQTT: 192.168.2.166, 1883
22:42:38.203 -> Connect failed, retry
22:42:38.203 -> Audio connected: 0, Async connected: 0
22:42:38.203 -> Enter MQTTDisconnected
22:42:38.240 -> Connecting MQTT: 192.168.2.166, 1883
22:42:56.724 -> Connect failed, retry
22:42:56.724 -> Audio connected: 0, Async connected: 0
22:42:56.724 -> Enter MQTTDisconnected
22:42:56.724 -> Connecting MQTT: 192.168.2.166, 1883
22:43:15.229 -> Connect failed, retry
22:43:15.229 -> Audio connected: 0, Async connected: 0
22:43:15.229 -> Enter MQTTDisconnected
22:43:15.229 -> Connecting MQTT: 192.168.2.166, 1883
22:43:33.736 -> Connect failed, retry
22:43:33.736 -> Audio connected: 0, Async connected: 0
22:43:33.772 -> Enter MQTTDisconnected
22:43:33.772 -> Connecting MQTT: 192.168.2.166, 1883
22:43:52.241 -> Connect failed, retry
22:43:52.279 -> Audio connected: 0, Async connected: 0
22:43:52.279 -> Enter MQTTDisconnected
22:43:52.279 -> Connecting MQTT: 192.168.2.166, 1883
22:44:02.380 -> Connect failed, retry
22:44:02.380 -> Audio connected: 1, Async connected: 0
22:44:02.380 -> Enter MQTTDisconnected
22:44:02.380 -> Connecting MQTT: 192.168.2.166, 1883
22:44:02.492 -> Enter MQTTConnected
22:44:02.492 -> Connected as null
22:44:02.492 -> going from mqtt connected to idle
22:44:02.492 -> Enter Idle
22:44:02.492 -> still in idle
22:44:02.492 -> end of idle
22:44:02.492 -> Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.
22:44:02.492 -> Core 0 register dump:
22:44:02.492 -> PC      : 0x40157776  PS      : 0x00060d30  A0      : 0x800d161c  A1      : 0x3ffde570  
22:44:02.492 -> A2      : 0x00000000  A3      : 0x3ffde5e0  A4      : 0x00000200  A5      : 0x3ffde5ac  
22:44:02.525 -> A6      : 0x00000064  A7      : 0x00000000  A8      : 0x0000000b  A9      : 0x00000068  
22:44:02.525 -> A10     : 0x3ffb20d0  A11     : 0x3ffde58f  A12     : 0x00000000  A13     : 0x00000008  
22:44:02.525 -> A14     : 0x00060d23  A15     : 0x00000000  SAR     : 0x00000000  EXCCAUSE: 0x0000001c  
22:44:02.525 -> EXCVADDR: 0x00000014  LBEG    : 0x00000000  LEND    : 0x00000000  LCOUNT  : 0x00000000  
22:44:02.525 -> 
22:44:02.560 -> ELF file SHA256: 0000000000000000
22:44:02.560 -> 

In case you're curious, I've uploaded my current code here. I will look into it more this weekend.
https://github.com/flatsiedatsie/voco_mini_sat

Should I change some of the settings to get it to continuously stream audio?

Yes, the connected as null seems to imply that the config.siteid is empty.
You should be able to browse to 192.168.2.137 and check the siteid.

You have "sideid": "atomecho", in your config, that is why it is not filled most probably.
Should be siteid in this file: https://github.com/flatsiedatsie/voco_mini_sat/blob/main/data/config.json

That setting is everywhere, so it might also be the cause of the crashes, but I cannot be sure about that
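
A misspelled key like this is easy to miss by eye. As a sketch, a config file could be checked against the expected key names before uploading. The key list below is copied from the configuration printed in the serial log in this thread, not from the firmware source, and the function name is hypothetical.

```python
import difflib
import json

# Key names as printed in the serial log (assumed to be the complete set).
EXPECTED_KEYS = [
    "siteid", "mqtt_host", "mqtt_port", "mqtt_user", "mqtt_pass",
    "mqtt_valid", "mute_input", "mute_output", "amp_output", "brightness",
    "hotword_brightness", "hotword_detection", "volume", "gain",
]


def unknown_keys(config: dict) -> dict:
    """Map each unexpected key to its closest expected key, or None."""
    result = {}
    for key in config:
        if key not in EXPECTED_KEYS:
            close = difflib.get_close_matches(key, EXPECTED_KEYS, n=1)
            result[key] = close[0] if close else None
    return result


config = json.loads('{"mqtt_host": "192.168.2.166", "mqtt_port": 1883, "sideid": "atomecho"}')
print(unknown_keys(config))   # {'sideid': 'siteid'}
```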

You have "sideid": "atomecho", in your config, that is why it is not filled most probably.

I added that after I noticed that the value was null.

I didn't realise I should visit the device's webpage.

Ah, I now see I made a typo in the config file. Thanks.

This is where I'm at now. It connects to wifi, I can open the server config page (cool). It says audio is connected, but MQTT isn't. When it does suddenly connect, it crashes.

07:57:27.321 -> Boo�M5Atom initializing...Loading configuration
07:57:27.541 -> {
07:57:27.541 ->   "siteid": "ATOMECHO",
07:57:27.541 ->   "mqtt_host": "192.168.2.166",
07:57:27.541 ->   "mqtt_port": 1883,
07:57:27.541 ->   "mqtt_user": "",
07:57:27.541 ->   "mqtt_pass": "",
07:57:27.541 ->   "mqtt_valid": true,
07:57:27.541 ->   "mute_input": false,
07:57:27.541 ->   "mute_output": false,
07:57:27.541 ->   "amp_output": 0,
07:57:27.541 ->   "brightness": 30,
07:57:27.541 ->   "hotword_brightness": 100,
07:57:27.541 ->   "hotword_detection": 0,
07:57:27.541 ->   "volume": 100,
07:57:27.541 ->   "gain": 5
07:57:27.541 -> }
07:57:27.541 -> Creating I2Stask
07:57:27.541 -> Enter WifiDisconnected
07:57:27.541 -> Total heap: 293072
07:57:27.577 -> Free heap: 230748
07:57:32.604 -> Enter WifiConnected
07:57:32.604 -> Connected to Wifi with IP: 192.168.2.137, SSID: sterrenkijker_nomap, BSSID: B0:95:75:9F:FE:CF, RSSI: -58
07:57:32.641 -> Enter MQTTDisconnected
07:57:32.641 -> Connecting MQTT: 192.168.2.166, 1883
07:57:32.678 -> end of setupConnect failed, retry
07:57:37.604 -> Audio connected: 0, Async connected: 0
07:57:37.641 -> Enter MQTTDisconnected
07:57:37.641 -> Connecting MQTT: 192.168.2.166, 1883
07:57:55.890 -> Connect failed, retry
07:57:55.890 -> Audio connected: 0, Async connected: 0
07:57:55.890 -> Enter MQTTDisconnected
07:57:55.890 -> Connecting MQTT: 192.168.2.166, 1883
07:58:14.407 -> Connect failed, retry
07:58:14.407 -> Audio connected: 0, Async connected: 0
07:58:14.407 -> Enter MQTTDisconnected
07:58:14.407 -> Connecting MQTT: 192.168.2.166, 1883
07:58:22.184 -> Connect failed, retry
07:58:22.184 -> Audio connected: 1, Async connected: 0
07:58:22.184 -> Enter MQTTDisconnected
07:58:22.184 -> Connecting MQTT: 192.168.2.166, 1883
07:58:28.027 -> Connect failed, retry
07:58:28.027 -> Audio connected: 1, Async connected: 0
07:58:28.027 -> Enter MQTTDisconnected
07:58:28.027 -> Connecting MQTT: 192.168.2.166, 1883
07:58:33.045 -> Connect failed, retry
07:58:33.045 -> Audio connected: 1, Async connected: 0
07:58:33.045 -> Enter MQTTDisconnected
07:58:33.045 -> Connecting MQTT: 192.168.2.166, 1883
07:58:38.026 -> Connect failed, retry
07:58:38.026 -> Audio connected: 1, Async connected: 0
07:58:38.026 -> Enter MQTTDisconnected
07:58:38.026 -> Connecting MQTT: 192.168.2.166, 1883
07:58:43.031 -> Connect failed, retry
07:58:43.031 -> Audio connected: 1, Async connected: 0
07:58:43.031 -> Enter MQTTDisconnected
07:58:43.031 -> Connecting MQTT: 192.168.2.166, 1883
07:58:48.044 -> Connect failed, retry
07:58:48.044 -> Audio connected: 1, Async connected: 0
07:58:48.044 -> Enter MQTTDisconnected
07:58:48.044 -> Connecting MQTT: 192.168.2.166, 1883
07:58:53.599 -> Connect failed, retry
07:58:53.599 -> Audio connected: 1, Async connected: 0
07:58:53.599 -> Enter MQTTDisconnected
07:58:53.599 -> Connecting MQTT: 192.168.2.166, 1883
07:58:58.593 -> Connect failed, retry
07:58:58.593 -> Audio connected: 1, Async connected: 0
07:58:58.593 -> Enter MQTTDisconnected
07:58:58.629 -> Connecting MQTT: 192.168.2.166, 1883
07:59:03.593 -> Connect failed, retry
07:59:03.593 -> Audio connected: 1, Async connected: 0
07:59:03.631 -> Enter MQTTDisconnected
07:59:03.631 -> Connecting MQTT: 192.168.2.166, 1883
07:59:08.602 -> Connect failed, retry
07:59:08.602 -> Audio connected: 1, Async connected: 0
07:59:08.602 -> Enter MQTTDisconnected
07:59:08.602 -> Connecting MQTT: 192.168.2.166, 1883
07:59:13.614 -> Connect failed, retry
07:59:13.614 -> Audio connected: 1, Async connected: 0
07:59:13.614 -> Enter MQTTDisconnected
07:59:13.614 -> Connecting MQTT: 192.168.2.166, 1883
07:59:18.606 -> Connect failed, retry
07:59:18.606 -> Audio connected: 1, Async connected: 0
07:59:18.606 -> Enter MQTTDisconnected
07:59:18.606 -> Connecting MQTT: 192.168.2.166, 1883
07:59:23.610 -> Connect failed, retry
07:59:23.610 -> Audio connected: 1, Async connected: 0
07:59:23.610 -> Enter MQTTDisconnected
07:59:23.610 -> Connecting MQTT: 192.168.2.166, 1883
07:59:28.628 -> Connect failed, retry
07:59:28.628 -> Audio connected: 1, Async connected: 0
07:59:28.628 -> Enter MQTTDisconnected
07:59:28.628 -> Connecting MQTT: 192.168.2.166, 1883
07:59:33.618 -> Connect failed, retry
07:59:33.618 -> Audio connected: 1, Async connected: 0
07:59:33.618 -> Enter MQTTDisconnected
07:59:33.618 -> Connecting MQTT: 192.168.2.166, 1883
07:59:38.617 -> Connect failed, retry
07:59:38.617 -> Audio connected: 1, Async connected: 0
07:59:38.617 -> Enter MQTTDisconnected
07:59:38.617 -> Connecting MQTT: 192.168.2.166, 1883
07:59:43.615 -> Connect failed, retry
07:59:43.615 -> Audio connected: 1, Async connected: 0
07:59:43.615 -> Enter MQTTDisconnected
07:59:43.615 -> Connecting MQTT: 192.168.2.166, 1883
07:59:48.624 -> Connect failed, retry
07:59:48.624 -> Audio connected: 1, Async connected: 0
07:59:48.624 -> Enter MQTTDisconnected
07:59:48.624 -> Connecting MQTT: 192.168.2.166, 1883
07:59:53.618 -> Connect failed, retry
07:59:53.618 -> Audio connected: 1, Async connected: 0
07:59:53.618 -> Enter MQTTDisconnected
07:59:53.618 -> Connecting MQTT: 192.168.2.166, 1883
07:59:58.612 -> Connect failed, retry
07:59:58.612 -> Audio connected: 1, Async connected: 0
07:59:58.612 -> Enter MQTTDisconnected
07:59:58.612 -> Connecting MQTT: 192.168.2.166, 1883
08:00:03.612 -> Connect failed, retry
08:00:03.612 -> Audio connected: 1, Async connected: 0
08:00:03.612 -> Enter MQTTDisconnected
08:00:03.612 -> Connecting MQTT: 192.168.2.166, 1883
08:00:08.639 -> Connect failed, retry
08:00:08.639 -> Audio connected: 1, Async connected: 0
08:00:08.639 -> Enter MQTTDisconnected
08:00:08.639 -> Connecting MQTT: 192.168.2.166, 1883
08:00:13.608 -> Connect failed, retry
08:00:13.608 -> Audio connected: 1, Async connected: 0
08:00:13.608 -> Enter MQTTDisconnected
08:00:13.646 -> Connecting MQTT: 192.168.2.166, 1883
08:00:18.632 -> Connect failed, retry
08:00:18.632 -> Audio connected: 1, Async connected: 0
08:00:18.632 -> Enter MQTTDisconnected
08:00:18.632 -> Connecting MQTT: 192.168.2.166, 1883
08:00:23.607 -> Connect failed, retry
08:00:23.607 -> Audio connected: 1, Async connected: 0
08:00:23.645 -> Enter MQTTDisconnected
08:00:23.645 -> Connecting MQTT: 192.168.2.166, 1883
08:00:28.645 -> Connect failed, retry
08:00:28.645 -> Audio connected: 1, Async connected: 0
08:00:28.645 -> Enter MQTTDisconnected
08:00:28.645 -> Connecting MQTT: 192.168.2.166, 1883
08:00:33.632 -> Connect failed, retry
08:00:33.632 -> Audio connected: 1, Async connected: 0
08:00:33.632 -> Enter MQTTDisconnected
08:00:33.632 -> Connecting MQTT: 192.168.2.166, 1883
08:00:38.609 -> Connect failed, retry
08:00:38.645 -> Audio connected: 1, Async connected: 0
08:00:38.645 -> Enter MQTTDisconnected
08:00:38.645 -> Connecting MQTT: 192.168.2.166, 1883
08:00:43.633 -> Connect failed, retry
08:00:43.633 -> Audio connected: 1, Async connected: 0
08:00:43.633 -> Enter MQTTDisconnected
08:00:43.633 -> Connecting MQTT: 192.168.2.166, 1883
08:00:43.818 -> Enter MQTTConnected
08:00:43.818 -> Connected as ATOMECHO
08:00:43.818 -> going from mqtt connected to idle
08:00:43.818 -> Enter Idle
08:00:43.818 -> still in idle
08:00:43.818 -> end of idle
08:00:43.818 -> Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.

From there it enters a reboot loop, probably because the router doesn't let it onto the network again quickly enough after the crash.

08:04:49.497 -> Connection Failed! Retry...
08:04:59.478 -> Connection Failed! Retry...
08:05:09.497 -> Connection Failed! Rebooting...

Perhaps something in the MQTT correspondence is not sitting well.

It says Audio connected: 1, Async connected: 0

This indicates that the streamer audio is connected, but the async client has a problem. I cannot tell why.

What does the async client do? Is that the webserver?

Could this setting be related?

#define CONFIG_ASYNC_TCP_RUNNING_CORE 1

Maybe I have to transplant some more settings from this file into the Arduino file.

What does the async client do? Is that the webserver?

Could this setting be related?

#define CONFIG_ASYNC_TCP_RUNNING_CORE 1

Absolutely. By default that task runs on core 0, where WiFi and other tasks run as well.

So I2Stask should run on core 1? I guess the audio streaming has core 1 all to itself?

In the code I see it running on core 0. I changed it to 1 here:

    if (i2sHandle == NULL) {
      Serial.println("Creating I2Stask");
      xTaskCreatePinnedToCore(I2Stask, "I2Stask", 30000, NULL, 3, &i2sHandle, 1); // core was 0, changed it to 1
    } else {
      Serial.println("We already have a I2Stask");
    }

No, the I2Stask should run on core 0. This is because the default core is core 1, but the I2Stask should have as much CPU power as possible.
The MQTT task should run on core 1, but that task is created internally by the client, which is why you need the define.

Removing #define CONFIG_ASYNC_TCP_RUNNING_CORE 1 made it much more stable.

However, it seemed to run even better with I2Stask running on core 1 :-D

For my understanding:

  • I2Stask is the audio streamer? Is core 0 reserved for it? And all the other things run on core 1?
  • Is the voco code currently set to permanently stream audio to the main controller after booting up? In theory the main controller could then detect a hotword in the stream. I currently can't get it to detect the hotword, so perhaps it's not streaming? // I found xEventGroupSetBits(audioGroup, STREAM); in the idle entry phase, so it would seem so.
  • What happens/should happen if I press the Atom Echo's button? Currently it triggers a "hotword detected". I've tried to then speak a command, but it hasn't recognised this yet.

Voco is receiving something if I press the button. If memory serves, this shows that it was ready to receive audio, but it didn't catch any intent, and then switched back to normal mode.

2021-05-31 18:38:00.197 INFO   : voco: MQTT message to topic hermes/hotword/toggleOn received on: azrxidia a.k.a. hostname thuis
2021-05-31 18:38:00.197 INFO   : voco: +
2021-05-31 18:38:00.198 INFO   : voco: {"siteId":"ATOMECHO","sessionId":null}
2021-05-31 18:38:00.198 INFO   : voco: +
2021-05-31 18:38:00.199 INFO   : voco: In unmute. current_control_name: Headphone
2021-05-31 18:38:00.199 INFO   : voco: No intent received
2021-05-31 18:38:00.200 INFO   : voco: siteId was in /toggleOn payload: ATOMECHO

I can also get the Echo to receive the response from Voco to play a sound (which it doesn't do yet).

18:37:17.020 -> Connected as ATOMECHO
18:37:17.020 -> Connected to asynch MQTT!
18:37:17.020 -> going from mqtt connected to idle
18:37:17.020 -> Enter Idle
18:37:17.020 -> still in idle
18:37:17.020 -> end of idle
18:37:25.252 -> Incoming MQTT message. Topic: hermes/hotword/toggleOff
18:37:25.252 -> toggleOff message was for us
18:37:25.252 -> SessionId in toggleOff
18:37:25.252 -> Hotword detected event
18:37:25.252 -> Enter HotwordDetected
18:37:50.999 -> Incoming MQTT message. Topic: hermes/hotword/toggleOn
18:37:50.999 -> toggleOn message was for us. Going to idle mode.
18:37:50.999 -> Enter Idle
18:37:50.999 -> still in idle
18:37:50.999 -> end of idleEnter MQTTDisconnected
18:37:50.999 -> Incoming MQTT message. Topic: 
18:37:51.032 -> hermes/voco/ATOMECHO/play
18:37:51.032 -> Connecting MQTT: 192.168.2.166, 1883
18:38:01.011 -> Connect failed, retry
  • I2Stask is the audio streamer? Is core 0 reserved for it? And all the other things run on core 1?
    Yes
  • Is the voco code currently set to permanently stream audio to the main controller after booting up? In theory the main controller could then detect a hotword in the stream. I currently can't get it to detect the hotword, so perhaps it's not streaming?
    Yes, it is a permanent stream. You can check it by connecting to the broker and subscribe to hermes/audioServer/#
  • What happens/should happen if I press the Atom Echo's button? Currently it triggers a "hotword detected". I've tried to then speak a command, but it hasn't recognised this yet.
    The button triggers the hotword, so it changes state. If there is no stream, no command will be caught.

Based on 2 and 3, I think your code is not streaming.

I see you are using 18:37:51.032 -> hermes/voco/ATOMECHO/play.
Did you change the topics in General.hpp as well?
Because Snips uses hermes/audioServer/atomecho

No I left all that as it was. I only added that one extra subscription to see if it was getting anything in return. And it was. So MQTT seemed to be working.

I just checked: the code does reach the point where it streams audio data onto the network continuously.

Do you see any messages on the audioFrame topic? hermes/audioServer/atomecho/audioFrame.

I don't know if topics are case sensitive, you might want to try all lowercase just to be sure.
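For reference, the MQTT specification treats topic names and topic filters as case sensitive, so atomecho and ATOMECHO are different topic levels. The sketch below is a simplified, host-side illustration of how a filter such as hermes/audioServer/ATOMECHO/# is compared against a topic. The function names are made up for this example and are not part of the satellite code or of any MQTT library, and the special handling of topics starting with $ is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split a topic or filter into its '/'-separated levels.
static std::vector<std::string> splitLevels(const std::string& s) {
    std::vector<std::string> levels;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type slash = s.find('/', start);
        if (slash == std::string::npos) {
            levels.push_back(s.substr(start));
            return levels;
        }
        levels.push_back(s.substr(start, slash - start));
        start = slash + 1;
    }
}

// Hypothetical helper: returns true if `topic` matches `filter`.
// '+' matches exactly one level, '#' matches all remaining levels.
// Literal levels are compared byte for byte, so case matters.
bool topicMatches(const std::string& filter, const std::string& topic) {
    const std::vector<std::string> f = splitLevels(filter);
    const std::vector<std::string> t = splitLevels(topic);
    for (std::size_t i = 0; i < f.size(); ++i) {
        if (f[i] == "#") return true;          // rest of the topic is accepted
        if (i >= t.size()) return false;       // topic is shorter than the filter
        if (f[i] != "+" && f[i] != t[i]) return false;
    }
    return f.size() == t.size();
}
```

With this, topicMatches("hermes/audioServer/ATOMECHO/#", "hermes/audioServer/atomecho/audioFrame") is false, which is why the siteId casing has to be identical on the satellite and on the subscriber.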

I see this in the log as well:
18:37:50.999 -> still in idle
18:37:50.999 -> end of idleEnter MQTTDisconnected

Apparently, the code disconnects. Is there some logging on the broker?

Coincidentally I was just checking this. I don't see any data arriving on mosquitto_sub -t 'hermes/audioServer/ATOMECHO/#'.

I checked if the correct path was set in Arduino, and it seems ok: hermes/audioServer/ATOMECHO/audioFrame

14:15:54.191 -> Enter WifiDisconnected
14:15:54.191 -> Total heap: 295944
14:15:54.191 -> Free heap: 233724
14:15:56.444 -> Enter WifiConnected
14:15:56.444 -> Connected to Wifi with IP: 192.168.2.137, SSID: sterrenkijker_nomap, BSSID: B0:95:75:9F:FE:CF, RSSI: -45
14:15:56.444 -> Enter MQTTDisconnected
14:15:56.444 -> Connecting MQTT: 192.168.2.166, 1883
14:15:56.444 -> asyncclient connect was called
14:15:56.444 -> also reconnecting to audio
14:15:56.519 -> hermes/audioServer/ATOMECHO/audioFrame
14:15:56.519 -> end of setup
14:15:56.519 -> BOTH CONNECTED
14:15:56.519 -> Enter MQTTConnected
14:15:56.519 -> Connected as ATOMECHO
14:15:56.555 -> Connected to asynch MQTT!
14:15:56.555 -> going from mqtt connected to idle
14:15:56.555 -> Enter Idle
14:15:56.555 -> still in idle
14:15:56.555 -> end of idle. Stream was set to true.
14:15:56.555 -> Total heap: 293988
14:15:56.555 -> Free heap: 158844
14:15:56.555 -> Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.

I checked that it was reaching the end of the i2stask too, where it copies the data into a variable.

Could the issue lie with the MQTT part not liking this message? Perhaps the larger MQTT buffer isn't set correctly. Or maybe it's just swamped? From what I could find online, that error could mean an out of memory issue.

I managed to receive audio! I modified the PubSubClient to have a larger buffer.

An example of the Snips output:

hermes/audioServer/azrxidia/audioFrame RIFF4WAVEfmt ?>}timc???ydata [binary audio payload omitted]

And the Atom Echo's output

hermes/audioServer/ATOMECHO/audioFrame RIFF,WAVEfmt ?>}data [binary audio payload omitted]
hermes/audioServer/azrxidia/audioFrame RIFF4WAVEfmt ?>}tim????ydata [binary audio payload omitted]
hermes/audioServer/ATOMECHO/audioFrame RIFF,WAVEfmt ?>}data [binary audio payload omitted]
hermes/audioServer/azrxidia/audioFrame RIFF4WAVEfmt ?>}tim????ydata [binary audio payload omitted]
hermes/audioServer/ATOMECHO/audioFrame RIFF,WAVEfmt ?>}data [binary audio payload omitted]
hermes/audioServer/azrxidia/audioFrame RIFF4WAVEfmt ?>}tim????ydata [binary audio payload omitted]
hermes/audioServer/azrxidia/audioFrame RIFF4WAVEfmt ?>}tim???ydata [binary audio payload omitted]
hermes/audioServer/azrxidia/audioFrame RIFF4WAVEfmt ?>}tim???ydata [binary audio payload omitted]
hermes/audioServer/azrxidia/audioFrame RIFF4WAVEfmt ?>}timc???ydata
hermes/audioServer/ATOMECHO/audioFrame RIFF,WAVEfmt ?>}data

I see a difference between the two strings, where the Snips one has }timc???ydata where the Atom has }data only?

I managed to receive audio! I modified the PubSubClient to have a larger buffer.

Ah ok, this is the ("MQTT_MAX_PACKET_SIZE", 2000) setting in this file ;)

https://github.com/Romkabouter/ESP32-Rhasspy-Satellite/blob/master/PlatformIO/load_settings.py

Snips has an additional header, found here:
https://github.com/snipsco/hermes-protocol/blob/develop/hermes/src/ontology/audio_server.rs#L25..L87

I have no such extra bytes, which is why the header is different,
I have tested this header with Snips and that worked for me, so I think the extra bytes add some functionality for Snips but do not cause issues when they are missing.
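To compare the two headers without reading raw terminal output, the chunks of a captured frame can be listed on a PC. This is a rough sketch that only assumes the standard RIFF layout (a 12-byte RIFF/WAVE preamble followed by chunks made of a 4-byte ID, a 4-byte little-endian size and the payload). The helper names and the example frame are invented for illustration. In particular, the "time" chunk ID and its 8-byte size are an assumption based on how the Snips header prints above, not something taken from the Snips source.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Read a 32-bit little-endian integer at offset `pos`.
static uint32_t readLe32(const std::vector<uint8_t>& b, std::size_t pos) {
    return static_cast<uint32_t>(b[pos]) |
           (static_cast<uint32_t>(b[pos + 1]) << 8) |
           (static_cast<uint32_t>(b[pos + 2]) << 16) |
           (static_cast<uint32_t>(b[pos + 3]) << 24);
}

// Append a chunk with a zero-filled payload (test helper, hypothetical).
static void appendChunk(std::vector<uint8_t>& b, const std::string& id, uint32_t size) {
    b.insert(b.end(), id.begin(), id.end());
    for (int shift = 0; shift < 32; shift += 8) {
        b.push_back(static_cast<uint8_t>(size >> shift));
    }
    b.insert(b.end(), size + (size & 1u), uint8_t{0});  // payload plus pad byte if odd
}

// Return the chunk IDs that follow the "RIFF....WAVE" preamble.
// Returns an empty list if the buffer is not a RIFF/WAVE container.
std::vector<std::string> listWavChunks(const std::vector<uint8_t>& b) {
    std::vector<std::string> ids;
    if (b.size() < 12 || std::string(b.begin(), b.begin() + 4) != "RIFF" ||
        std::string(b.begin() + 8, b.begin() + 12) != "WAVE") {
        return ids;
    }
    std::size_t pos = 12;
    while (pos + 8 <= b.size()) {
        ids.emplace_back(b.begin() + pos, b.begin() + pos + 4);
        const std::size_t size = readLe32(b, pos + 4);
        pos += 8 + size + (size & 1u);  // chunks are padded to an even length
    }
    return ids;
}

// Build an example frame; the RIFF size field is left at zero because
// the walker above ignores it. The "time" chunk is an assumed layout.
std::vector<uint8_t> buildExampleFrame(bool withTimeChunk) {
    std::vector<uint8_t> b = {'R', 'I', 'F', 'F', 0, 0, 0, 0, 'W', 'A', 'V', 'E'};
    appendChunk(b, "fmt ", 16);
    if (withTimeChunk) appendChunk(b, "time", 8);
    appendChunk(b, "data", 512);
    return b;
}
```

Running listWavChunks on a frame saved from each topic should show whether the only difference is the extra chunk between fmt and data.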

Yes, I already had an MQTT_MAX_PACKET_SIZE define in the code, but it did not seem to take effect. The makers of PubSubClient seem to recommend using a function to change this value, so I added that.

audioServer.setBufferSize(MQTT_MAX_PACKET_SIZE);
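As a sanity check on the buffer size, the wire size of one audioFrame message can be estimated from the MQTT packet layout: one fixed header byte, a variable-length "remaining length" field of 1 to 4 bytes, a 2-byte topic length, the topic and the payload. The sketch below is plain C++ with made-up function names. The 556-byte payload used in the note afterwards (a 44-byte WAV header plus 256 16-bit samples) is an assumption, so substitute the real frame size. A client library may also reserve a few extra header bytes on top of this.

```cpp
#include <cstddef>

// Number of bytes the MQTT "remaining length" field occupies (1 to 4).
std::size_t remainingLengthBytes(std::size_t remaining) {
    std::size_t n = 1;
    while (remaining >= 128) {
        remaining /= 128;  // each byte carries 7 bits of the length
        ++n;
    }
    return n;
}

// Wire size of a QoS 0 PUBLISH packet (no packet identifier).
std::size_t publishPacketSize(std::size_t topicLen, std::size_t payloadLen) {
    const std::size_t remaining = 2 + topicLen + payloadLen;  // 2-byte topic length prefix
    return 1 + remainingLengthBytes(remaining) + remaining;
}
```

For the 38-character topic hermes/audioServer/ATOMECHO/audioFrame and the assumed 556-byte payload this gives 599 bytes, well above a small default buffer but within the 2000 bytes mentioned above.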

We're getting closer :-)

I tried recording the audio using the tool you created. It worked! There is a slight metallic sound to it, but it's definitely understandable. The volume is very low though. I'll try playing with the gain option.

What is the gain range? What would you recommend for getting more volume?

Ah, 0 to 8 (from the web ui)

Gain is actually only used in the Matrix Voice I think. Expect unexpected results!
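If the device gain setting has no effect on the Atom Echo, one generic workaround is to amplify the 16-bit samples in software before they are published. The sketch below is not taken from the satellite firmware. The function name is invented, and how the web UI value of 0 to 8 would map onto the multiplier is left open. It multiplies each sample and clips to the int16 range so that loud peaks do not wrap around.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: scale signed 16-bit PCM samples in place.
// Values that would overflow are clipped instead of wrapping.
void applyGain(std::vector<int16_t>& samples, int multiplier) {
    const int32_t lo = std::numeric_limits<int16_t>::min();
    const int32_t hi = std::numeric_limits<int16_t>::max();
    for (int16_t& s : samples) {
        const int32_t scaled = static_cast<int32_t>(s) * multiplier;
        s = static_cast<int16_t>(std::max(lo, std::min(hi, scaled)));
    }
}
```

Software gain also amplifies the noise floor, so a large multiplier may make the metallic sound more noticeable rather than improve recognition.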

Good news: I managed to get it to detect a hotword by shouting very loudly.

I'm looking closer at how the back-and-forth with Snips is going. After it detects the hotword, the ASR doesn't receive audio (timeout).

{"sessionId":"42b02e1c-331e-4aaf-abb1-5a548abedeec","customData":null,"siteId":"ATOMECHO","reactivatedFromSessionId":null}
{"sessionId":"42b02e1c-331e-4aaf-abb1-5a548abedeec","customData":null,"termination":{"reason":"timeout","component":"asr"},"siteId":"ATOMECHO"}

There is a doubling going on again it seems.

14:10:31.055 -> end of idle. Stream was set to true.
14:10:31.055 -> Total heap: 293320
14:10:31.055 -> Free heap: 147664
14:10:31.055 -> Incoming MQTT message. Topic: hermes/voco/ATOMECHO/play
14:11:10.843 -> Incoming MQTT message. Topic: hermes/hotword/azrxidia/detected
14:11:11.058 -> Incoming MQTT message. Topic: hermes/voco/ATOMECHO/play
14:11:11.058 -> Incoming MQTT message. Topic: hermes/hotword/toggleOff
14:11:11.093 -> toggleOff message was for us
14:11:11.093 -> SessionId in toggleOff:59a6374e-89c4-49b7-a654-d77f77a7384c
14:11:11.093 -> Hotword detected event
14:11:11.093 -> Enter HotwordDetected
14:11:11.093 -> -Semaphone something
14:11:11.093 -> -Re-stream
14:11:25.930 -> Incoming MQTT message. Topic: hermes/hotword/toggleOn
14:11:25.930 -> toggleOn message was for us. Going to idle mode.
14:11:25.930 -> hw-detected-go-back-to-idle
14:11:25.930 -> Enter Idle
14:11:25.968 -> still in idle
14:11:25.968 -> end of idle. Stream was set to true.
14:11:25.968 -> Total heap: 293412
14:11:25.968 -> Free heap: 150684
14:11:26.472 -> Incoming MQTT message. Topic: hermes/voco/ATOMECHO/play
14:12:32.626 -> One of them failed: Enter MQTTDisconnected
14:12:32.626 -> Audio connected: 0, Async connected: 0
14:12:32.626 -> Enter MQTTDisconnected
14:12:32.626 -> Connecting MQTT: 192.168.2.165, 1883
14:12:32.626 -> Connecting MQTT: 192.168.2.165, 1883
14:12:32.626 -> asyncclient connect was called
14:12:32.626 -> asyncclient connect was called
14:12:32.626 -> also reconnecting to audio
14:12:32.626 -> also reconnecting to audio
14:12:47.011 -> 
14:12:47.011 -> ELF file SHA256: 0000000000000000
14:12:47.046 -> 
14:12:47.046 -> Backtrace: 0x40088938:0x3ffbf9d0 0x40088bb5:0x3ffbf9f0 0x40140d30:0x3ffbfa10 0x400870c9:0x3ffbfa30 0x4000cff5:0x3ffde0d0 0x400db815:0x3ffde0f0 0x400db88e:0x3ffde130 0x400d156d:0x3ffde160 0x400d15a7:0x3ffde1f0 0x400d16c7:0x3ffde210 0x400d1fa2:0x3ffde230 0x40089c06:0x3ffde6b0
14:12:47.046 -> 
14:12:47.046 -> Rebooting...

These are some messages going to the ASR:

hermes/asr/stopListening {"siteId":"azrxidia","sessionId":"569ac21f-cde0-4004-be21-f6112640cfdf"}
hermes/asr/startListening {"siteId":"azrxidia","sessionId":"569ac21f-cde0-4004-be21-f6112640cfdf","startSignalMs":1623327749156}
hermes/asr/stopListening {"siteId":"azrxidia","sessionId":null}

hermes/asr/stopListening {"siteId":"ATOMECHO","sessionId":"bf788e48-fe11-4206-9469-5ac4ec3fd8bd"}
hermes/asr/startListening {"siteId":"ATOMECHO","sessionId":"bf788e48-fe11-4206-9469-5ac4ec3fd8bd","startSignalMs":-20}
hermes/asr/stopListening {"siteId":"ATOMECHO","sessionId":null}

The startSignalMs seems to be a strange value: -20. Maybe that's because the time data isn't in the audio stream?

As I do not know what your code looks like, I do not know where the doubling occurs.

Is your ASR listening? That depends on the snips.toml file, I believe.

The ASR does work for other satellites in the house, which are based on Voco/Snips. Perhaps they are sending an extra message.

The latest Arduino code can be found here: https://github.com/flatsiedatsie/voco_mini_sat

Ok, checking your code.

  1. Why do you have the publish to asr/stopListening on line 44? This actually stops the ASR from listening in the HotwordDetected state.
  2. Is that the exact code? Is Serial.println("Creating I2Stask"); also printed twice? If so, maybe the check if (i2sHandle == NULL) does not work as expected.

Sharp eyes :-) I was trying to stop and then restart the ASR, hoping that would fix the issue. But then I tried skipping the HotwordDetected state altogether. So currently that code is never called. All the HotwordDetected state did was stop the stream and restart it, which I suspected wasn't needed if there wasn't on-board hotword detection being done.

I've only removed the wifi password :-)

Just in case you'd like to try uploading via the arduino IDE yourself:

  1. You'll need to add ESP32 support. In the menu go to settings, and add these two lines under additional board manager URLs:
https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json
https://dl.espressif.com/dl/package_esp32_index.json

(maybe restart the IDE)

Then under tools -> boards -> boards manager select M5Atom-stack as the device.

  2. Make sure the USB port is selected as the serial port (under tools as well)

  3. Make sure the serial monitor is closed in case it's already open. Then click on ESP32 Sketch data upload under the tools menu. This will upload the settings file to the SPIFFS storage.

  4. Upload the code (arrow button in the top-left)

  5. Open the serial monitor (under tools), and you can see the serial output.

I believe I've managed to remove the double call of the MQTTDisconnected state. The run method was apparently calling it before it had switched to the new state, causing the state to be initiated twice. It now no longer crashes after the first recognition of "hey snips".
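The kind of guard that prevents a double entry can be sketched outside the firmware as a tiny state machine where the entry action is tied to a "pending" flag rather than to the state value itself. This is only an illustration of the idea under simplified assumptions. The class and member names are invented, and the real code uses its own state machine with entry and run handlers.

```cpp
#include <vector>

enum class State { MQTTDisconnected, MQTTConnected, Idle };

class SatelliteFsm {
public:
    // Request a state change; the entry action is deferred to run().
    void transit(State next) {
        current_ = next;
        entryPending_ = true;
    }

    // Called from the main loop. The entry action runs at most once
    // per transit(), no matter how often run() is called afterwards.
    void run() {
        if (entryPending_) {
            entryPending_ = false;         // clear first, then act
            entryLog.push_back(current_);  // stands in for "Enter <state>"
        }
    }

    std::vector<State> entryLog;  // one element per executed entry action

private:
    State current_ = State::MQTTDisconnected;
    bool entryPending_ = true;  // the initial state still needs its entry action
};
```

With this structure, the duplicated "Connecting MQTT" and "asyncclient connect was called" lines in the log above would correspond to two elements in entryLog for a single transition, which the flag rules out.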

The strange thing is that the ASR stops responding for the entire system if I use the AtomEcho. The ASR also stops responding to the main microphone, although it still does hotword detection fine. A session is also created just fine.

Sometimes the ASR stops working altogether, and sometimes it works about 50% of the time, intermittently: after a successful run it will not respond the next time, until it times out, and then it starts responding again after that, and so forth. This seems to happen only if the AtomEcho is on the network.

The AtomEcho also seems to go into reboot loops. I'm not sure how that's even possible. It's as if it remembers that the previous time it booted up, it failed, and will continue to do so until I unplug it, and then plug it in again.

Just saw another strange situation where I disconnected the AtomEcho, and then the ASR started only listening for 1 second on the main microphone.

[11:15:07] [Hotword] detected on site azrxidia, for model hey_snips
[11:15:07] [Asr] was asked to stop listening on site azrxidia
[11:15:07] [Hotword] was asked to toggle itself 'off' on site azrxidia
[11:15:07] [Dialogue] session with id '1819212f-7fe3-40c9-83f7-26021d46f671' was started on site azrxidia
[11:15:07] [Asr] was asked to listen on site azrxidia
[11:15:09] [Asr] captured text "unknownword" in 1.0s
[11:15:09] [Asr] was asked to stop listening on site azrxidia
[11:15:09] [Nlu] was asked to parse input "unknownword"
[11:15:09] [Nlu] intent not recognized for "*"
[11:15:09] [Dialogue] session with id '1819212f-7fe3-40c9-83f7-26021d46f671' was ended on site azrxidia. The session was ended because the platform didn't understand the user
[11:15:09] [Asr] was asked to stop listening on site azrxidia
[11:15:09] [Hotword] was asked to toggle itself 'on' on site azrxidia
[11:15:13] [Hotword] detected on site azrxidia, for model hey_snips
[11:15:13] [Asr] was asked to stop listening on site azrxidia
[11:15:13] [Hotword] was asked to toggle itself 'off' on site azrxidia
[11:15:13] [Dialogue] session with id '5efe57ea-f984-4ad1-8342-ba4a9d6a1e47' was started on site azrxidia
[11:15:13] [Asr] was asked to listen on site azrxidia
[11:15:28] [Dialogue] session with id '5efe57ea-f984-4ad1-8342-ba4a9d6a1e47' was ended on site azrxidia. The session was ended because one of the component didn't respond in a timely manner
[11:15:28] [Asr] was asked to stop listening on site azrxidia
[11:15:28] [Hotword] was asked to toggle itself 'on' on site azrxidia
[11:15:38] [Hotword] detected on site azrxidia, for model hey_snips
[11:15:38] [Asr] was asked to stop listening on site azrxidia
[11:15:38] [Hotword] was asked to toggle itself 'off' on site azrxidia
[11:15:38] [Dialogue] session with id '13bfee2f-4b4a-4ae0-8896-916fb8e6d27b' was started on site azrxidia
[11:15:38] [Asr] was asked to listen on site azrxidia
[11:15:40] [Asr] captured text "unknownword" in 1.0s
[11:15:40] [Asr] was asked to stop listening on site azrxidia
[11:15:40] [Nlu] was asked to parse input "unknownword"
[11:15:40] [Nlu] intent not recognized for "*"
[11:15:40] [Dialogue] session with id '13bfee2f-4b4a-4ae0-8896-916fb8e6d27b' was ended on site azrxidia. The session was ended because the platform didn't understand the user
[11:15:40] [Asr] was asked to stop listening on site azrxidia
[11:15:40] [Hotword] was asked to toggle itself 'on' on site azrxidia
[11:16:37] [Hotword] detected on site azrxidia, for model hey_snips
[11:16:37] [Asr] was asked to stop listening on site azrxidia

After that it reverted to the intermittent "ASR listens, ASR is deaf" situation.

I've tried to manually run the ASR and check its output. Here's what happens with a "normal" call from Voco:

pi@thuis:~/.webthings/addons/voco/snips $ LD_LIBRARY_PATH=. /home/pi/.webthings/addons/voco/snips/snips-asr -u /home/pi/.webthings/data/work -a /home/pi/.webthings/addons/voco/snips/assistant -c /home/pi/.webthings/addons/voco/snips/snips.toml
[11:27:30.198765] INFO :snips_asr_hermes::handler: Using model from "/home/pi/.webthings/data/work/injections/20210209T163004178929730/inj_20210616T092026773150365/asr"
[11:27:30.332529] INFO :snips_kaldi::decode::model: Loading model v2
[11:27:31.958167] INFO :snips_asr_hermes::handler : Preparing decoder
[11:27:31.958415] INFO :snips_asr_hermes::handler : Preparing decoder
[11:28:11.557659] INFO :snips_asr_hermes::handler : Listening at site id azrxidia
[11:28:11.557826] INFO :snips_asr_hermes::handler : Listening
[11:28:11.704154] INFO :snips_asr_lib::asr        : T0       entered AsrRunner::run
[11:28:11.704224] INFO :snips_asr_lib::asr        : T0+0.000 capture started
[11:28:13.883099] INFO :snips_asr_lib::asr        : T0+2.179 endpoint detected (rule:4) frame:155 samples:39680 signal_time:2.48 rtf:0.327
[11:28:13.883973] INFO :snips_asr_lib::asr        : Source thread stop on push: "SendError(..)"
[11:28:13.884145] INFO :snips_asr_lib::asr        : T0+2.180 capture ended
[11:28:13.885827] INFO :snips_asr_lib::asr        : T0+2.182 decoder finalized
[11:28:13.894667] INFO :snips_asr_lib::asr        : T0+2.191 lookup and post-processing done
[11:28:13.894747] INFO :snips_asr_lib::asr        : decoded: [Recognition { decoded_string: "what time is it", likelihood: 1.0, tokens: Some([Token { value: "what", confidence: 1.0, time: (0.0, 1.38), range: 0..4 }, Token { value: "time", confidence: 1.0, time: (1.38, 1.4399999), range: 5..9 }, Token { value: "is", confidence: 1.0, time: (1.4399999, 1.62), range: 10..12 }, Token { value: "it", confidence: 1.0, time: (1.62, 2.31), range: 13..15 }]) }]
[11:28:13.895411] INFO :snips_asr_hermes::handler : Publishing the recognition

And this is all that happens with the AtomEcho:

[11:28:25.235052] INFO :snips_asr_hermes::handler : Preparing decoder
[11:29:24.793911] INFO :snips_asr_hermes::handler : Listening at site id ATOMECHO
[11:29:24.793989] INFO :snips_asr_hermes::handler : Listening

All the HotwordDetected state did was stop the stream and restart it, which I suspected wasn't needed if there wasn't on-board hotword detection being done.

It also initializes the wave header and updates the led status. I recommend not to fiddle with the status too much.

Just in case you'd like to try uploading via the arduino IDE yourself:
I do not use Arduino IDE ;)

What is this azrxidia I see in all your messages? Can you try to stop that stream?
And can you put the contents of your snips.toml?

I'd be happy to. Here's the snips.toml:
https://github.com/createcandle/voco/blob/master/snips/snips.toml

I've also stripped out the LED parts (there was an error I couldn't fix, so I just stripped it out completely). I've also removed the OTA updates, since that won't be needed either and I figured it might leave more memory.

I've re-enabled the HotwordDetected state, but the result is the same. I'll update the code on github.

I've also stripped out the LED parts (there was an error I couldn't fix, so I just stripped it out completely).

If you remove the methods updateColors(int colors) and updateBrightness(int brightness) in your device code, then nothing will be done :)

I think you need to set this for the AudioServer:

[snips-audio-server]
bind = "+@mqtt"

That is so that the audioserver actually listens to all audio streams.
This setting is then the same as in the [snips-hotword] setting, which might clarify why the hotword is listening and the rest not.
I am not 100% sure though, but setting it to + is not a bad idea in general

I'll give it a go.

I could also add it to common? Perhaps that will help ASR to detect the stream?

I've also added a feature to Voco so that it can provide the current time through an MQTT request. I wanted to experiment with sending the timestamp in the wav header.

Something else I'm curious about: would it be possible to have the AtomEcho connect based on hostname instead of IP address? I seem to see some hints in the settings this might be possible? if so, then the main controller could infuse that hostname into the AtomEcho at the moment of uploading the code.

I could also add it to common? Perhaps that will help ASR to detect the stream?

Might be a good idea, then you should have it set for all sections.

Something else I'm curious about: would it be possible to have the AtomEcho connect based on hostname instead of IP address? I seem to see some hints in the settings this might be possible? if so, then the main controller could infuse that hostname into the AtomEcho at the moment of uploading the code.

It already does if you put a hostname instead of an IP.

Hi @flatsiedatsie,

We have come a long way since any activity here.
Did you make any progress on the subject?
Maybe you can checkout my new master branch, I have just released version 7.8.

If you require some help from me, please give me a shout. Otherwise I will close this issue at some point in the future.
I have tried to get Voco running, but ran into some issues which I cannot remember and stopped

@flatsiedatsie it seems Voco is not available anymore as Addon, is that correct?
I see a Voice Control add-on, but that is different. Voco is still in the list found here:
https://github.com/WebThingsIO/addon-list/tree/master/addons

I just cannot find it among the add-ons in WebThings.
Note: I am using the docker image

Voco is only available on the Raspberry Pi.

I spent considerable time on it last time, but unfortunately couldn't get the audio to be coherent enough. Unfortunately in the end I couldn't spend that much time on a 'nice to have' anymore :-(

Ah ok, that is probably the issue then. I have a Raspberry Pi available now, do you still want me to put some effort in it?
I still have the branch.

I still find this interesting, so I have installed WebThings and could now indeed install voco.
Let's see if I can run it with an USB mike and a speaker and go from there :)

Sure, that would be wonderful! If you live in Amsterdam I can supply you with a good USB mic if you want :-)

I've uploaded the latest version of the code I was working on here:
https://github.com/createcandle/voco-mini-satellite

It would be great if you could try this Arduino workflow (Arduino IDE), because if that works, then it will be possible to flash the code to user devices via the Candle Manager addon for the WebThings Gateway.

Sure, that would be wonderful! If you live in Amsterdam I can supply you with a good USB mic if you want :-)

hehe, nope. Some good 200km drive north. But I got one :)

I have installed WebThings and Voco on a Pi. When I type "tell me the time", I expected to have audio output. The correct text appears. Is my expectation incorrect? I have set the output to headphone. speaker-test works.

OK, apparently my expectation was incorrect. I got Voco running on a Pi now and it is working :)
Now to see if I can get this running.

I thought the issue might be caused by the low energy from the M5 so I tried my matrixvoice.

I get this error:

2021-12-04 09:27:00.626 INFO   : voco: INFO:snips_hotword_lib::audio    : Audio thread for matrixvoice started
2021-12-04 09:27:00.627 INFO   : voco: INFO:snips_hotword_lib::audio    : Net and VAD thread for site matrixvoice started (vad inhibitor: true, vad messages: false
2021-12-04 09:27:00.632 INFO   : voco: ERROR:snips_hotword_lib::audio    : Error in network and VAD thread for site matrixvoice: no more audio in source

So I think it boils down to the audio again. Snips has some extra headers, it might be that this is causing that. I'll see if I can fix it

Yeah those headers, those indeed seem to be the issue.

Glad Voco is working :-) text commands only give text output (designed for quiet operation when kids are sleeping). Voice commands give voice output.