Want to suggest a wake word? Leave your thoughts here. (AIS-1441)

Question

feizi opened this issue a year ago · 133 comments

Hi all,

We're excited to offer the community more free and high-quality wake word models. Everyone has their own unique wake word preferences. Now, we're ready to regularly release some of the most popular wake words. Please let us know the wake words you want! English and Chinese are both welcome.

In the past, collecting high-quality human speech data was expensive. Our team has now developed a cost-effective way to train wake word models using only TTS samples, which reaches 90-95% of the accuracy of models trained on human-recorded samples.

The wake word models and esp-sr have the same license and are free for commercial use. If you want a more accurate and exclusive wake word, please use our wake word customization service.

Currently, we support over 20 wake words. You can choose any one wake word to test. Starting from August 1, 2024, to get a new wake word, you'll need to meet one of these requirements:

If you've got an ongoing project, kindly attach the project link along with a brief overview when submitting your request.
Your wake word has been liked or upvoted by more than five people.

We are preparing to upgrade to a new TTS model and generate some wake word models with better performance.

Answer 1 · 2023-12-14T11:22:48.000Z

The Willow team and community would love "Hey Willow". It's our domain name because we've been waiting for this.

Thank you very much for offering this option, it's very exciting!

Answer 2 · 2023-12-14T12:12:14.000Z

I'm glad you like this. Since "hey" and "hi" sound pretty similar, sometimes people might not really notice the difference. So, I was thinking, maybe we could support both "hey willow" and "hi willow" for waking up the device. That way, whether you say "hey willow" or "hi willow", it'll still work. Of course, when we release the wake word model, we'll call it like "wn9_heywillow". What do you think about that?

Answer 3 · 2023-12-14T12:18:10.000Z

Good idea!

My only concern would be overall reduced accuracy (wake reliability vs false wake). We've noticed quite a bit of false wake with Alexa. From what I've read the automated TTS approach has 90-95% the accuracy of the models trained on human samples. I like "two word" wake words because they tend to improve accuracy, I suspect a 100% "Hey Willow" wake word could result in equivalent or even improved accuracy with the TTS approach vs even human sample trained Alexa?

Of course we could always test this, even starting with a pure "Hey Willow" model, a pure "Hi Willow" model, and a merged model.

Thanks again for offering this!

Answer 4 · 2023-12-14T12:29:28.000Z

Your concern is valid. We will train both variants and test which model performs better.

Answer 5 · 2023-12-28T07:39:51.000Z

"hey/hi willow" model:
Model name: wn9_heywillow_tts
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 88%

Test dataset description:
The FAR dataset: This dataset contains a total of 64 hours of audio data, which includes audio collected from the internet and audio recorded using esp32-korvo boards.
The RAR dataset: This dataset is generated by multiple commercial TTS APIs, with a total of approximately 500 samples. These data and models were not used in the training process. However, due to the differences between TTS samples and human samples, please exercise caution when referring to the test results.
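
To make the two metrics concrete, here is a minimal Python sketch of how a FAR and a RAR could be computed from raw test counts. The function names and the counts (8 false alarms, 440 detections) are hypothetical, back-calculated from the figures above (1 per 8 hours over 64 hours, 88% of roughly 500 samples); the thread does not give the actual counts.

```python
def false_alarm_rate(false_alarms: int, audio_hours: float) -> float:
    """Return false alarms per hour of audio that contains no wake word."""
    if audio_hours <= 0:
        raise ValueError("audio_hours must be positive")
    return false_alarms / audio_hours


def right_alarm_rate(detected: int, total_samples: int) -> float:
    """Return the fraction of wake word samples that triggered a wake-up."""
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    return detected / total_samples


# Hypothetical counts chosen only to match the figures reported above.
far = false_alarm_rate(false_alarms=8, audio_hours=64)
rar = right_alarm_rate(detected=440, total_samples=500)
hours_per_false_alarm = 1 / far

print(f"FAR: 1 time / {hours_per_false_alarm:g} hours")
print(f"RAR: {rar:.0%}")
```

Because the RAR samples here are TTS-generated, the same caution applies to any number computed this way.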

Answer 6 · 2023-12-29T06:48:21.000Z

Guys, what you are doing is really great. We have created a smart speaker called Homai based on the esp32-s3. We trained the model ourselves, but it is resource-intensive and not so easy to integrate into the pipeline. Could you please add support for our word Homai [ho'mai]? Thank you in advance!

Answer 7 · 2024-01-03T02:54:08.000Z

Hi @AigizK ,
Homai has only two syllables. It is difficult to reduce the probability of false triggering for monosyllabic and disyllabic phrases. We recommend selecting a 3-5 syllable phrase as the wake word.
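
The 3-5 syllable recommendation can be turned into a quick pre-check before submitting a request. This is only a sketch: `syllable_count_ok` is a hypothetical helper, and the syllable splits are supplied by hand because automatic syllable counting is unreliable.

```python
def syllable_count_ok(syllables, minimum=3, maximum=5):
    """Check a hand-split wake word against the recommended syllable range."""
    return minimum <= len(syllables) <= maximum


# Hand-split examples taken from this thread (the splits are assumptions).
homai_ok = syllable_count_ok(["ho", "mai"])
hey_willow_ok = syllable_count_ok(["hey", "wil", "low"])

print("Homai:", homai_ok)
print("Hey Willow:", hey_willow_ok)
```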

Answer 8 · 2024-01-03T13:47:49.000Z

Hi @sun-xiangyu
We have already launched a project with this name, so we can't change it significantly. But can we use the variant "homa ai", where the sound 'A' is pronounced long?

Answer 9 · 2024-01-04T03:57:37.000Z

I'm sorry that our TTS model cannot specify a syllable to extend its pronunciation at the moment. This means that we cannot generate a large number of accurate “homa ai” phrases.

Answer 10 · 2024-01-09T06:18:14.000Z

Hi! Thank you for this awesome solution! We are developing a smart voice assistant called Sophia. Would it be possible to have the wake word "Hi Sophia"? This would help our user experience drastically. Thank you in advance!

Answer 11 · 2024-01-09T07:30:23.000Z

Hi @PrathamG , I'm glad you like it. "Sophia" sounds like a wake word that can be used directly. I mean, maybe we don't need an extra prefix "Hi". I suggest we start with just "Sophia". If the performance is not satisfactory, then we can train another one with "hi Sophia". What do you think?

Answer 12 · 2024-01-09T08:32:46.000Z

Sure, that sounds like a good plan! We can use only "Sophia" and test the performance first. Thank you

Answer 13 · 2024-01-09T09:03:47.000Z

If possible, I also wanted to request the wake word "Little Sophia". We are still unsure about which wake word to use, and having both options will help us determine this via user testing.

Answer 14 · 2024-01-10T03:58:23.000Z

Now our computing resources are limited. This project can generate about two wake word models in a month. So we will choose some popular wake words. Of course, if we have some free time, "Little Sophia" is also fine.

Answer 15 · 2024-01-12T04:16:19.000Z

No worries, totally understandable! Looking forward to testing out the "Sophia" wake word

Answer 16 · 2024-01-18T09:33:51.000Z

"Sophia" model: wn9_sophia_tts

FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 97%

Answer 17 · 2024-01-22T12:02:14.000Z

“小美” or “小美同学” would be a perfect choice. It would suit a lot of use cases. We all want a wake word that sounds like a human name.

Answer 18 · 2024-01-23T06:25:03.000Z

@xygh, “小美同学” sounds good.

Answer 19 · 2024-01-23T07:53:43.000Z

"Sophia" model: wn9_sophia_tts

FAR(False Alarm Rate): 1 time / 8 hours RAR(Right Alarm Rate): 97%

Thank you! We will test it out and report the results by next week

Answer 20 · 2024-01-23T09:55:20.000Z

@xygh, “小美同学” sounds good.

BTW, “你好小美” is also a perfect choice.

Answer 21 · 2024-01-25T05:47:26.000Z

"小当家" or "Hi 小星" would be the preferred wake word in our scenario. Thanks a lot!

Answer 22 · 2024-01-25T11:17:15.000Z

The second version "Sophia":
model info: wakenet9l_tts1h8v2_Sophia_3_0.647_0.649

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 95%

Improvement:
Add "Sophie" and "Sophy" as hard negatives to reduce false triggers.
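
As an illustration of how hard-negative candidates like these might be shortlisted, the sketch below ranks words by spelling similarity using Python's standard `difflib`. This is not the team's method, which the thread does not describe. `hard_negative_candidates`, the 0.7 threshold, and the word list are assumptions, and spelling similarity is only a rough proxy for how similar two words sound.

```python
from difflib import SequenceMatcher


def hard_negative_candidates(wake_word, vocabulary, threshold=0.7):
    """Return words spelled similarly to wake_word, most similar first."""
    target = wake_word.lower()
    scored = []
    for word in vocabulary:
        candidate = word.lower()
        if candidate == target:
            continue  # the wake word itself is a positive, not a negative
        score = SequenceMatcher(None, target, candidate).ratio()
        if score >= threshold:
            scored.append((score, word))
    scored.sort(reverse=True)
    return [word for _, word in scored]


# Hypothetical vocabulary; in practice it would come from a pronunciation lexicon.
candidates = hard_negative_candidates(
    "Sophia", ["Sophie", "Sophy", "sofa", "Willow", "Sophia"]
)
print(candidates)
```

Any shortlist produced this way would still need a listening check before being used as negative training samples.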

Answer 23 · 2024-01-25T11:28:18.000Z

"小当家" or "Hi 小星" would be the preferred wake word in our scenario. Thanks a lot!

Both of these words sound good. If you have no preference, we will choose "hi 小星".

Answer 24 · 2024-01-30T06:12:20.000Z

"小美同学"
model info: wakenet9l_tts1h8_小美同学_3_0.633_0.644

FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 95%

Answer 25 · 2024-02-11T15:39:02.000Z

Hello! This is a great opportunity I was hoping would come up, I'm so glad this is now possible! I've seen that the wake-words "Mycroft" and "Hey, Mycroft" are very popular in the community, and it is also the name of my product so would very much improve user experience. Would it be possible to have either of these trained and released for the community? Thank you so much in advance for this!

Answer 26 · 2024-02-19T03:23:29.000Z

@lewardo, I'm glad it could help you. Although "Mycroft" is simpler, it seems there are quite a few words that sound similar, so I'll prioritize training with "Hey Mycroft."

Answer 27 · 2024-02-19T08:35:46.000Z

@Henry586 ,

Hi,小星: wakenet9l_tts1h8_Hi,小星_3_0.626_0.630

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 93%

Answer 28 · 2024-02-27T23:44:30.000Z

I'd love to have "hey printer" available as a wake word/phrase.

Answer 29 · 2024-02-29T15:01:26.000Z

I want to suggest a wake word: "小龙小龙".
I'm glad to hear that you can create a wake word.

Answer 30 · 2024-03-04T11:21:02.000Z

@lewardo , The performance of "Mycroft" also looks good. Please try it.
Mycroft: wakenet9l_tts1h8_Mycroft_3_0.625_0.629

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 96%

Answer 31 · 2024-03-18T02:16:35.000Z

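Hello! We are developing a smart voice assistant called 喵喵同学. Could you help us create a "喵喵同学" wake word? It would greatly improve our user experience. Thank you in advance!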

Answer 32 · 2024-03-18T06:57:26.000Z

Hey,Printer: wakenet9l_tts1h8_Heyprinter_3_0.623_0.629

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 78%

Answer 33 · 2024-03-18T06:58:12.000Z

小龙小龙: wakenet9l_tts1h8_小龙小龙_3_0.624_0.628

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 95%

Answer 34 · 2024-03-19T09:49:43.000Z

@sun-xiangyu Ooh, thank you! I'll try to test Hey Printer on the weekend!

Answer 35 · 2024-03-21T12:40:13.000Z

Is there also a chance to get wake words trained for the ESP32 variant (WakeNet5, if I understand it correctly)? Or is that obsolete? I would love to use an "Alexa" or "ok nabu" model on my M5Stack Atom Echos, which unfortunately only have an ESP32.

Answer 36 · 2024-03-22T09:22:29.000Z

Hi @jhbruhn ,

Yes, WakeNet5 has been deprecated, and we are not training any WakeNet5 models. The ESP32 should be able to run WakeNet9, but we have not yet adapted it. This is because a wake word app with stable performance needs to run in conjunction with the Audio Front End (AFE), and it is difficult for the ESP32 to run that in real time. Therefore, we recommend using the ESP32-S3.

Answer 37 · 2024-03-24T14:01:57.000Z

小龙小龙: wakenet9l_tts1h8_小龙小龙_3_0.624_0.628

Performance: FAR(False Alarm Rate): 1 time / 8 hours RAR(Right Alarm Rate): 95%

Thank you, I will try it.

Answer 38 · 2024-03-27T02:11:50.000Z

喵喵同学: wakenet9l_tts1h8_喵喵同学_3_0.644_0.648

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 95%

Answer 39 · 2024-03-27T02:17:44.000Z

Outstanding work! This means a lot to us.
Thank you all so much!
We will try it out as soon as possible.

Answer 40 · 2024-03-31T05:08:11.000Z

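Hello! We are planning to build a toy. Is there any chance you could help us create a "hi, Joy" wake word? We really like this wake word and very much hope to run it on ESP. @sun-xiangyu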

Answer 41 · 2024-04-10T06:19:27.000Z

Hi/Hey, Joy: wakenet9l_tts1h8_Hi,Joy_3_0.631_0.633

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 96%

Answer 42 · 2024-04-14T07:45:44.000Z

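Thank you so much.
But could you tell me how I should test it?
After updating to the latest sr-1.7.0 component, there is no hijoy option in menuconfig.
I don't know how to set up a custom wake word.
I read the sr documentation, and it does not mention how to implement a custom wake word in code either.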

Answer 43 · 2024-04-15T02:40:35.000Z

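We do not publish a new release every time a wake word is added. In other words, the version you downloaded does not include the Hi,Joy wake word yet.
You can manually download the esp-sr master branch and overwrite your previous copy of esp-sr; the Hi,Joy wake word will then be available.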

Answer 44 · 2024-04-18T02:36:31.000Z

喵喵同学: wakenet9l_tts1h8_喵喵同学_3_0.644_0.648

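Performance: FAR(False Alarm Rate): 1 time / 8 hours RAR(Right Alarm Rate): 95%

@sun-xiangyu
First:
Thank you very much. I recently tested the 喵喵同学 wake word and found that the wake-up success rate is only reasonably high when the two words 喵喵 and 同学 are clearly separated when spoken. If they are run together, the results are considerably worse.
Would it be possible to also train a "Hi，喵喵" wake word? I think it would perform much better.

Second:
I also have a suggestion about wake word models trained on TTS speech. I found that these models demand very accurate pronunciation.
This is most likely because TTS pronunciation is very standard, so the trained model also needs very accurate pronunciation before it wakes.
Most people's pronunciation is not that accurate, so the recognition rate of the trained model ends up being low.

I noticed this clearly when testing sophia and hi, joy: recognition only succeeds when the pronunciation is very accurate, and fails when it is slightly off.
So my suggestion is that, for words that are harder to pronounce, if you train with TTS,
you could consider adding some near-sounding words to the training material and training on them together, which should greatly improve the recognition rate.
I am not sure whether my understanding of TTS training is right; this is just for reference.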

Answer 45 · 2024-04-18T03:23:08.000Z

@welkinchan
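Thank you very much for your feedback.

We can also train a "Hi, 喵喵" wake word.
We need to strike a balance between wake-up accuracy and discrimination. For Sophia, for example, we deliberately added near-sounding words as negative samples to keep it from being woken by them. Unless there is a specific need, such as accents or dialects, we generally do not add near-sounding words, because doing so raises the model's false wake-up rate to some extent.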

Answer 46 · 2024-04-23T13:47:43.000Z

We have a couple of 3D printers @CCHS-Melbourne called like this:

Wanda
Cosmo
Hey Wanda
Hey Cosmo

Would it be possible to train it ourselves? Is there documentation on how to train with, e.g., an H100 or an Nvidia 4090?

The closest description about the data structures and training process I've seen is here:

https://docs.espressif.com/projects/esp-sr/en/latest/esp32s3/wake_word_engine/README.html

But I cannot easily find a program/pipeline to run locally with all the input .wav files?

/cc @adricl @GoatNote

Answer 47 · 2024-04-24T02:31:40.000Z

We can help you train some wake words, but our training pipeline isn't open-sourced yet.

Answer 48 · 2024-04-26T02:17:57.000Z

@feizi hello, can you train a model for "hey, Li Li", "Hi, Li Li", or simply "Li Li"? I'm a bit hesitant about LiLi vs Li Li and whether it makes a big impact on accuracy. I tested this phrase with multinet7 and the results were also very positive, but I lacked a tool to evaluate accuracy.

Answer 49 · 2024-04-28T07:50:25.000Z

Hi @dnambinh

For humans, "Li Li" and "LiLi" should be the same, but for TTS (Text-to-Speech), "Li Li" might insert a brief pause.
I'm not sure how you pronounce "LiLi", whether it's like the English word "Lily" or the Chinese name "莉莉".

Answer 50 · 2024-04-28T08:43:21.000Z

Yes, I have tried common tts tools and there is not much difference between "lily" and "lili". Ya, if it were a word that made sense in English it would be "lily". The initial idea was a certain wake-word that most people (speaking English - Chinese - Vietnamese - Arabic - ....) could easily say due to similarities in pronunciation. I'd be happy if you could train a similar model

Answer 51 · 2024-04-28T12:12:09.000Z

OK, I will use both "Hi,Lily" and "Hi, 莉莉" to train a wake word model.

Answer 52 · 2024-05-08T03:02:16.000Z

Hi,Lily/Hi,莉莉: wakenet9l_tts1h8_Hi,Lily or Hi,莉莉_3_0.633_0.639

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 93%

@dnambinh , looking forward to your feedback

Answer 53 · 2024-05-08T03:24:17.000Z

Thanks for your support. I will give feedback ASAP.

Answer 54 · 2024-05-08T07:50:33.000Z

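@welkinchan Thank you very much for your feedback.

We can also train a "Hi, 喵喵" wake word.

We need to strike a balance between wake-up accuracy and discrimination. For Sophia, for example, we deliberately added near-sounding words as negative samples to keep it from being woken by them. Unless there is a specific need, such as accents or dialects, we generally do not add near-sounding words, because doing so raises the model's false wake-up rate to some extent.

Hello, could you please help train the wake word "Hi, 喵喵"? @sun-xiangyu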

Answer 55 · 2024-05-12T15:24:32.000Z

An Arabic wake word is missing. It could be any Arabic name like "Sarah", "Rahma" or "Yasmeen", or the greeting "Assalam-o-alaikum".

Answer 56 · 2024-05-14T06:25:03.000Z

Hi @usama1123456789 ,

Currently we can only generate English or Chinese TTS samples. We can try to train some wake words like "Hi,Sarah" or "Yasmeen" with English TTS samples, but I'm not sure whether those models would work well for Arabic.

Answer 57 · 2024-05-14T08:35:47.000Z

Hi,喵喵: wakenet9l_tts1h8_Hi,喵喵_3_0.636_0.641

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 94%

@welkinchan, please try.

Answer 58 · 2024-05-14T08:37:45.000Z

Hey,Wanda: wakenet9l_tts1h8_Hey,Wanda_3_0.641_0.644

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 95%

@brainstorm , please try.

Answer 59 · 2024-05-14T11:09:57.000Z

An Arabic wake word is missing. It could be any Arabic name like "Sarah", "Rahma" or "Yasmeen", or the greeting "Assalam-o-alaikum".

From what I've seen use of this greeting is extremely common and it's likely you will get a lot of false-wake incidents when it is so frequently used in everyday conversation within audio range of your devices.

I'd suggest you recommend an acceptable alternative.

Answer 60 · 2024-05-14T15:16:37.000Z

Hi @usama1123456789 ,

Currently we can only generate English or Chinese TTS samples. We can try to train some wake words like "Hi,Sarah" or "Yasmeen" with English TTS samples, but I'm not sure whether those models would work well for Arabic.

I guess English TTS trained files would be OK! For Hi Sarah, or Hi Yasmeen.

Answer 61 · 2024-05-14T15:19:33.000Z

An Arabic wake word is missing. It could be any Arabic name like "Sarah", "Rahma" or "Yasmeen", or the greeting "Assalam-o-alaikum".

From what I've seen use of this greeting is extremely common and it's likely you will get a lot of false-wake incidents when it is so frequently used in everyday conversation within audio range of your devices.

I'd suggest you recommend an acceptable alternative.

You are correct! That greeting would trigger many unwanted wake-ups.

By the way, can we get one called "Hi, Astrolabe" in English, of course?

We have a device called Astrolabe and would like to get one model trained for that word!

Answer 62 · 2024-05-15T21:32:45.000Z

We are making cultural and creative products with a Harbin theme, and we hope to have some wake words like 'Hey, Xiaobin' or 'Xiaobin Xiaobin'. Could you please help us train them? Thank you.
@sun-xiangyu

Answer 63 · 2024-05-16T02:46:05.000Z

We have a device called Astrolabe and would like to get one model trained for that word!

@usama1123456789 , Astrolabe sounds complex enough to be a stand-alone wake word, and I recommend leaving it unprefixed with "Hi"

Answer 64 · 2024-05-16T05:00:42.000Z

Hi,Lily/Hi,莉莉: wakenet9l_tts1h8_Hi,Lily or Hi,莉莉_3_0.633_0.639

Performance: FAR(False Alarm Rate): 1 time / 8 hours RAR(Right Alarm Rate): 93%

@dnambinh , looking forward to your feedback

Hi, @sun-xiangyu. I tried model 'Hi,Lily'. The results received were extremely positive.
Perform:
+) FAR(False Alarm Rate): Compared to the 'Alexa' model, Alexa's false activation rate is clearly higher, simply by putting the device into a random conversation.
+) RAR: When I knew that the model using TTS was only about 90-95% effective compared to the model using real human voice, and RAR was about 93%, I didn't expect much. However, the result was surprising, it was surprisingly accurate, I tried with the accent of almost everyone in the company, with other nationalities, even with the local accent, which is extremely difficult to hear.

From here it can be concluded that choosing the correct wakeword is extremely important and greatly affects the results.
Test device configuration:
+) ESP32S3
+) 1MIC vs 2MIC mems

However, I'm having a new problem. The idea is to wake up the device with a wakeword, then record until the VAD checks that there is no more voice (I checked for 1 second) and then send it to the server for further processing. I noticed that the VAD cannot check the sound of people far away (2m), while the wakeword still works well. If I reduce the VAD level to 2 or 1 it gets better however it is easily triggered by other noise.
Is there any feasible solution?

Thank you and team very much

Answer 65 · 2024-05-16T12:13:51.000Z

We have a device called Astrolabe and would like to get one model trained for that word!

@usama1123456789 , Astrolabe sounds complex enough to be a stand-alone wake word, and I recommend leaving it unprefixed with "Hi"

ASTROLABE as a standalone would be OK and acceptable. Can we get that ?

Answer 66 · 2024-05-17T11:37:08.000Z

OK

Answer 67 · 2024-05-20T02:42:29.000Z

However, I'm having a new problem. The idea is to wake up the device with a wakeword, then record until the VAD checks that there is no more voice (I checked for 1 second) and then send it to the server for further processing. I noticed that the VAD cannot check the sound of people far away (2m), while the wakeword still works well. If I reduce the VAD level to 2 or 1 it gets better however it is easily triggered by other noise.
Is there any feasible solution?

@dnambinh, thank you for your detailed assessment. We have added both Lily (English) and 莉莉 (Chinese), which has increased the diversity of TTS samples. This may be one of the reasons for the better performance.

The VAD indeed has a lot of room for improvement. We can train a more accurate VAD using deep learning methods, but currently, we do not have a definite timeline.
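
The capture flow described in the question (wake word, record until the VAD has reported no voice for about 1 second, then upload) can be sketched as a small endpointing loop. This is an illustration in Python rather than device code: `record_until_silence`, the 30 ms frame length, and the per-frame boolean input are assumptions, and it does not address the far-field VAD sensitivity issue.

```python
def record_until_silence(vad_frames, frame_ms=30, silence_ms=1000):
    """Count the frames to keep after the wake word fires.

    vad_frames yields one boolean per audio frame (True means speech).
    Recording stops once silence has lasted at least silence_ms.
    """
    frames_needed = -(-silence_ms // frame_ms)  # ceiling division
    silent_run = 0
    kept = 0
    for is_speech in vad_frames:
        kept += 1
        silent_run = 0 if is_speech else silent_run + 1
        if silent_run >= frames_needed:
            break  # trailing silence is long enough: end of utterance
    return kept


# Hypothetical VAD output: speech, a short pause, more speech, then silence.
frames = [True] * 10 + [False] * 5 + [True] * 10 + [False] * 50
kept_frames = record_until_silence(frames)
print(f"Recorded {kept_frames} frames ({kept_frames * 30} ms) before upload")
```

The short pause in the middle does not end the recording because the silence counter resets whenever speech resumes.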

Answer 68 · 2024-05-21T09:08:17.000Z

Any idea when we can get the ASTROLABE wake word?

Answer 69 · 2024-05-22T04:04:51.000Z

In two weeks.

Answer 70 · 2024-05-22T12:36:10.000Z

Great! Thank you very much.
One question: could this be used with esp_sr v1.6?

Answer 71 · 2024-05-23T02:37:18.000Z

Yes. If you don't want to update to the main branch, you need to manually copy your model into wn9_customword and then load it in menuconfig as wn9_customword.
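
The manual copy step could be scripted roughly as below. This is a sketch under assumptions: `install_custom_wake_word` is a hypothetical helper, and the `model/wakenet_model/wn9_customword` location inside the esp-sr checkout is assumed rather than confirmed in this thread, so check it against your own copy of esp-sr.

```python
import shutil
from pathlib import Path


def install_custom_wake_word(model_dir, esp_sr_root,
                             target_rel="model/wakenet_model/wn9_customword"):
    """Copy the files of a downloaded model into the custom wake word slot.

    target_rel is an assumed path inside the esp-sr checkout.
    Returns the sorted list of copied file names.
    """
    source = Path(model_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"model directory not found: {source}")
    target = Path(esp_sr_root) / target_rel
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(source.iterdir()):
        if item.is_file():
            shutil.copy2(item, target / item.name)  # overwrites older files
            copied.append(item.name)
    return copied
```

A call might look like `install_custom_wake_word("downloads/astrolabe_model", "components/esp-sr")`, where both paths are placeholders. After copying, wn9_customword still has to be selected in menuconfig as described above.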

Answer 72 · 2024-05-23T03:08:10.000Z

Looking forward to new features from ESP🤩🤩

Answer 73 · 2024-05-28T07:46:59.000Z

Astrolabe: wakenet9l_tts1h8_Astrolabe_3_0.625_0.632

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 94%

@usama1123456789 , please try.

Answer 74 · 2024-05-28T07:51:50.000Z

小滨小滨,小冰小冰: wakenet9l_tts1h8_小滨小滨,小冰小冰_3_0.614_0.623

Performance:
FAR(False Alarm Rate): 1 time / 8 hours
RAR(Right Alarm Rate): 95%

@kaylyun , 小滨小滨 (xiao3 bin1) and 小冰小冰 (xiao3 bing1) are pronounced similarly. I used both during training, so you can choose either one as the wake word.

Answer 75 · 2024-05-30T06:42:53.000Z

Thanks a lot for this wake word. @usama1123456789 and I are really grateful to you!

Answer 76 · 2024-05-30T08:28:15.000Z

I think the community should think about wake words that are likely to be effective in multiple languages. Another example I tried recently that worked quite well with the mn9 voice command is "my my" - "mai mai" - "麦麦" or "mimi" - "mimi" - "咪咪". The languages in order are English - Vietnamese - Chinese (sorry if my Chinese is wrong).
It is similar to the suggestion "lily" 🥇

Answer 77 · 2024-05-31T03:18:17.000Z

What about using the IoT device type as the wake word, like "hi, air condition", "hi, purifier", "hi, humidifier" and so on?

Answer 78 · 2024-05-31T03:27:58.000Z

@Oreobird In theory it is possible but I think it should not be. It will be fine if your room only has an air conditioner, but if there is an additional TV, water heater, cleaning robot,... everything will be very chaotic and cause difficulties for users😂

The process would be wake word -> speech command

Answer 79 · 2024-06-03T02:56:41.000Z

I strongly agree with your reply. The issue of waking up multiple devices has always been a key problem in voice recognition. Anyway, what I think is that using device types as wake words might be useful for demonstration or demo scenarios.

Answer 80 · 2024-06-07T03:35:29.000Z

Can you help me train "嗨，小鱼/Little Fish"?
Thanks a million.

Answer 81 · 2024-06-10T21:21:49.000Z

Hey, it would be cool for movie fans to have a "HAL" wake word, from the movie 2001: A Space Odyssey https://www.youtube.com/watch?v=ARJ8cAGm6JE

Answer 82 · 2024-06-11T03:41:56.000Z

It sounds cool, but it is currently difficult for TTS to stably generate "HAL" pronunciations.

Answer 83 · 2024-06-24T06:04:06.000Z

你好小智: wakenet9l_tts1h8_你好小智_3_0.631_0.635

Performance:
FAR(False Alarm Rate): 1 time / 24 hours
RAR(Right Alarm Rate): 98%

This model was trained on both human-recorded samples and TTS samples, which gives it higher response accuracy and a lower false alarm rate.

Answer 84 · 2024-07-05T16:07:00.000Z

Can you help with Hi Rico, Hello Rico, Rico同学？

Many Thanks！！！

Answer 85 · 2024-07-07T14:28:30.000Z

Suggested wake word: 游戏管家

Answer 86 · 2024-07-09T02:35:55.000Z

Can you help with Hi Rico, Hello Rico, Rico同学？

Many Thanks！！！

I recommend using Hi Rico

Answer 87 · 2024-07-10T13:58:20.000Z

Great! Thanks!

By the way, can I train the model myself? Are the source code and documentation for the model available?

Answer 88 · 2024-07-11T02:18:30.000Z

The training script is not yet open source. If you want to deploy your own model, you can use the esp-dl project.

Answer 89 · 2024-07-11T09:19:13.000Z

SR is a fantastic project! It greatly boosts the UI/UX design capabilities of our product.
Please help us train the following wake words:
Hey, Telly
泰力泰力

Many many thanks!

Answer 90 · 2024-07-31T02:49:05.000Z

Currently, we support over 20 wake words. You can choose any one wake word to test. Starting from August 1, 2024, to get a new wake word, you'll need to meet one of these requirements:

If you've got an ongoing project, kindly attach the project link along with a brief overview when submitting your request.
Your wake word has been liked or upvoted by more than five people.

We are preparing to upgrade to a new TTS model and generate some wake word models with better performance.

Answer 91 · 2024-08-01T08:31:09.000Z

Very excited to hear about the TTS model improvements along with the wake up model for better performance especially with RAR and speed.
And it would be great if you could retrain "Hi, Lily", I can report more details on the performance related changes✌️

Answer 92 · 2024-08-06T08:13:17.000Z

@blessalanou

Hi,Telly/Hi,泰力: wakenet9l_tts1h8_Hi,Telly or Hi,泰力_3_0.613_0.619

Performance:
FAR(False Alarm Rate): 1 time / 24 hours
RAR(Right Alarm Rate): 94%

The training data for this model is similar to that of the Hi Lily wake word and includes both English "Hi, Telly" and Chinese "Hi, 泰力".

Answer 93 · 2024-08-14T09:49:57.000Z

Is it possible for esp-sr or esp-skainet or esp-adf to provide an interface or process for custom wake-up word functionality? For example, by users recording the wake-up word and training a model via a training script deployed in the cloud, then updating it to the device.

Answer 94 · 2024-08-14T10:22:34.000Z

As far as I know, not yet.
Before I knew about esp-adf and esp-sr, I used custom wakeword using a model created by python + tensorflow, then quantized the model and used tflite (tensorflow for microcontrollers). The model ran successfully, but maybe due to optimization issues, it consumed a lot of ram and memory (you can run the model on any microcontroller this way, of course if you have enough resources).
I probably would have continued to optimize using tflite until I discovered that ESP provides ESP-DL, which also allows model deployment with hardware support.
You can look into it starting from what I've described.

Answer 95 · 2024-08-14T11:11:44.000Z

Deployment in the cloud is a bit difficult for us and is not in our plans. I think esp-dl might be a solution, if you want to deploy a model of your own, I recommend you to use it.
Good news, we are refactoring esp-dl so that esp-dl can directly load quantized models, just like you use onnx and pytorch.

Answer 96 · 2024-09-10T06:26:22.000Z

Looks like there is already a wake-up word in English: Hey,Wand. Can we have a Chinese version? e.g. 神奇魔仗

Answer 97 · 2024-09-28T09:38:32.000Z

We are using Espressif's ESP32S3 chip to create a small wizard dialogue toy that can provide great emotional value and companionship.
We look forward to your help in training the following wake words.
“Hi，小巫”

Answer 98 · 2024-09-29T02:30:29.000Z

Sounds great, I'm happy to help train a "Hi，小巫" wake word.

Answer 99 · 2024-10-02T23:21:31.000Z

We are working on a patient-side voice assistant for the healthcare space. We desperately need help training the English branded wake word, "Hey, Henry". We are currently testing with the ESP-BOX-S3. Many thanks in advance.

Answer 100 · 2024-10-31T18:16:25.000Z

sudo or hey sudo would be cool