rhasspy/rhasspy3

I have the same idea as you.

cgisky1980 opened this issue · 5 comments

I have the same idea as you.
The core is the WebSocket API.

I'm already working on it.

The key is to define a communication protocol on top of WebSocket.
Anything that follows this protocol can be written in any programming language.
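
To make that concrete, here is a rough sketch of what one message in such a protocol could look like, sent as a JSON text frame. The field names (`type`, `id`, `payload`) and the event names are made up for illustration; they are not taken from rhasspy3, SEPIA, or any existing protocol.

```python
import json
import uuid

# Hypothetical event types, invented for this sketch.
KNOWN_TYPES = {"audio-start", "audio-stop", "transcript", "intent", "error"}


def encode_message(msg_type, payload, msg_id=None):
    """Serialize one message to a JSON string (one WebSocket text frame)."""
    if msg_type not in KNOWN_TYPES:
        raise ValueError(f"unknown message type: {msg_type}")
    return json.dumps({
        "type": msg_type,
        "id": msg_id or uuid.uuid4().hex,  # lets a reply refer back to a request
        "payload": payload,
    })


def decode_message(frame):
    """Parse a received text frame and check the required fields."""
    msg = json.loads(frame)
    if not isinstance(msg, dict):
        raise ValueError("message must be a JSON object")
    for key in ("type", "id", "payload"):
        if key not in msg:
            raise ValueError(f"missing field: {key}")
    if msg["type"] not in KNOWN_TYPES:
        raise ValueError(f"unknown message type: {msg['type']}")
    return msg
```

Because the frame is plain JSON, a client in any language only needs a WebSocket library and a JSON parser to take part.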

I don't think we need a wake word.
The assistant should be able to distinguish whether we are speaking to him or not.
Of course, this is a challenge.
He should make intelligent judgments according to the situation and decide the next step.
It is within people's control.

You might also be interested in SEPIA: https://sepia-framework.github.io/
They have a more developed WebSocket protocol.

The assistant should be able to distinguish whether we are speaking to him or not?
I think this would require specific hardware. If the microphone could judge the direction the audio is coming from, you'd have a better idea.

Array microphones can do that. I think we could also use a camera to judge the direction of the eyes.
Think about how you can tell if another person is talking to you.
I think we can make this interaction process more human.
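
For the microphone part, a minimal sketch of the idea, assuming just two microphones, a far-away speaker, and a time delay between the two channels that has already been measured somehow. The function names and the 20 degree tolerance are hypothetical. It only says where the sound comes from, not whether the person is actually addressing the assistant.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def arrival_angle_deg(delay_s, mic_spacing_m):
    """Angle of the sound source for a two-microphone pair (far-field model).

    0 means straight ahead (equal distance to both microphones);
    +90 / -90 means the sound arrives along the line joining them.
    """
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp measurement noise into asin's domain
    return math.degrees(math.asin(ratio))


def is_from_front(delay_s, mic_spacing_m, tolerance_deg=20.0):
    """True if the sound arrives roughly from in front of the device."""
    return abs(arrival_angle_deg(delay_s, mic_spacing_m)) <= tolerance_deg
```

A real array would use more microphones and estimate the delay itself, for example by cross-correlating the channels.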

Thanks. SEPIA looks very interesting. I will check it out.
I'm also interested in Pusher: https://pusher.com/docs/channels/channels_libraries/libraries/
However, some of its details do not fit the IoT scenario.
Still, we can use it as a reference when writing our own version of the protocol.