alibaba/AliOS-Things

[SoC2019]Speech algorithm based on AliOS Things

Closed this issue · 12 comments

  • Description
    At present, most speech recognition systems send voice data to the cloud for processing, which inevitably introduces problems such as network latency and data security risks. In contrast, on-device speech recognition offers high reliability, high security, and fast response. On the other hand, AliOS Things usually runs on MCUs with low CPU clock frequencies and very limited storage, which places high demands on the algorithm's complexity, resource usage, and accuracy.

  • Goal
    Develop offline voice capabilities based on AliOS Things, including but not limited to speech recognition, voiceprint recognition, and speech synthesis.

  • Reference
    Voiceprint recognition algorithm based on AliOS Things
    Based on AliOS Things, implement a voiceprint recognition algorithm on the device-side MCU that, given one pre-recorded clip in PCM format, determines whether the voice currently being recorded belongs to the same person. The pre-recorded PCM clip is about 10 seconds long, the result should be available about 5 seconds after recording ends, and the accuracy should be above 80%. The choice of algorithm is open: traditional speech recognition techniques may be used, or a lightweight AI framework may be integrated. However, the algorithm must run on the device-side MCU, and the voice data must not be uploaded to the cloud for the decision.
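
    As a rough illustration of the requirement above, the following C sketch shows two things: how much memory a 10-second PCM clip would need, and one possible way to decide "same speaker or not" by comparing two feature vectors with a cosine-similarity threshold. Everything here is an assumption made for illustration, not part of the issue: the 16 kHz / 16-bit mono format, the function names `pcm_bytes` and `is_same_speaker`, and the threshold-based rule. Feature extraction (for example, averaged spectral features) is out of scope and is assumed to have already produced the two vectors.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed recording format (not specified in the issue): 16 kHz, 16-bit mono PCM. */
#define SAMPLE_RATE_HZ   16000u
#define BYTES_PER_SAMPLE 2u

/* Bytes needed to hold `seconds` of PCM audio in the assumed format. */
static uint32_t pcm_bytes(uint32_t seconds)
{
    return seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE;
}

/*
 * Hypothetical decision rule: report "same speaker" when the cosine similarity
 * between the enrolled and live feature vectors reaches `threshold`.
 * `threshold` is assumed to lie in (0, 1].
 *
 * cos = dot / sqrt(ne * nl), so for a positive threshold
 *   cos >= threshold  <=>  dot > 0  and  dot^2 >= threshold^2 * ne * nl,
 * which avoids calling sqrt on the MCU.
 */
static int is_same_speaker(const float *enrolled, const float *live,
                           size_t n, float threshold)
{
    float dot = 0.0f; /* inner product of the two vectors */
    float ne  = 0.0f; /* squared norm of the enrolled vector */
    float nl  = 0.0f; /* squared norm of the live vector */

    for (size_t i = 0; i < n; i++) {
        dot += enrolled[i] * live[i];
        ne  += enrolled[i] * enrolled[i];
        nl  += live[i] * live[i];
    }

    /* Non-positive similarity (including all-zero vectors) never matches. */
    if (dot <= 0.0f) {
        return 0;
    }
    return dot * dot >= threshold * threshold * ne * nl;
}
```

    Under the assumed format, `pcm_bytes(10)` is 320000 bytes, which already exceeds the RAM of many small MCUs. That suggests extracting features frame by frame while recording rather than buffering the whole clip. The threshold would have to be tuned on real recordings to reach the 80% accuracy target.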

MRNIU commented

Hello, I would like to work on this idea.
I have some questions:

  • Could I write my proposal in Chinese?
  • Will the mentor give advice on revising the proposal?
  • Is there any chance to modify the proposal after submission?
  • Is it necessary to have submitted a PR to this project beforehand?

I cannot guarantee my working hours, so I cannot apply for ASoC, but I really like this proposal and have some experience with speech algorithms. Could I contribute code to this idea for free, or can only ASoC students contribute code?

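@MRNIU Hi, thank you very much for your interest and participation.
#1 The proposal can be written in either Chinese or English; the form is not important.
#2 After you submit a proposal, the mentor will review it and will also discuss specific points with you to help you improve it.
#3 The first proposal you submit will certainly not be the final plan, so changes are definitely expected.
#4 Whether you have submitted PRs before is not important; what matters is this project itself.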

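Glad to see your attention to and interest in this issue. Whether or not you can take part in this event does not affect your contributing to AliOS Things; as always, contributions are very welcome.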

MRNIU commented

@Hunter1990 Thanks for the reply. I will submit my proposal to the designated page soon.

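Hello, I am very interested in the "voiceprint recognition" mentioned in the proposal, but I have no experience in speech processing. I have the following questions:

  • If I want to take part in this project, could you suggest a direction for approaching the voiceprint recognition problem?
  • Regarding the algorithm implementation mentioned in the proposal: does it mean designing an algorithm ourselves, or implementing an existing algorithm on the MCU?
  • In theory, is it feasible to implement speech processing on a low-performance MCU?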

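I think technical questions are not yet suitable for discussion here; it is better to look things up (baidu, google) or try things yourself. :)
These questions are exactly what ASoC is supposed to solve.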

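You can use an open-source algorithm or one you design yourself; it only needs to end up working on the MCU.
There are already some speech processing algorithms implemented on MCUs, which you can google. As for the voiceprint recognition in this proposal, since it only requires matching one person's voice and the time window is fairly generous, it is feasible in theory. You can also search online: there are papers on implementing similar functionality on microcontrollers or MCUs that you can refer to.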

MRNIU commented

After studying this problem carefully, I feel this idea is not a good fit for me.
Good luck to anyone who wants to take it on.

Has any proposal for this idea been accepted?

Registration has closed, and the proposals are under review.

What a loss... I only found out today that there was a project on acoustic models... It was hidden so deep...