xiph/rnnoise

Releasing RNNoise version 0.2

Opened this issue · 15 comments

This new release brings many improvements, including improved training, SSE4.1 and AVX2 optimizations, and run-time CPU detection (RTCD).

The distributed models are now trained using only publicly available datasets.

This release expects the build process to execute a arbitrary-shellscript to download an arbitrary file as part of the build process; outside of cmake, in a potentially privileged environment.

If I can find the time, I will PR fix for your Cmake or alike; I'm sorry I'm not doing it now - because it's very exciting to me this project is moving again.

But as it stands, this is a pretty big hole for practical secure CI-pipelines and reliability in isolated or secure environments.
In short - do not do this.

THAT SAID:

I'm honestly glad this project is getting some much needed attention; but please remember that it's been relied upon for various uses, on various platforms for a very long time now.

If your not considering the need to build this as a standalone Risc-V library for a non-linux operating-system; You should be.
It may need to be built without internet access.
And on windows XP; to support cable TV stations using it with Gstreamer.

And many, many more that I'm aware of.

Keep up the good work, do good things, and have pride - just don't forget the use-cases beyond your own. ;)

So there's two things here: one is the download script and the other is the actual content being fetched. In terms of the script itself, it's fairly simple to audit and see that it's not doing anything bad. It's also in git and signed, so I don't really see a trust issue compared to trusting the rest of the source code.

When it comes the the actual model data being fetched, I can see how it's not as safe as the rest of the Git content (sha1 and signature). One possible way to solve that would be the equivalent to what was just added to Opus:
xiph/opus@9faf6f071

The idea Is that Git will hold the sha256 of the downloaded content and thus it can check that the checksums actually match before using it. That way compromising the server will just result in a build failure.

The idea Is that Git will hold the sha256 of the downloaded content and thus it can check that the checksums actually match before using it. That way compromising the server will just result in a build failure.

This is without question an acceptable solution - I come from a version of linux called NixOS; our package manager does this for all external downloads as part of the build step; ( which is how I came to be here: NixOS/nixpkgs#304433 )

The shellscript itself is an acceptable compromise if an external download is absolutely required; in my personal ideal external data would be fetched by git submodule (providing a solid revision control), or the dependency treated as such.

It seems the model for RNNoise has always been a dependency;

@petterreinholdtsen Do you have alternatives to suggest for distributing a 20 MB binary file? In version 0.1, the model was much smaller (and it still wasn't great) and didn't include the binary model (which is the preferred form for making modifications in that case).

All package managers I know of have the ability to specify multiple source packages - I'd expect them to specify the model data as one of the sources. The nixpkgs approach seems good.

Having a hash of the data in git should be enough for revision control to ensure that it has the same version of the model as the code was designed for. Submodules remain tricky to use, especially because the model is large and most people would want a shallow clone of it.

OK, the download process should be safer now as the download script will verify the sha256 before using the downloaded binary file.

[Thomas Daede]
[SNIP]
Note, for release tarballs, such approach will break the deserted island test of free software.
[SNIP]
-- Happy hacking Petter Reinholdtsen

I have to agree with this statement - although one would hope that the dependency will find it's way onto a debian disk too ;)

Unless I missed something, the RNNoise 0.2 passes the desert island test fine. The download mechanism only applies to Git.