WebAssembly/WASI

Use case: Streaming audio (to web platform)

guest271314 opened this issue · 8 comments

Charter https://github.com/WebAssembly/WASI/blob/master/Charter.md#webassembly-system-interface-subgroup-charter
includes the goal

  • APIs for graphics, audio, input devices

System architecture: 32-bit; OS: Linux.

The use case is capturing system audio output, which is currently not possible in the Chromium browser without using pavucontrol or another workaround for https://chromium-review.googlesource.com/c/chromium/src/+/1064373/.

Created one of several workarounds using Native Messaging. The problem is there is no simple way to stream output from bash to JavaScript.

Given execution of a bash script

parec --raw -d alsa_output.pci-0000_00_1b.0.analog-stereo.monitor | split -d -b 512 --additional-suffix=".s16le" --suffix-length=8 - ../app/""

AudioWorkletProcessor executes process() approximately 344-384 times per second. Am able to parse the files to Float32Arrays; however, process() executes faster than the 512-byte files can be written to and read from the local filesystem, so no data is available once all existing files have been read (and deleted) while capturing and writing are still ongoing. Message propagation to and from Native Messaging (which is text-based) also takes time. In brief: race conditions.
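
For context, that call rate follows from the 128-frame render quantum of the Web Audio API; a quick back-of-the-envelope check (the sample rates below are assumptions about the capture device):

    // process() runs once per 128-frame render quantum (Web Audio API default)
    const quantum = 128;
    for (const sampleRate of [44100, 48000]) {
      console.log(`${sampleRate} Hz -> ${(sampleRate / quantum).toFixed(1)} calls per second`);
    }
    // 44100 Hz -> 344.5 calls per second; 48000 Hz -> 375.0 calls per second,
    // in the ballpark of the observed 344-384 calls per second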

Native Messaging extension code where externalPort is a port to an arbitrary web page

const id = 'native_messaging_file_stream';
const port = chrome.runtime.connectNative(id);
chrome.runtime.onConnectExternal.addListener(externalPort => {
  // forward messages from the native host to the connected web page
  const handleMessage = message => {
    console.log(message);
    externalPort.postMessage(message);
    if (message.done) {
      chrome.runtime.reload();
    }
  };
  port.onMessage.addListener(handleMessage);
  // forward commands from the web page to the native host
  externalPort.onMessage.addListener(nativeMessage => {
    console.log({nativeMessage});
    port.postMessage(nativeMessage);
  });
});

Node.js Native Messaging host

#!/usr/bin/env node
// https://github.com/simov/native-messaging
const { exec } = require('child_process');
const sendMessage = require('./protocol')(handleMessage);

// Executes the shell command received from the extension and reports start/completion.
function handleMessage(req) {
  if (req.slice(0, 5) === 'parec') {
    sendMessage({ done: false, req });
  }
  const nativeMessage = exec(req);
  nativeMessage.stdout.on('data', data => {
    // sendMessage(data);
  });
  nativeMessage.stderr.on('data', err => {
    // sendMessage({err});
  });
  nativeMessage.on('close', data => {
    if (req === 'killall -9 parec') {
      sendMessage({ done: true, req });
    }
  });
}
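
For reference, Chrome's Native Messaging protocol frames every message as UTF-8 JSON preceded by a 32-bit message length in native byte order, which is what the './protocol' module referenced above handles. A minimal sketch of that framing, not the module's actual code (little-endian assumed, matching the Linux host described above):

    // Minimal sketch of Native Messaging framing (little-endian host assumed):
    // each message is a 4-byte length header followed by UTF-8 JSON.
    function sendMessage(message) {
      const json = Buffer.from(JSON.stringify(message), 'utf8');
      const header = Buffer.alloc(4);
      header.writeUInt32LE(json.length, 0);
      process.stdout.write(Buffer.concat([header, json]));
    }

    let buffered = Buffer.alloc(0);
    process.stdin.on('data', chunk => {
      buffered = Buffer.concat([buffered, chunk]);
      while (buffered.length >= 4) {
        const length = buffered.readUInt32LE(0);
        if (buffered.length < 4 + length) break;  // wait for the rest of the message
        handleMessage(JSON.parse(buffered.slice(4, 4 + length).toString('utf8'))); // handleMessage as above
        buffered = buffered.slice(4 + length);
      }
    });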

Part of the JavaScript on the web page that reads the files written to the local filesystem by the bash script (parec and split - split is used because it is not currently possible, using Native File System, to read a file on the local filesystem while it is being written without reading the entire file). The ReadableStream is transferred to the AudioWorkletProcessor, where the raw data is read and parsed into Float32Arrays.

    // dir (FileSystemDirectoryHandle), n (file counter), controller and aw
    // (AudioWorkletNode) are defined in the surrounding code
    const readable = new ReadableStream({
      async start(c) {
        controller = c;
        // allow parec | split time to write the first files
        return await new Promise(resolve => setTimeout(resolve, 1000));
      },
      async pull(c) {
        try {
          // split writes zero-padded, 8-digit file names, e.g. 00000000.s16le
          const fileHandle = await dir.getFileHandle(`${n}.s16le`.padStart(14, '0'), {
            create: false,
          });
          const fileBit = await fileHandle.getFile();
          c.enqueue(await fileBit.arrayBuffer());
          // comment out the next line to keep the file on the local filesystem
          // after enqueuing its contents
          await dir.removeEntry(fileBit.name);
          ++n;
        } catch (err) {
          // DOMException or TypeError at the request for file n+1
          // that does not yet exist in the directory
          if (err instanceof DOMException || err instanceof TypeError) {
            console.warn(err);
          } else {
            console.error(err);
          }
          c.close();
        }
      },
    });
    aw.port.postMessage({ readable }, [readable]);
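
For completeness, a sketch of the worklet side under the assumptions above: the processor receives the transferred ReadableStream, buffers the interleaved s16le samples, and converts them to Float32 in process(). The class name, the two-channel output, and the buffering strategy are illustrative, not the actual AudioWorkletStream code:

    // s16le-stream-processor.js - illustrative sketch, not the actual AudioWorkletStream code
    class S16leStreamProcessor extends AudioWorkletProcessor {
      constructor() {
        super();
        this.samples = []; // decoded floats, interleaved L, R, L, R, ...
        this.port.onmessage = async ({ data: { readable } }) => {
          const reader = readable.getReader();
          let leftover = new Uint8Array(0);
          for (;;) {
            const { value, done } = await reader.read();
            if (done) break;
            // rejoin any odd byte from the previous chunk so Int16 alignment holds
            const bytes = new Uint8Array(leftover.length + value.byteLength);
            bytes.set(leftover);
            bytes.set(new Uint8Array(value), leftover.length);
            const even = bytes.length - (bytes.length % 2);
            for (const sample of new Int16Array(bytes.buffer, 0, even / 2)) {
              this.samples.push(sample / 32768);
            }
            leftover = bytes.subarray(even);
          }
        };
      }
      process(inputs, outputs) {
        const [left, right = left] = outputs[0]; // assumes outputChannelCount: [2]
        for (let i = 0; i < left.length; i++) {
          left[i] = this.samples.length ? this.samples.shift() : 0;
          right[i] = this.samples.length ? this.samples.shift() : 0;
        }
        return true;
      }
    }
    registerProcessor('s16le-stream-processor', S16leStreamProcessor);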

Created a WebAssembly.Memory instance for a version of the AudioWorklet that is used to output audio; however, Memory.grow() has issues with respect to JavaScript and TypedArrays (tc39/test262#2719), so we cannot begin at 1 page and then grow the SharedArrayBuffer to suit the use case of capturing an arbitrary amount of audio data; we need to allocate N pages of memory beforehand.
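
Concretely, the grow() issue comes down to the fact that growing a shared WebAssembly.Memory does not resize existing views: the previously obtained SharedArrayBuffer keeps its old length, memory.buffer returns a new, larger SharedArrayBuffer, and every TypedArray view has to be recreated after each grow. A small sketch of that behavior (sizes are arbitrary):

    // Growing a shared WebAssembly.Memory and refreshing views (sizes arbitrary)
    const memory = new WebAssembly.Memory({ initial: 1, maximum: 256, shared: true });

    let view = new Float32Array(memory.buffer);  // view over the initial 64 KiB page
    console.log(view.length);                    // 16384 (65536 bytes / 4)

    memory.grow(1);                              // add one 64 KiB page

    // The old SharedArrayBuffer is not detached, but it keeps its old length;
    // memory.buffer now returns a new, larger SharedArrayBuffer.
    console.log(view.length);                    // still 16384
    view = new Float32Array(memory.buffer);      // recreate the view after grow()
    console.log(view.length);                    // 32768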

Is it possible to use WASI to write the stream directly to a single Memory, grow it when necessary in WASI/WebAssembly, then read that memory to produce audio?

Basically, what am considering is using this code https://github.com/guest271314/AudioWorkletStream/blob/shared-memory-audio-worklet-stream/index.html where

          const memory = new WebAssembly.Memory({
            initial: Math.floor(length / 65536) + 1,
            maximum: Math.floor(length / 65536) + 1,
            shared: true,
          });

is replaced with

          const memory = new WebAssembly.Memory({
            initial: 1,
            maximum: Math.floor(length / 65536) + 1,
            shared: true,
          });

in WASI - and the system audio capture is performed in WASI - and the single Memory is directly readable in AudioWorkletGlobalScope, eliminating the need to write N files and use messaging altogether.
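
One way to make a single growable shared Memory readable from AudioWorkletGlobalScope without per-file messaging is to reserve a small header in the memory for a write index that the producer updates with Atomics, so the worklet only reads samples that have already been committed. A rough sketch of the reader side, assuming the first 4 bytes hold an Int32 sample count and interleaved s16le samples follow (the layout and function names are assumptions, not an existing API):

    // Reader sketch for AudioWorkletGlobalScope; layout is an assumption:
    //   bytes 0-3: Int32 count of committed samples, written with Atomics.store()
    //   bytes 4..: interleaved s16le samples
    let header, samples, readOffset = 0;

    function attach(memory) {
      // must be called again after the producer grows the memory,
      // because grow() does not resize existing views (see the sketch above)
      header = new Int32Array(memory.buffer, 0, 1);
      samples = new Int16Array(memory.buffer, 4);
    }

    function readAvailable(frames) {
      const committed = Atomics.load(header, 0);  // samples the producer has published
      const out = new Float32Array(frames * 2);   // zero-filled on underrun
      for (let i = 0; i < out.length && readOffset < committed; i++, readOffset++) {
        out[i] = samples[readOffset] / 32768;
      }
      return out;
    }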

The behavior of the core wasm memory.grow instruction and its relationship with JS' SharedArrayBuffer, TypedArray, and AudioWorkletGlobalScope are out of scope for WASI.

Also, WASI is not currently implemented in browsers, so if your main use case is that you want to do something in a browser and can't due to limitations of browser APIs, WASI won't be a very efficient venue for addressing your use case.

The use case is running WASI (any code) locally where STDOUT is written to shared memory.

Yes, to overcome browser limitations is the goal.

Native Messaging has limitations. Specifications and browser APIs have limitations, some per specification, some per implementation decisions.

Right now have no evidence to formulate a rational conclusion either way. Here, test code at least thousands of times before determining whether a concept is viable or not; in brief, according to the principles at https://gist.githubusercontent.com/guest271314/1fcb5d90799c48fdf34c68dd7f74a304/raw/c06235ae8ebb1ae20f0edea5e8bdc119f0897d9a/UseItBreakItFileBugsRequestFeaturesTestTestTestTestToThePointItBreaks.txt.

There is no way to conclusively determine if a means of achieving a result is possible without testing.

Am here asking the question to discern where to begin testing WASI for the specific use case described.

What does WASI do? How is the goal of audio processing achieved using WASI?

Or, is WASI a proof-of-concept right now?

In which case, wide-ranging testing of use cases should be consistent with the stated goal.

The main use cases for WASI at present are in non-Web embeddings of WebAssembly. Current WASI APIs support features such as filesystem access, clocks, random numbers, and command-line arguments. Audio processing is a goal, and it has not yet been achieved.
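
As an illustration of the non-Web embedding point, Node.js ships a WASI implementation covering those same APIs (filesystem, clocks, random numbers, command-line arguments). A minimal sketch of running a module against it; the module path and preopen mapping are placeholders, and older Node releases need the --experimental-wasi-unstable-preview1 flag:

    // run-wasi.js - minimal sketch of a non-Web WASI embedding in Node.js
    'use strict';
    const { readFile } = require('node:fs/promises');
    const { WASI } = require('node:wasi');

    (async () => {
      const wasi = new WASI({
        version: 'preview1',                 // required on newer Node releases
        args: process.argv.slice(2),
        env: process.env,
        preopens: { '/sandbox': './app' },   // placeholder directory mapping
      });
      const wasm = await WebAssembly.compile(await readFile('./module.wasm')); // placeholder path
      const instance = await WebAssembly.instantiate(wasm, {
        wasi_snapshot_preview1: wasi.wasiImport,
      });
      wasi.start(instance);                  // runs the module's _start export
    })();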

Well, the local filesystem is a "non-Web embedding of WebAssembly". The use case described includes filesystem access. What to do with the files WASI generates is not what was asked.

If you do not test your own goals when a use case is shared with you, then that guarantees the goal will not be achieved, unless the concept is in-house, selective use-case tests rather than testing the stated goals in all forms in order to find bugs before bugs are found in the wild.

Do not do any begging here. Ask questions directly. And since individuals and institutions are generally anti Free Speech, they would rather label, ban, and censor than deal with facts and science; am already moving on to the next means of trying to achieve own goals. In the industries that purvey wares there is no waiting on anyone else to do anything. Spare none immediate refutation of false claims and false advertising.

If you are operating a closed-loop dialogue here, where input is reviewed for what we do not want rather than for how we can work together to achieve mutually beneficial goals, then your perspective is deliberately limited and incapable of being expanded beyond the closed-loop system that you have created, mathematically proven by Gödel nearly 100 years ago:

Second incompleteness theorem
For any consistent system F within which a certain amount of elementary arithmetic can be carried out, the consistency of F cannot be proved in F itself.

Just don't write that you are serious about WASI and audio processing if, when a user in the wild brings a use case to your repository, you say you have not achieved it yet. Am willing to test. Am not willing to wait on you, or any other individual or institution, to do anything. The only way to achieve that goal is by testing.

Read this very carefully

... so we need people to have weird new ideas... we need more ideas to break it and make it better...

Use it
Break it
File bugs
Request features

  • Real time front-end alchemy, or:
    capturing, playing, altering and encoding video and audio streams, without servers or plugins!
    by Soledad Penadés

von Braun believed in testing. I cannot emphasize that term enough – test, test, test.
Test to the point it breaks.

  • Ed Buckbee, NASA Public Affairs Officer, Chasing the Moon

Now watch. Um, this is how science works. One researcher comes up with a result. And that is not the truth. No, no. A scientific emergent truth is not the result of one experiment. What has to happen is somebody else has to verify it. Preferably a competitor. Preferably someone who doesn't want you to be correct.

  • Neil deGrasse Tyson, May 3, 2017 at 92nd Street Y

@sunfishcode Have made some progress after testing. Re-phrasing the question: given a shell command

$ command > output.file

is it possible to do something like

$ command | WASI that writes directly to Memory (that dynamically grows) from piped STDOUT instead of to a local file

and import that same Memory into JavaScript for reading during the write?
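
Absent a WASI answer, the shape of the writer side can at least be sketched in plain Node.js (not WASI): read the piped STDOUT from stdin, grow a shared WebAssembly.Memory page by page as data arrives, and publish the committed sample count with Atomics using the same assumed header layout as the reader sketch above. How that Memory would then be imported into a browser page is exactly the open question here:

    // write-to-memory.js - usage: parec --raw -d <monitor> | node write-to-memory.js
    // Plain Node.js sketch, not WASI; header layout matches the reader sketch above.
    const memory = new WebAssembly.Memory({ initial: 1, maximum: 1024, shared: true });
    let bytesWritten = 4;                     // s16le data starts after the 4-byte header

    process.stdin.on('data', chunk => {
      // grow one 64 KiB page at a time until the chunk fits
      while (bytesWritten + chunk.length > memory.buffer.byteLength) {
        memory.grow(1);
      }
      new Uint8Array(memory.buffer).set(chunk, bytesWritten);
      bytesWritten += chunk.length;
      // publish the number of committed s16le samples for the reader
      Atomics.store(new Int32Array(memory.buffer, 0, 1), 0, (bytesWritten - 4) >> 1);
    });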

The specifics of how wasm and JS language features interact are out of scope for WASI. This forum is for WASI development, so your question is off topic.

@sunfishcode FWIW, a formal proposal: https://bugs.chromium.org/p/chromium/issues/detail?id=1115640. Note that the WASI mission statement and mandate are not out of scope there, as the native code or shell script can be a WASI implementation of TransformStream and transferable streams; all that WASI is concerned with is that implementation, specifically exposing what is essentially STDOUT. The JavaScript side is responsible for interacting with the "backend" (code in the directories the user grants permissions to), which can be written in any programming language. While it would be beneficial for both "sides" to work together to achieve the expected result, as long as each "side" does what it says it is going to do, the expected result will still be achievable.