Live streaming audio output
Closed this issue · 17 comments
Hi there,
I came across your library when looking for a way to stream audio from my PC via a browser.
Despite efforts with WebRTC, and modifying the codec/quality, it still sounds terrible.
I was thinking of using a library such as yours, then streaming the output with opus-stream-decoder, via https://github.com/AnthumChris/fetch-stream-audio.
Is that possible? Or can you suggest another way of achieving this?
Thanks
> Despite efforts with WebRTC, and modifying the codec/quality- it still sounds terrible.
Yes, I concur. Using `navigator.mediaDevices.getDisplayMedia({video: true, audio: true})` (then removing the video track) produces sub-par audio output.
> I was thinking of using a library such as yours, then streaming it using opus-stream-decoder, via https://github.com/AnthumChris/fetch-stream-audio.
> Is that possible? Or can you suggest another way of achieving this?
Yes, that is possible.
Currently the PCM is captured and played in the browser using `MediaStreamTrackProcessor` to get the correct cadence and timestamps for input to `MediaStreamTrackGenerator`. Output could be piped through `opusenc`.
Interestingly, I have been experimenting with comparing `opusenc` and `opusdec` to WebCodecs `AudioEncoder` and `AudioDecoder` over the past several days (https://bugs.chromium.org/p/chromium/issues/detail?id=1254496#c32), and I have reached a similar conclusion about the quality of WebCodecs being sub-par compared to `opusdec`. My main purpose, though, was demonstrating that `opusdec` decodes to the original input sample rate, where `AudioDecoder` does not; it instead outputs PCM at a 48000 Hz sample rate, which Opus uses internally.
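For context on the sample-rate point: Opus only operates internally at 8000, 12000, 16000, 24000, or 48000 Hz, so an encoder front-end must pick one of those rates for arbitrary input (44100 Hz material, for example, gets resampled to 48000 Hz). A minimal sketch of that mapping (the function name is my own, not part of any library):

```javascript
// Sample rates Opus supports internally (RFC 6716).
const OPUS_RATES = [8000, 12000, 16000, 24000, 48000];

// Pick the lowest supported Opus rate that is >= the input rate,
// falling back to 48000 for anything above 24000 Hz.
function nearestOpusRate(inputRate) {
  for (const rate of OPUS_RATES) {
    if (rate >= inputRate) return rate;
  }
  return 48000;
}

console.log(nearestOpusRate(44100)); // 48000
console.log(nearestOpusRate(16000)); // 16000
```

This is why `AudioDecoder` hands back 48000 Hz PCM even when the original input was 44100 Hz.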
If I understand your use case correctly, are you trying to stream the system audio output, or specific playback output, to a different browser, or server?
If the use case is streaming to a file, the code already does that using `MediaRecorder`:
```javascript
var audioStream = new AudioStream(
  `parec -d alsa_output.pci-0000_00_1b.0.analog-stereo.monitor`
);
// audioStream.mediaStream: live MediaStream
audioStream
  .start()
  .then((ab) => {
    // ab: ArrayBuffer representation of WebM file from MediaRecorder
    console.log(
      URL.createObjectURL(
        new Blob([ab], {
          type: 'audio/webm;codecs=opus',
        })
      )
    );
  })
  .catch(console.error);
// stop capturing system audio output
audioStream.stop();
```
> If I understand your use case correctly, are you trying to stream the system audio output, or specific playback output, to a different browser, or server?
Thanks.
Either output.
WebRTC kind of handles both options (using a loopback for system audio out), and using 'shareScreen' also lets one share a Chrome tab with audio.
I've figured out how to save the audio blob in good quality; now I need to find a way to "stream" that blob (even if the cost is a latency of a few seconds).
Maybe a server in-between isn't even necessary?
Any ideas on how to achieve that?
> now I need to find a way to "stream" that blob
Stream to where? Locally or remotely?
> Stream to where? Locally or remotely?
Remotely, via the browser.
I’m thinking p2p is out of the question given it could have several “listeners”?
I’ve successfully been able to grab the media stream and convert it to Opus using a polyfill.
I’m just stuck at:
- how do i transmit the “live” file?
- when listeners connect, how can it know which part of the stream to start at?
> - how do i transmit the “live” file?
In `AudioStream` we supply source `AudioData` to `MediaStreamTrackGenerator`, which is an instance of `MediaStreamTrack`. You can pass the track to a WebRTC `RTCPeerConnection` to stream "peer-to-peer". WebRTC encodes to Opus by default (https://plnkr.co/edit/1HsvQh08tYb24810?preview):

```
a=rtpmap:111 opus/48000/2
```
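You can confirm the negotiated Opus payload type by inspecting the SDP yourself. A rough sketch of pulling it out of the `a=rtpmap` lines with plain string parsing (no WebRTC APIs involved; the helper name is my own):

```javascript
// Find the RTP payload type mapped to Opus in an SDP string,
// e.g. "a=rtpmap:111 opus/48000/2" -> 111. Returns null if absent.
function opusPayloadType(sdp) {
  const match = sdp.match(/^a=rtpmap:(\d+) opus\/48000\/2\r?$/m);
  return match ? Number(match[1]) : null;
}

const sdp = [
  'v=0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111 103',
  'a=rtpmap:111 opus/48000/2',
  'a=rtpmap:103 ISAC/16000',
].join('\r\n');

console.log(opusPayloadType(sdp)); // 111
```

In practice you would read `RTCPeerConnection.localDescription.sdp` (or the remote description) and pass it to a helper like this.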
> - when listeners connect, how can it know which part of the stream to start at?
Given 1. is a live stream, there is only one option: the remote peer must receive only the live stream.
Thanks for that.
A couple of questions about your example/AudioStream:

- Being P2P, will that cause a problem when there are "a lot" of listeners on the one AudioStream? I.e., delay/lag?
- If I modified your example to send the user's media screen audio instead, would I instead send a blob like the following?

```javascript
const stream = navigator.mediaDevices.getDisplayMedia({
  video: true,
  audio: true,
});
const blob = await stream.blob();
const buffer = await ac.decodeAudioData(await blob.arrayBuffer());
absn.buffer = buffer;
capture.src = URL.createObjectURL(blob);
})();
```
> - Being P2P, will that cause a problem when there are "a lot" of listeners on the one AudioStream? IE delay/lag?
> - If I modified your example to send the users' media screen audio instead, would I instead send a blob like the following?
No. `getDisplayMedia()` returns a `Promise`, and a `MediaStream` does not have a `blob()` method. You can use WebRTC to connect to a remote peer. Note, the Chromium implementation has a bug where `getDisplayMedia()` mutes the video `MediaStreamTrack` when "Tab" capture is used: https://bugs.chromium.org/p/chromium/issues/detail?id=1099280.
WebRTC also has an example of sending a file with `RTCDataChannel`: https://webrtc.github.io/samples/src/content/datachannel/filetransfer/.
Is the use case live streaming or file transfer?
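On the file-transfer side, that WebRTC sample sends the file over the data channel in fixed-size chunks rather than in one `send()` call. The chunking itself is plain `ArrayBuffer` slicing; a sketch (the 16384-byte chunk size and function name are my assumptions for illustration):

```javascript
// Split an ArrayBuffer into fixed-size chunks, the way a sender
// would before calling RTCDataChannel.send() on each piece.
function chunkBuffer(buffer, chunkSize = 16384) {
  const chunks = [];
  for (let offset = 0; offset < buffer.byteLength; offset += chunkSize) {
    // slice() clamps the end index to the buffer length,
    // so the final chunk may be shorter than chunkSize.
    chunks.push(buffer.slice(offset, offset + chunkSize));
  }
  return chunks;
}

const file = new ArrayBuffer(40000);
const chunks = chunkBuffer(file);
console.log(chunks.length); // 3 chunks: 16384 + 16384 + 7232 bytes
```

Chunking keeps each message under the data channel's message-size limits and lets the receiver reassemble the file incrementally.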
> "a lot" is not specific.
Well, let's say 20 remote listeners are on a single audio track. Will that mean the "streamer" has 20 connections to upload to?
> Is the use case live streaming or file transfer?
This is for live streaming, but of an audio track only.
The aim is to have someone either share their system audio output (via loopback) or a browser tab.
There are different ways to achieve the goal.
See
- https://stackoverflow.com/questions/21788218/webrtc-use-same-sdp-for-multiple-peer-connections
- https://stackoverflow.com/questions/60740054/webrtc-multi-peer-connection-3-clients-and-above
Alternatively, you can create a server, upload your stream to the server and the server can serve the file, which is perhaps what you mean by using a file.
I have been testing streaming from Firefox to Chromium to meet the "20 remote listeners" requirement using WebRTC. It does not achieve the requirement yet. I will continue testing potential options.
> Well, let's say 20 remote listeners are on a single audio track. Will that mean the "streamer" has 20 connections to upload to?
I tested using the same `MediaStream` and a single WebRTC `RTCPeerConnection`. That resulted in only 1 remote peer receiving the stream.
I then created 20 `RTCPeerConnection`s and streamed the same `MediaStream` from Firefox as the source to Chromium.
Were all 20 listeners receiving the audio in high (music listening) quality?
Was there any slow down or lag? Was there any server involved or was it just p2p?
thanks again
> Were all 20 listeners receiving the audio in high (music listening) quality?
All 20 listeners received the audio. I streamed from Firefox to Chromium, so we have to deal with Chromium's WebRTC audio, which can exhibit sub-par quality. I did not use echo cancellation or other constraints which might improve quality.
No slow down or lag. No server involved.
Unfortunately I lost the most recent tests I was running, where I tested all 20 connections on the same page with 20 `<iframe>`s (the screenshot in the previous post) instead of 20 tabs. I should be able to reconstruct that version from what I did save. I use the async clipboard for "signaling".
I'll post one of the working examples I saved before losing the most recent tests.
The following targets Firefox, where we can capture monitor devices using `getUserMedia()`.
offer.html

```html
<!doctype html>
<html>
  <head>
    <meta charset="utf-8">
    <script src="offer.js"></script>
  </head>
  <body>
  </body>
</html>
```
offer.js

```javascript
(async (_) => {
  let config = { offers: [], answers: [] };
  await navigator.clipboard.writeText(JSON.stringify(config));
  let len = config.answers.length;
  const sessions = [];
  let stream = await navigator.mediaDevices.getUserMedia({
    audio: true,
  });
  const label = 'Monitor of Built-in Audio Analog Stereo';
  let [track] = stream.getAudioTracks();
  if (track.label !== label) {
    const device = (await navigator.mediaDevices.enumerateDevices()).find(
      ({ label: _ }) => label === _
    );
    const { deviceId } = device;
    console.log(device);
    track.stop();
    stream = await navigator.mediaDevices.getUserMedia({
      audio: { deviceId: { exact: deviceId } },
    });
    [track] = stream.getAudioTracks();
  }
  const createWebRTCPeerConnection = async (stream, track) => {
    // media.navigator.permission.disabled
    const webrtc = new RTCPeerConnection({
      sdpSemantics: 'unified-plan',
    });
    sessions.push(webrtc);
    [
      'signalingstatechange',
      'iceconnectionstatechange',
      'icegatheringstatechange',
      'negotiationneeded',
    ].forEach((event) => webrtc.addEventListener(event, console.log));
    webrtc.onicecandidate = async (event) => {
      // console.log('candidate', event.candidate);
      if (!event.candidate) {
        let sdp = webrtc.localDescription.sdp;
        if (sdp.indexOf('a=end-of-candidates') === -1) {
          sdp += 'a=end-of-candidates\r\n';
        }
        try {
          config = JSON.parse(await navigator.clipboard.readText());
          config.offers.push(sdp);
          await navigator.clipboard.writeText(JSON.stringify(config));
        } catch (e) {
          throw e;
        }
      }
    };
    const sender = webrtc.addTransceiver(track, {
      streams: [stream],
      direction: 'sendonly',
    });
    const offer = await webrtc.createOffer();
    await webrtc.setLocalDescription(offer);
    return webrtc;
  };
  const webrtc = await createWebRTCPeerConnection(stream, track);
  try {
    async function* readClipboard() {
      while (true) {
        try {
          // dom.events.testing.asyncClipboard
          const json = JSON.parse(await navigator.clipboard.readText());
          if (json.answers.length > len) {
            console.log(json.answers.length, len);
            for (; len < json.answers.length; len++) {
              sessions[sessions.length - 1].setRemoteDescription({
                type: 'answer',
                sdp: json.answers[len],
              });
            }
            await createWebRTCPeerConnection(stream, track);
          }
          yield await new Promise((resolve) => setTimeout(resolve, 1000));
        } catch (e) {
          console.error(e);
          throw e;
        }
      }
    }
    for await (const _ of readClipboard()) {}
  } catch (e) {
    throw e;
  }
})().catch(console.error);
```
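The SDP munging step in the `onicecandidate` handler above (appending `a=end-of-candidates` once gathering completes) can be isolated as a small pure function; the name `withEndOfCandidates` is my own:

```javascript
// Append an end-of-candidates line to an SDP string if it is not
// already present; mirrors the inline check in offer.js/answer.js.
function withEndOfCandidates(sdp) {
  return sdp.indexOf('a=end-of-candidates') === -1
    ? sdp + 'a=end-of-candidates\r\n'
    : sdp;
}

const sdp = 'v=0\r\na=candidate:1 1 udp 2122260223 192.0.2.1 54321 typ host\r\n';
console.log(withEndOfCandidates(sdp).endsWith('a=end-of-candidates\r\n')); // true
```

The idempotence matters because the handler can fire more than once; without the check you could append the line repeatedly.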
answer.html

```html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <style>
      body *:not(script) {
        display: block;
      }
    </style>
  </head>
  <body>
    <button id="capture">Capture system audio</button>
    <audio id="audio" autoplay controls muted></audio>
    <script src="answer.js"></script>
  </body>
</html>
```
answer.js

```javascript
const audio = document.getElementById('audio');
const capture = document.getElementById('capture');
['loadedmetadata', 'play', 'playing'].forEach((event) =>
  audio.addEventListener(event, console.log)
);
const webrtc = new RTCPeerConnection({ sdpSemantics: 'unified-plan' });
[
  'signalingstatechange',
  'iceconnectionstatechange',
  'icegatheringstatechange',
  'negotiationneeded',
].forEach((event) => webrtc.addEventListener(event, console.log));
webrtc.onicecandidate = async (event) => {
  if (!event.candidate) {
    let sdp = webrtc.localDescription.sdp;
    console.log('candidate:', sdp);
    if (sdp.indexOf('a=end-of-candidates') === -1) {
      sdp += 'a=end-of-candidates\r\n';
    }
    try {
      alert('Ready');
      capture.onclick = async () => {
        capture.onclick = null;
        const json = JSON.parse(await navigator.clipboard.readText());
        json.answers.push(sdp);
        console.log(json, await navigator.clipboard.writeText(JSON.stringify(json)));
        console.log(JSON.parse(await navigator.clipboard.readText()));
      };
    } catch (e) {
      console.error(e);
    }
  }
};
webrtc.ontrack = ({ transceiver, streams: [stream] }) => {
  console.log(transceiver);
  const {
    receiver: { track },
  } = transceiver;
  track.onmute = track.onunmute = (e) => console.log(e);
  audio.srcObject = stream;
};
onload = async (_) => {
  try {
    const text = await navigator.clipboard.readText();
    const json = JSON.parse(text);
    console.log(json.offers.length);
    await webrtc.setRemoteDescription({
      type: 'offer',
      sdp: json.offers[json.offers.length - 1],
    });
    const answer = await webrtc.createAnswer();
    await webrtc.setLocalDescription(answer);
  } catch (e) {
    console.error(e);
  }
};
```
Note, using the clipboard for signaling is not ideal, because any copy and paste during the process will result in unexpected values for the SDP and non-JSON input for `JSON.parse()` - unless the sender and receiver understand that copy/paste should be avoided during the stream. I just used the clipboard to test, based on this gist: https://gist.github.com/guest271314/04a539c00926e15905b86d05138c113c.
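The clipboard "signaling channel" above is just a JSON object with `offers` and `answers` arrays that both pages read, modify, and write back. A sketch of that round trip, with the clipboard replaced by a plain string so the fragility is visible (the `pushSdp` helper is my own illustration, not part of the example code):

```javascript
// Shared state used for signaling: both pages JSON.parse() it,
// push an SDP string, and serialize it back.
let clipboard = JSON.stringify({ offers: [], answers: [] });

function pushSdp(text, kind, sdp) {
  const config = JSON.parse(text); // throws if the clipboard was overwritten
  config[kind].push(sdp);
  return JSON.stringify(config);
}

clipboard = pushSdp(clipboard, 'offers', 'v=0\r\n');
clipboard = pushSdp(clipboard, 'answers', 'v=0\r\n');
console.log(JSON.parse(clipboard).offers.length); // 1

// Any stray copy during the session corrupts the channel:
try {
  pushSdp('some copied text', 'offers', 'v=0\r\n');
} catch (e) {
  console.log('signaling broken:', e instanceof SyntaxError); // true
}
```

This is exactly the failure mode described in the note: one unrelated copy replaces the JSON state, and the next `JSON.parse()` throws.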
@zzph Is this issue resolved?