Virtual reality (VR) is an immersive technology that allows users to experience and interact with simulated environments. Meta Quest headsets are one of the most popular options for experiencing VR. Meta has created an application for these headsets called “Meta Horizon Workrooms” that provides users with an immersive environment where they can get things done. This app allows its users to create and access virtual desktops, which are virtual screens connected to their desktops or PCs. The aim of this project is to provide a Linux backend for the Virtual Desktop tool that can work with the Meta Quest 2 headset. This will allow Linux users to enjoy the benefits of VR productivity and gaming on their computers. This project targets the following environment:
- Meta Quest 2
- sway, a tiling Wayland compositor compatible with i3, running under Regolith Linux (an i3 flavor of Ubuntu)
- pipewire for audio capture
Note: The work described in this post was done as part of my Google Summer of Code (GSoC) 2023 project under the organisation CCExtractor.
The Virtual Desktop tool consists of three main components: screen capture, audio capture, and remote desktop. Screen capture is responsible for capturing the frames from the user’s desktop or PC and sending them to the headset. Audio capture is responsible for capturing the audio from the user’s microphone and speakers and sending them to the headset. Remote desktop is responsible for receiving the input events from the headset and sending them to the desktop or PC.
Even though the initial target environment limits the usage to wlroots-based compositors, it would be wise to eventually add support for more display backends. Similarly, we might want to support pipewire alternatives when it comes to audio servers. To ensure future compatibility, I've created the traits `AudioCaptureBackend` and `FrameCaptureBackend`. The `virtual_desktop::Manager` will then inject the correct dependencies based on the session configuration. When I have more information about the interface used by Horizon Workrooms, I'll also add an additional trait for remote desktop, in order to future-proof the application against potential breaking changes and keep open the possibility of supporting other virtual desktop applications like Immersed (which currently lacks Wayland support).
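As a rough illustration, the traits and the manager could be shaped like the sketch below. The method names, the `CaptureType` enum (discussed further down), and the payload types are assumptions made for the example, not the project's actual API:

```rust
use std::error::Error;
use std::sync::mpsc::Sender;

/// Illustrative payloads passed from the capture threads to the server.
pub struct Frame(pub Vec<u8>);
pub struct AudioChunk(pub Vec<u8>);

/// Whether to grab a single screenshot or a continuous stream of frames.
pub enum CaptureType {
    Frame,
    Stream,
}

/// Backend-agnostic audio capture (PulseAudio, pipewire, ...).
pub trait AudioCaptureBackend: Send {
    /// Start capturing and push audio chunks into `sink`.
    fn capture(&mut self, sink: Sender<AudioChunk>) -> Result<(), Box<dyn Error>>;
}

/// Backend-agnostic frame capture (wlroots screencopy, X11, ...).
pub trait FrameCaptureBackend: Send {
    /// Capture from a single output device, once or as a stream,
    /// pushing the resulting frames into `sink`.
    fn capture(
        &mut self,
        output_name: &str,
        kind: CaptureType,
        sink: Sender<Frame>,
    ) -> Result<(), Box<dyn Error>>;
}

/// The manager owns trait objects, so the concrete backends can be
/// injected at runtime based on the session configuration.
pub struct Manager {
    pub audio: Box<dyn AudioCaptureBackend>,
    pub frames: Box<dyn FrameCaptureBackend>,
}
```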
`AudioCaptureBackend` and `FrameCaptureBackend` run on their own separate threads and send the captured data to the Virtual Desktop Server using channels. The synchronization of all three components of the virtual desktop backend will be done by the main thread running the `virtual_desktop::Manager`.
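Building on the trait sketch above, the threading model could look roughly like this (again just a sketch under the same assumptions; the real code may be organised differently):

```rust
use std::sync::mpsc;
use std::thread;

fn run(mut audio: Box<dyn AudioCaptureBackend>, mut frames: Box<dyn FrameCaptureBackend>) {
    let (frame_tx, frame_rx) = mpsc::channel::<Frame>();
    let (audio_tx, audio_rx) = mpsc::channel::<AudioChunk>();

    // Each capture backend runs on its own thread and owns the sending
    // half of a channel back to the main thread. "DP-1" is a placeholder
    // output name.
    thread::spawn(move || {
        frames
            .capture("DP-1", CaptureType::Stream, frame_tx)
            .expect("frame capture failed");
    });
    thread::spawn(move || {
        audio.capture(audio_tx).expect("audio capture failed");
    });

    // The main thread (running the Manager) multiplexes the captured data
    // and forwards it to the virtual desktop server. A real implementation
    // would block on the channels instead of this simplified poll loop.
    loop {
        if let Ok(_frame) = frame_rx.try_recv() {
            // forward the frame to the server ...
        }
        if let Ok(_chunk) = audio_rx.try_recv() {
            // forward the audio chunk to the server ...
        }
    }
}
```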
The implementation of the Virtual Desktop tool is done in Rust, a systems programming language that offers high performance, memory safety, and fearless concurrency.
The `FrameCaptureBackend` provides a `capture` method, which is used to capture from a single output device. The capture can be of type `CaptureType::Frame` for screenshots or `CaptureType::Stream` for screen recording.
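Under the illustrative interface sketched earlier, choosing between the two modes would look something like this (the output name "DP-1" is a placeholder):

```rust
use std::error::Error;
use std::sync::mpsc;

fn example(capturer: &mut dyn FrameCaptureBackend) -> Result<(), Box<dyn Error>> {
    let (tx, _rx) = mpsc::channel();
    // One-off screenshot of a single output:
    capturer.capture("DP-1", CaptureType::Frame, tx.clone())?;
    // Continuous stream of frames for screen recording:
    capturer.capture("DP-1", CaptureType::Stream, tx)?;
    Ok(())
}
```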
For Wayland, I use the wayland-client and wayland-protocols crates to interact with the Wayland compositor and its protocols. The `WlrFrameCapturer` implements `FrameCaptureBackend` for wlroots-based compositors (like sway), which is our target environment. Specifically, I use the `zwlr_screencopy_manager_v1` and `zwlr_screencopy_frame_v1` interfaces of the wlr-screencopy protocol to request and receive frames from the output devices. The frames are written into a `wl_buffer`, which represents a shared memory object. `zwlr_screencopy_frame_v1` sends events notifying the application about the `DRM_FOURCC` format that is used, as well as other metadata required to properly encode the frames, like the `width`, `height`, and `stride`. To test screen capture, the `WlrFrameCapturer` currently converts the frames to RGBA and writes them to a PNG file.
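As an illustration of that last step, converting a stride-padded XRGB8888 buffer to tightly packed RGBA and saving it as a PNG might look like the snippet below. This is a sketch using the `image` crate; the actual conversion in the capturer may differ, and it has to branch on the format actually reported by the compositor:

```rust
use image::RgbaImage;

/// Convert one stride-padded XRGB8888 (little-endian) frame to RGBA and save it.
/// `data` is the raw contents of the shared-memory wl_buffer.
fn save_frame_as_png(
    data: &[u8],
    width: u32,
    height: u32,
    stride: u32,
    path: &str,
) -> Result<(), Box<dyn std::error::Error>> {
    let mut rgba = Vec::with_capacity((width * height * 4) as usize);
    for y in 0..height {
        let row = &data[(y * stride) as usize..];
        for x in 0..width {
            let px = &row[(x * 4) as usize..(x * 4 + 4) as usize];
            // XRGB8888 is stored little-endian as B, G, R, X in memory.
            rgba.extend_from_slice(&[px[2], px[1], px[0], 0xFF]);
        }
    }
    let img = RgbaImage::from_raw(width, height, rgba)
        .ok_or("buffer size does not match the frame dimensions")?;
    img.save(path)?;
    Ok(())
}
```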
For audio capture, I use the `simple-pulse-desktop-capture` crate, which is a wrapper around the PulseAudio API. This crate allows me to create a stream for recording audio from any source on the system. My initial plan was to use a pipewire-based implementation, but I couldn't find any proper documentation for it. Moreover, it seems like OBS (the benchmark for screen recording and audio capture) also uses PulseAudio. The audio data received is in the form of PCM frames. I've tried writing this data to a WAV file, but the generated file seems to be in an invalid format.
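One way to debug this is to write the PCM samples out with an explicit WAV header, for example via the `hound` crate. The snippet below is a sketch that assumes signed 16-bit little-endian samples at 48 kHz stereo, which are typical PulseAudio defaults but not necessarily what the capture stream actually delivers:

```rust
use hound::{SampleFormat, WavSpec, WavWriter};

/// Dump interleaved S16LE PCM samples to a WAV file for inspection.
fn write_wav(samples: &[i16], path: &str) -> Result<(), hound::Error> {
    let spec = WavSpec {
        channels: 2,         // stereo
        sample_rate: 48_000, // Hz
        bits_per_sample: 16,
        sample_format: SampleFormat::Int,
    };
    let mut writer = WavWriter::create(path, spec)?;
    for &s in samples {
        writer.write_sample(s)?;
    }
    // finalize() patches the header with the final data length; skipping it
    // is one common cause of "invalid" WAV files.
    writer.finalize()?;
    Ok(())
}
```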
This part requires a lot more information about the Meta Horizon Workrooms client. Most of the pieces are already in place on the screen capture side, so progress should be quicker once I learn more about the client.
The project is still a work in progress and nowhere near completion. However, I have achieved some of the main goals and milestones of the project, such as:
- Implemented a screen capture backend for sway
- Partially implemented audio capture
- Gained an understanding of how remote desktop clients work