Note: The following demo images do not show the collaborative mode. They will be updated soon.
Emscripten compiles C++ functions into WASM, which is then called by JS.
WASM modules run in a sandbox environment isolates from the rest of the system, hence cannot access anything outside their own memory space. In this case, JS and C++ cannot directly share the same memory space or allow direct memory manipulation. All "communication" is achieved through allocating some memory, then copying data from that memory space. This is a security feature that prevents malicious code from accessing sensitive resources or interfering with other parts of the application.
Emscripten is an open-source compiler toolchain that allows you to compile C and C++ code into WASM, so it can run effectively in web browser.
cd emsdk
source emsdk_env.sh
Run the following in the command line:
emcc image_processor.cpp \ -o image_processor.js \ -s MODULARIZE=1 \ -s 'EXPORT_NAME="Module"' \ -s EXPORTED_FUNCTIONS='["_monochrome_average", "_monochrome_luminosity", "_monochrome_lightness", "_monochrome_itu", "_gaussian_blur", "_edge_sobel", "_edge_laplacian_of_gaussian", "_data_to_layer", "_bucket_fill", "_merge_layers", "_quad_compression", "_malloc", "_free"]' \ -s EXPORTED_RUNTIME_METHODS='["ccall", "cwrap", "HEAPU8"]' \ -s ALLOW_MEMORY_GROWTH=1 \ -O2
Run local server:
peerjs --port 9000
Select "Go Live" on VS code.
It compiles the C++ file and outputs image_processor.wasm (the compiled WASM) and image_processor.js (the JS wrapper). _malloc and _free allows JS to allocate and free memory in WASM. HEAPU8 is used by JS to access raw WASM memory as a Uint8Array. Memory growth in WASM is allowed if needed (such as for large images).
extern "C" is a directive in C++ that tells the compiler to use C-style naming for functions so they can be called from other languages. C++ mangles function names (adds extra info like parameter types) to support function overloading. This makes the compiler names unreadable or inconsistent for other languages.
RGB are the colours red, green, and blue respectively, each in the range 0 - 255. Alpha is the opacity/transparency, also in the range 0 - 255.
Module() is a function generated by Emscripten when compiling C++ to WASM with MODULARIZE=1. It loads the WASM module asynchronously then assigns it to the wasmModule so you can call exported C++ functions.
Both ccall and cwrap are provided by Emscripten to call C/C++ functions from JS.
Key differences:
ccallcalls function immediately, whilecwrapreturns a callable JS function.- Use
ccallfor one-off calls, usecwrapfor repeated calls / better performance.
In order to optimize for workloads with a large number of layers, this function uses the following algorithm:
Process each layer from top to bottom, pixel by pixel. For each pixel, blend its RGBA value with the existing colour at that position.
An alternative proposed implementation that does NOT work as well:
- For each pixel location (x,y) in the image, calculate its final colour layer by layer, and from top to bottom.
- Move on to the next pixel early if the current colour has alpha = 255.
This method has the same time completity, however, this method has significantly worse actual runtime. This is likely because pixels in the same layer are stored consecutively in memory. Jumping from layer to layer causes significantly more cache misses, requiring the same layer to be retrieved multiple times (up to x * y times). This destroys the CPU cache efficiency.
Select image file to upload to browser.
Implement various types, including the average method, the luminosity method, the lightness method, ITU-R BT.709 recommendations for modern digital media (such as YouTube, HDTV, or FFmpeg).
- The average method is simple but does not reflect human visual perception as it treats all colours equally.
gray = (R + G + B) / 3 - The luminosity method is the common standard, gives best visual quality and realism. Slightly slower to calculate does to floating point math.
gray = 0.299 * R + 0.587 * G + 0.114 * B - The lightness method keeps the contrast between the brightest and darkest parts, but ignores mid-tone details.
gray = (max(R,G,B) + min(R, G, B)) / 2 - The ITU-R BT.709 reflects the modern expectations for grayscale conversion.
gray = 0.2126 * R + 0.7152 * G + 0.0722 * B
Gaussian blur uses the Gaussian function to soften an image by smoothing pixel values. It is used to reduce noise, reduce details, and to creata a smooth effect.
- The kernel size determines how many pixels are sampled around each pixel. A larger kernel can capture more of the Gaussian curve, producing a better approximation of the blur.
- The standard deviation (sigma) controls the strength of the blur. A larger sigma spreads out the weights more, causing a smoother and wider blur.
I considered making the kernal computation in gaussian_blur known at compile time, but this would require C++23 (since exp only became a const in C++23) - this is not an issue. The issue lies in the fact that the refactoring would involve using templates for the kernel size and sigma for kernal calculations, and I would have to export multiple gaussian_blur functions with fixed variants of kernel and sigma, as they will be called in JS. This introduces the issue of having to limit the sigma and kernel sizes passed by users to a few pre-defined options.
The Sobel method is a gradient-based edge detection method that computes gradient magnitudes in both x and y directions using 3x3 convolution kernels, good for emphasizing edges and reducing noise.
The Laplacian of Gaussian (LoG) method is a second derivative-based method that applies the Gaussian blur, then the Laplacian operator to detect edges at points of rapid intensity change.
Users are able to select a desired colour (RGBA, hex, or colour wheel), input an error threshold (between 0 and 1), and click the image to fill the area with the input threshold.
The error bound e between 0 and 1 indicates how "different" a pixel to be filled can be compared to the colour of the pixel selected. e = 0 indicates that the two pixels have identical colour, while e = 1 indicates that they are somewhat different. Please see implementation of pixels_within_threshold for details on error bound calculation.
Give users the option to save the image as a PNG or JPEG.
Resize an image to a smaller dimension, and applies a quad tree compression algorithm. This algorithm recursively divides an image into four rectangular regions and simplifies each region based on how similar the pixel colours are.
- Start with the entire image region.
- Check if thre region is uniform in colour. If yes, replace all pixels in that region with their average colour. If no, split the region into four quadrants and repeat the process on each.
- Continue recursively until all regions are uniform or a maximum depth is reached. This method reduces image detail in visually consisten areas, and enables faster processing by skipping detailed work in simple regions.
Times the number of milliseconds taken by each operation.
Work in progress.
The current implementation supports a serverless collaboration mode, with peer to peer (P2P) communication. There exists a single leader per shared instance, coordinating the peers and ordering the events.
There exists two major types of operations that require communication:
- Image operations. When one peer uploads an image, the image data must be shared with all other peers.
- Filter operations. When one peer performs an operation to an image, only the name of the operation, the layer on which it is applied, and other parameter information (such as kernel or sigma values) are shared. This is in contrast to applying the operation locally, then sharing the resulting image information with all peers, which involves significantly more transfer of data, and is hence more prone to delays.
The diagrams below illustrate a new peer join flow, the leader election process, request timeouts, and how operations are applied and communicated.
To reduce errors produced by jitter in the network, performance benchmarking is performed in the single-user (non-collaborative) scenario.
The time is calculated as follow:
- Begin timer.
- The JS function is called. (The function then allocates shared memory, copies the necessary data into the memory, and calls into C++. The C++ function is applied, then the flow returns to JS.)
- The JS function is exited, then the timer is stopped.
- Round the runtime to the nearest millisecond, and display it.
The exact code can be found in the timeOperation function in script.js.
This is the image for which performance benchmarking is performed on:
The image is 730x946 pixels.
The following is data for the monochrome methods, each click performed back to back (without re-uploading the image in between). Numbers are shown in milliseconds.
| Click Number | Average Method | Luminosity Method | Lightness Method | ITU-R Method |
|---|---|---|---|---|
| 1 | 17 | 20 | 19 | 20 |
| 2 | 16 | 18 | 17 | 18 |
| 3 | 17 | 18 | 17 | 15 |
| 4 | 16 | 18 | 17 | 18 |
| 5 | 16 | 18 | 17 | 18 |
| 6 | 17 | 19 | 15 | 18 |
| 7 | 16 | 19 | 15 | 16 |
| 8 | 17 | 19 | 15 | 19 |
| 9 | 9 | 18 | 15 | 18 |
| 10 | 16 | 18 | 15 | 18 |
| ------- | ------- | ------- | ------- | ------- |
| Average | 15.7 | 18.5 | 16.2 | 17.8 |
The following is data for the gaussian blur, with sigma = 2 and kernel = 5, each click performed back to back (without re-uploading the image in between). Numbers are shown in milliseconds.
| Click Number | Gaussian Blur |
|---|---|
| 1 | 76 |
| 2 | 44 |
| 3 | 47 |
| 4 | 48 |
| 5 | 48 |
| 6 | 47 |
| 7 | 47 |
| 8 | 47 |
| 9 | 47 |
| 10 | 48 |
| ------- | ------- |
| Average | 49.9 |
The following is data for the edge detection methods. The LoG Method is applied with sigma = 2, kernel = 5. Each click performed back to back (without re-uploading the image in between). Numbers are shown in milliseconds.
| Click Number | Sobel Method | Laplacian of Gaussian Method |
|---|---|---|
| 1 | 27 | 58 |
| 2 | 21 | 52 |
| 3 | 21 | 52 |
| 4 | 22 | 54 |
| 5 | 20 | 53 |
| 6 | 22 | 53 |
| 7 | 21 | 52 |
| 8 | 21 | 53 |
| 9 | 21 | 53 |
| 10 | 21 | 52 |
| ------- | ------- | ------- |
| Average | 21.7 | 53.2 |
The bucket fill tool and resizing cannot be tested in the same manner, as the results would not hold significance. The bucket tool algorithm is a graph search algorithm, hence is largely dependent on the area for which it covers. Resizing changes the size of the original image, hence subsequent clicks would be applied on images of different sizes.
Save original image information. If the desired size is smaller than the original image size, apply compression algorithm on the original image with the desired height and width. If the desired size is larger than the original image size, implement an algorithm to extrapolate.
For each unmerged layer, reverting it converts it to its original uploaded image. For merged layers, reverting it converts it to all of its original uploaded image, shown in multiple layers.
Select two or more layers, and click the merge button to merge them into a single layer. The new layer will have the layer id of the lowest layer.
Drag and drop layers to reorder them, with each layer retaining its layer id information.
Give each layer a user-defined name, or rename layers.
On click of the information symbol beside each tool, users will be shown a description of what the tool does.
Images can be moved around on the canvas area.
- Use SIMD instructions for WASM
- Open CV library
- WebGL
- Undo/redo features. Involves tracking image state.
In collab mode:
- Testing, specifically for race conditions and concurrency issues from remote instructions and data
- Image load time and network calls take too long. Compress data before sending it over the network, and decompress it upon receiving.
- When a new user joins a P2P instance, have the leader send all of its current image layers to the new user.
Links to consider
Upscaling image by converting from raster to vector https://en.wikipedia.org/wiki/Image_tracing
To read
https://www.figma.com/blog/webassembly-cut-figmas-load-time-by-3x/ https://www.figma.com/blog/speeding-up-build-times/ https://silvia-odwyer.github.io/photon/demo.html
-
Browser sometimes refresh itself, losing all data. Likely cause: functions not fully loaded before uploading image, causing a refresh. Needs more investigation.
-
If the image is too large, WASM will try to reallocate memory. Then
putImageData()fails because it’s trying to draw from WASM memory that was detached after a buffer reallocation. The fix is to recreateImageDatafrom the current WASM buffer before each draw, though this will require additional memory. Another option is to only allow "small" images.












