Cytos

Optimized ogar project with WebGL rendering and native C++ parallelization.


CytosBanner

Cytos: The most performant open-source agar.io server and client, combined in one project.

Main UI Preview

image

  • Built with Electron and native C++ addons.
  • Highly parallelized "server".
  • Accelerated WebGL rendering.

Benchmarks

  • 12800 x 12800 map
  • 420 max cells per player
  • 1000 bots that randomly split and feed, but only while the load is below 75%, so the engine is not overloaded
  • 10000 pellets (not rendered if viewport is greater than a threshold but still handled)

Benchmark 1 - Desktop

Specs: Ryzen 7 5800X & 7900 XTX, using 8 threads (bound to cores)
Result: Average 50% load with ~50,000 cells being rendered at 120 FPS
Benchmark.mp4

Benchmark 2 - Laptop

Specs: i5-8250U (Surface Pro 7), using 8 threads (bound to threads)
Result: Still very much playable, even though load jumps between 60% and 200%, with ~15,000 cells being rendered at 40-60 FPS
Benchmark.mp4

Architecture

There is NO networking at all in this project, but the concept of a server and client still exists:

  • A Web Worker acts as the server. It loads cytos-addon.node, which integrates a native libuv loop into the browser. The addon runs physics solving and viewport querying at 25 TPS, then serializes the result into a buffer that is sent to the main page with self.postMessage. The addon also contains a thread pool to accelerate calculations.
  • The main page acts as the client. It loads another native module, gfx-addon.node, which parses the buffer received from the worker/server and stores the state; during each requestAnimationFrame callback, the module is called to generate the WebGL buffer.
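
The server-to-client handoff described above can be pictured as packing cell state into one flat byte buffer on the worker side and unpacking it on the page side. The following is only a minimal sketch of that idea: the CellRecord layout and the serializeCells/parseCells names are assumptions made for illustration, not the actual Cytos wire format or addon API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

// Hypothetical per-cell record; the real Cytos buffer layout may differ.
struct CellRecord {
    std::uint32_t id;
    float x, y, radius;
};

// "Server" side: write a 32-bit count followed by the raw records.
std::vector<std::uint8_t> serializeCells(const std::vector<CellRecord>& cells) {
    assert(cells.size() <= std::numeric_limits<std::uint32_t>::max());
    const std::uint32_t count = static_cast<std::uint32_t>(cells.size());
    std::vector<std::uint8_t> buffer(sizeof(count) + cells.size() * sizeof(CellRecord));
    std::memcpy(buffer.data(), &count, sizeof(count));
    if (count > 0) {
        std::memcpy(buffer.data() + sizeof(count), cells.data(),
                    cells.size() * sizeof(CellRecord));
    }
    return buffer;
}

// "Client" side: check that the length matches the count, then copy records out.
bool parseCells(const std::vector<std::uint8_t>& buffer, std::vector<CellRecord>& out) {
    std::uint32_t count = 0;
    if (buffer.size() < sizeof(count)) return false;
    std::memcpy(&count, buffer.data(), sizeof(count));
    const std::size_t expected =
        sizeof(count) + static_cast<std::size_t>(count) * sizeof(CellRecord);
    if (buffer.size() != expected) return false;
    out.resize(count);
    if (count > 0) {
        std::memcpy(out.data(), buffer.data() + sizeof(count),
                    static_cast<std::size_t>(count) * sizeof(CellRecord));
    }
    return true;
}
```

Because both sides run in the same process on the same machine, the sketch copies raw structs and ignores endianness and versioning, which a networked protocol would have to handle.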

Optimizations & Parallelism

Many techniques are borrowed from my older project OgarX, but there are some in-depth native optimizations that were not possible in pure JS:

  • Thread pool with hardware binding.
  • Memory pooling, done in both server and client to minimize memory allocation and increase cache coherence.
  • Packed struct and aligned memory: the Cell struct is exactly 64 bytes, and its address is aligned to 64 bytes. This enables high-performance multithreading with much less false sharing.
  • Implemented LooseQuadTree from this paper.
  • Multi-stage collision/eat solving with atomic exchange operations instead of mutexes to avoid locking overhead.
  • Ejected cells and pellets are now stored in separate 2D grids (also thread-safe), enabling more efficient querying due to mono-size.
  • Game mode config is fully templated so that the compiler can pick up every possible optimization route. This means the entire engine is header-only and build times are longer.
  • Almost every part of the engine is now parallelized, e.g. cell updating, player splits & ejects, and every physics solving stage.
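
Two of the items above, the 64-byte aligned Cell struct and eat solving with atomic exchange, can be sketched together. This is a hedged illustration rather than the engine's real code: the field names, the `eaten` flag, and the tryEat helper are hypothetical, and a 64-byte cache line is assumed.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical layout: alignas(64) pads and aligns each cell to one assumed
// 64-byte cache line, so threads writing to different cells do not share a line.
struct alignas(64) Cell {
    float x = 0.0f, y = 0.0f, radius = 0.0f;
    std::uint32_t owner = 0;
    std::atomic<bool> eaten{false};  // flipped once by whichever eater wins
};
static_assert(sizeof(Cell) == 64, "Cell should fill exactly one cache line");
static_assert(alignof(Cell) == 64, "Cell should start on a cache-line boundary");

// Lock-free claim: exchange() returns the previous value, so only the first
// caller sees `false` and wins. Later callers see `true` and back off.
inline bool tryEat(Cell& prey) {
    return !prey.eaten.exchange(true, std::memory_order_acq_rel);
}
```

If several worker threads find the same prey in one solving stage, each can call tryEat(prey); exactly one call returns true, so the mass is credited once without taking a mutex.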

Together, these changes bring a 3-4x single-threaded speedup and a ~20x speedup with 8 threads.
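
To illustrate the templated game mode config from the list above, here is a minimal sketch in which each mode is a type carrying compile-time constants and the engine is a template over that type. The mode names, the constants, and the Engine interface are all assumptions for illustration; only the 420-cell limit comes from the benchmark settings. It requires C++17 for `if constexpr`.

```cpp
#include <cassert>

// Hypothetical game modes: every setting is a compile-time constant.
struct ClassicMode {
    static constexpr int   kMaxCells  = 16;
    static constexpr float kDecayRate = 0.25f;  // illustrative value only
};
struct BenchmarkMode {
    static constexpr int   kMaxCells  = 420;    // matches the benchmark setting
    static constexpr float kDecayRate = 0.0f;   // decay disabled
};

// The engine is parameterized on the mode, so the compiler sees constants
// and can drop branches that a given mode never takes.
template <typename Mode>
struct Engine {
    // Whether a player with `currentCells` cells may split again.
    static constexpr bool canSplit(int currentCells) {
        return currentCells < Mode::kMaxCells;
    }

    // Applies one tick of mass decay; compiled out when the rate is zero.
    static float decay(float mass) {
        if constexpr (Mode::kDecayRate > 0.0f) {
            return mass * (1.0f - Mode::kDecayRate);
        } else {
            return mass;
        }
    }
};
```

The trade-off is the one noted in the list: because every mode instantiates its own copy of the engine code, everything lives in headers and compile times grow.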

The installer is currently only available for Windows, but building for other platforms should be possible since Electron and cmake-js are cross-platform (see the Development section below). Windows will show a warning popup because the installer is not a signed binary.

Development

Requirements:

  • cmake > 3.10
  • MSVC / gcc / clang
  • npm
  • cmake-js installed globally with npm i -g cmake-js (or could be added as a dependency later...)
git clone https://github.com/Yuu6883/Cytos.git
cd Cytos
./addon.sh
cd source-js
npm i
npm run build
cp ../build/Release/*.node build/
npm start

Additional Features

  • Save & Restore: hit Ctrl+S to save the server state into a buffer stored in IndexedDB, and hit Alt+X to restore from it (per game mode).
  • Extensive timing & metrics: hit F1 to toggle the profiling panel.
  • Hit F4 for the ultimate easter egg:

image