The Forge is a cross-platform rendering framework supporting
- PC
- Windows 10
- with DirectX 12 / Vulkan 1.1
- with DirectX Ray Tracing API
- DirectX 11 Fallback Layer for Windows 7 support (not extensively tested)
- Linux Ubuntu 18.04 LTS with Vulkan 1.1 and RTX Ray Tracing API
- Android Pie with Vulkan 1.1
- macOS / iOS / iPad OS with Metal 2.2
- XBOX One / XBOX One X (only available for accredited developers on request)
- PS4 / PS4 Pro (only available for accredited developers on request)
- Switch (in development) (only available for accredited developers on request)
- Google Stadia (in development) (only available for accredited developers on request)
In particular, the graphics layer of The Forge supports the following across all platforms:
- Descriptor management. A description is on this Wikipage
- Multi-threaded and asynchronous resource loading
- Shader reflection
- Multi-threaded command buffer generation
The Forge can be used to provide the rendering layer for custom next-gen game engines. It is also meant to provide building blocks for writing your own game engine: like a "lego" set, it lets you use individual pieces to build a game engine quickly. The "lego" high-level features currently supported on all platforms are:
- Asynchronous Resource loading with a resource loader task system as shown in 10_PixelProjectedReflections
- Lua Scripting System - currently used in 06_Playground to load models and textures and animate the camera
- Animation System based on Ozz Animation System
- Consistent Math Library based on an extended version of Vectormath with NEON intrinsics for mobile platforms
- Extended version of EASTL
- For loading art assets we have a modified and integrated version of Assimp
- Consistent Memory Management:
- on GPU following Vulkan Memory Allocator
- on CPU Fluid Studios Memory Manager
- Input system with Gestures for Touch devices based on an extended version of gainput
- Fast Entity Component System based on our internally developed ECS
- Cross-platform FileSystem C API, supporting disk-based files, memory streams, and files in zip archives
- UI system based on imGui with a dedicated unit test extended for touch input devices
- Audio based on integrating SoLoud
- Shader Translator using a superset of HLSL as the shader language. There is a Wiki page on how to use the Shader Translator
- Various implementations of high-end Graphics Effects as shown in the unit tests below
Please find a link and credits for all open-source packages used at the end of this readme.
Join the Discord channel at https://discord.gg/hJS54bz
Follow us on Twitter at https://twitter.com/TheForge_FX?lang=en
The Forge Interactive Inc. is a Khronos member
The Forge now supports Sparse Virtual Textures on Windows and Linux with DirectX 12 / Vulkan. A sparse texture (also known as a "virtual texture", "tiled texture", or "mega-texture") is a technique for working with very large textures (such as 16k x 16k or more) without keeping all of their data in GPU memory. It breaks the original texture down into small square or rectangular tiles and loads only the tiles that are visible.
The unit test 18_Virtual_Texture uses seven sparse textures:
- Mercury: 8192 x 4096
- Venus: 8192 x 4096
- Earth: 8192 x 4096
- Moon: 16384 x 8192
- Mars: 8192 x 4096
- Jupiter: 4096 x 2048
- Saturn: 4096 x 4096
The unit test shows a solar system in which you can approach planets that have Sparse Virtual Textures attached; the resolution of the texture increases as you get closer.
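To give a feeling for the tiling described above, here is a minimal sketch that counts how many tiles a texture breaks down into. The 128 x 128 tile size and the names `computeTileGrid` and `residentFraction` are assumptions made for this illustration; they are not taken from The Forge, and real tile dimensions depend on the GPU, the API, and the texture format.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tile size, chosen only for this example.
static const uint32_t kTileWidth  = 128;
static const uint32_t kTileHeight = 128;

struct TileGrid
{
    uint32_t tilesX;
    uint32_t tilesY;
};

// Number of tiles needed to cover one mip level, rounding up at the edges.
inline TileGrid computeTileGrid(uint32_t width, uint32_t height)
{
    assert(width > 0 && height > 0);
    TileGrid grid;
    grid.tilesX = (width + kTileWidth - 1) / kTileWidth;
    grid.tilesY = (height + kTileHeight - 1) / kTileHeight;
    return grid;
}

// Fraction of the level that is resident when only `visibleTiles` are loaded.
inline double residentFraction(const TileGrid& grid, uint32_t visibleTiles)
{
    const uint32_t total = grid.tilesX * grid.tilesY;
    assert(visibleTiles <= total);
    return static_cast<double>(visibleTiles) / static_cast<double>(total);
}
```

Under the assumed tile size, the 16384 x 8192 Moon texture splits into 128 x 64 = 8192 tiles, so keeping 2048 visible tiles resident would cover a quarter of the top mip level.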
Linux 1080p NVIDIA RTX 2060 Vulkan Driver version 435
Windows 10 1080p AMD RX550 DirectX 12 Driver version: Adrenalin software 19.10.1
Windows 10 1080p NVIDIA 1080 Vulkan Driver version: 418.81
Ephemeris 2 - the game Stormland from Insomniac has been released. This game uses a custom version of Ephemeris 2. We worked on this project for more than six months.
Head over to Custom Middleware to check out the source code.
- The new 16_Raytracing unit test shows a simple cross-platform path tracer. On iOS this path tracer requires A11 or higher. It is meant to be used in tools in the future and doesn't run in real-time. To support the new path tracer, the Metal raytracing backend has been overhauled to use a sort-and-dispatch based approach, enabling efficient support for multiple hit groups and miss shaders. The most significant limitation for raytracing on Metal is that only tail recursion is supported. This can be worked around by using larger per-ray payloads and splitting shaders into sub-shaders after each TraceRay call; see the Metal shaders used for 16_Raytracing for an example of how this can be done.
macOS 1920x1080 AMD Pro Vega 64
iOS iPhone X 812x375
Windows 10 1080p NVIDIA RTX 2080 with DXR Driver version 441.12
Windows 10 1080p NVIDIA RTX 2080 with RTX Driver version 441.12
Linux 1080p NVIDIA RTX 2060 with RTX Driver version 435
- File System: Fixed an issue wherein compiled shader binaries weren’t being saved to the RD_SHADER_BINARIES resource directory
- GitHub issues fixed:
- #150 - [Vulkan] Failed to extend descriptor pool
- #151 - [Vulkan] rootcbv of detection is case sensitive
- #152 - [Vulkan] updateDescriptorSet is different from the DirectX12
The Forge Interactive Inc., the company behind The Forge, became a Khronos Associate member.
- Ephemeris 2
- New features
- Add Earth radius: controls the radius of the cloud layer with a scale factor. The cloud field becomes flatter and the user can see further along the horizon as the radius increases
- Add noise flow: controls the direction and intensity of the clouds' noise flow
- Add rotation: rotates clouds around a given pivot position
- Add the second layer: a second cloud layer can be generated, which acts independently
- Add FXAA
- Improvements
- Ray-marching: hard-edge artifacts are now significantly reduced
- Silver-lining: improved its quality
- God ray: improved its quality
- Performance: performance increased by up to 25%
Click on the following image to see a video:
Head over to Custom Middleware to check out the source code.
- macOS / iPad / iOS: ICB support for Metal renderer (draw; draw indexed; pipeline state switch with ICB; ICB optimization with BlitEncoder). The Visibility Buffer example now uses ICB features on macOS
- reduced memory consumption for argument buffers in Metal
- fixes for Metal implementation of descriptor set
- minor fixes and optimizations in Metal renderer
- Due to bugs in the run-time for argument buffers we still can't run unit tests 04, 06, and 10
The Visibility Buffer example now runs faster on macOS
- New cross-platform FileSystem C API, supporting disk-based files, memory streams, and files in zip archives. The API can be viewed in IFileSystem.h, and all of the example code has been updated to use the new API.
- The API is based around `Path`s, where each `Path` represents an absolute, canonical path string on a particular file system. You can query information about the files at `Path`s, open files as `FileStream`s, and copy files between different `Path`s.
- The concept of `FileSystemRoot`s has been replaced by `ResourceDirectory`s. `ResourceDirectory`s are predefined directories where resources are expected to exist, and there are convenience functions to open files in resource directories. If your resources don’t exist within the default directory for a particular resource type, you can call `fsSetPathForResourceDirectory` to relocate the resource directory; see the unit tests for sample code on how to do this.
- There's a new 12_FileSystem unit test that demonstrates how to read files from zip archives.
- Vulkan: Adaptive Order Independent Transparency with Raster Order Views is now supported when VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FRAGMENT_SHADER_INTERLOCK_FEATURES_EXT is supported.
See the release notes from previous releases in the Release section.
- Windows 10
- Drivers
- AMD / NVIDIA / Intel - latest drivers
- Visual Studio 2017 with Windows SDK / DirectX version 17763.132 (you need to get it via the Visual Studio Installer) https://developer.microsoft.com/en-us/windows/downloads/sdk-archive
- The Forge currently supports Vulkan SDK 1.1.82.0 as the minimum version and 1.1.114 as the maximum version
- The Forge is currently tested on
- AMD 5x, VEGA GPUs (various)
- NVIDIA GeForce 9x, 10x, 20x GPUs (various)
- Intel Skull Canyon
- macOS 10.15 beta 8 (19A558d)
- Xcode 11.0 (11A419c)
- The Forge is currently tested on the following macOS devices:
- iMac with AMD RADEON 560 (Part No. MNDY2xx/A)
- iMac with AMD RADEON 580 (Part No. MNED2xx/A)
- MacBook Pro 13 inch (MacBookPro13,2)
- MacBook Pro 13 inch (MacBookPro14,2)
At the moment we do not have access to an iMac Pro or Mac Pro. We can test those either with Team Viewer access or by getting them into the office and integrating them into our build system. We will not test any Hackintosh configuration.
- iOS 13.1 beta 3 (17A5837a)
- Xcode: see macOS
To run the unit tests, The Forge requires an iOS device with an A9 or higher CPU (see GPU Processors or see iOS_Family in this table iOS_GPUFamily3_v3). This is required to support the hardware tessellation unit test and the ExecuteIndirect unit test (which requires indirect buffer support). The Visibility Buffer doesn't run on current iOS devices because the texture argument table on those devices is limited to 31 entries (see the Metal Feature Set Table and look for the entry "Maximum number of entries in the texture argument table, per graphics or compute function"), while on macOS the limit is 128, which we need for the bindless texture array.
We are currently testing on
- iPhone 7 (Model A1778)
- iPhone Xs Max (Model MT5D2LL/A)
- iPadOS 13.1 beta 3 (17A5837a)
- Xcode: see macOS
We are currently testing on:
- iPad (Model A1893)
- Ubuntu 18.04 LTS Kernel Version: 4.15.0-20-generic
- GPU Drivers:
- AMD GPUs: we are testing on the Mesa RADV driver
- NVIDIA GPUs: we are testing with the NVIDIA driver
- Workspace file is provided for codelite 12.0.6
- Vulkan SDK Version 1.1.101: download the native Ubuntu Linux package for all the elements of the Vulkan SDK from LunarG Vulkan SDK Packages for Ubuntu 16.04 and 18.04
- The Forge is currently tested on Ubuntu with the following GPUs:
- AMD RADEON RX 480
- AMD RADEON VEGA 56
- NVIDIA GeForce 2070 RTX
- Android Phone with Android Pie (9.x) for Vulkan 1.1 support
- Visual Studio 2017 with support for Android API level 28
At the moment, the Android run-time does not support the following unit tests due to what we consider driver bugs:
- 04_ExecuteIndirect
- 07_Tesselation
- 08_Procedural
- 09a_HybridRayTracing
- 10_PixelProjectedReflections
- 12_RendererRuntimeSwitch
- 14_WaveIntrinsics
- 15_Transparency
- 16_RayTracing
- 16a_SphereTracing
- Visibility Buffer
- We are currently testing on
- Samsung S10 Galaxy (Qualcomm Adreno 640 Graphics Card (Vulkan 1.1.87)) with Android 9.0. Please note this is the version with the Qualcomm based chipset.
- Essential Phone with Android 9.0 - Build PPR1.181005.034
- For PC Windows run PRE_BUILD.bat. It will download and unzip the art assets and install the shader builder extension for Visual Studio 2017.
- For Linux and Mac run PRE_BUILD.command. If it is the first time you are checking out The Forge, make sure PRE_BUILD.command has the executable flag set by running the following command: `chmod +x PRE_BUILD.command`
It will only download and unzip the required art assets (no plugins/extensions are installed).
There are the following unit tests in The Forge:
This unit test just shows a simple solar system. It is our "3D game Hello World" setup for cross-platform rendering.
This unit test shows a Julia 4D fractal running in a compute shader. In the future this test will use several compute queues at once.
This unit test shows how to generate a large number of command buffers on all platforms supported by The Forge. This unit test is based on a demo by Intel called Stardust.
This unit test shows the difference in speed between Instanced Rendering, using ExecuteIndirect with CPU update of the indirect argument buffers and using ExecuteIndirect with GPU update of the indirect argument buffers. This unit test is based on the Asteroids example by Intel.
Using ExecuteIndirect with GPU updates for the indirect argument buffers
Using ExecuteIndirect with CPU updates for the indirect argument buffers
This unit test shows the current state of our font rendering library that is based on several open-source libraries.
This unit test shows a range of game related materials:
Hair: Many years ago in 2012 / 2013, we helped AMD and Crystal Dynamics with the development of TressFX for Tomb Raider. We also wrote an article about the implementation in GPU Pro 5 and gave a few joint presentations at conferences like FMX. At the end of last year we revisited TressFX. We took the current code in the GitHub repository, changed it a bit and ported it to The Forge. It now runs on PC with DirectX 12 / Vulkan, macOS and iOS with Metal 2 and on the XBOX One. We also created a few new hair assets so that we can showcase it. Here is a screenshot of our programmer art:
Metal:
Wood:
This unit test showcases the rendering of grass with the help of hardware tessellation.
A cross-platform glTF model viewer that optimizes the vertex and index layout for the underlying platform and picks the right texture format for the underlying platform. We integrated Arseny Kapoulkine's (@zeuxcg) excellent meshoptimizer and use the same PBR as used in the Material Playground unit test. This model viewer can also use Binomial's Basis Universal Texture Support as an option to load textures. Support was added to the Image class as a "new image format", so you can pick Basis just as you can pick DDS or KTX. For iOS / Android we go directly to ASTC because Basis doesn't support ASTC at the moment.
glTF model viewer running on iPad with 2048x1536 resolution
glTF model viewer running on Samsung Galaxy S10 with Vulkan with 1995x945 resolution
glTF model viewer running on Ubuntu AMD RX 480 with Vulkan with 1920x1080 resolution
This unit test shows various shadow and lighting techniques that can be chosen from a drop down menu. There will be more in the future.
- Exponential Shadow Map - this is based on Marco Salvi's @marcosalvi papers. This technique filters the edge of the shadow map by approximating the shadow test with an exponential function that involves three values: the depth value rendered by the light source, the actual depth value that is being tested against, and a user-defined constant that controls the softness of the shadow.
- Adaptive Shadow Map with Parallax Correction Cache - this is based on the article "Parallax-Corrected Cached Shadow Maps" by Pavlo Turchyn in GPU Zen 2. It adaptively chooses which light source view to use when rendering a shadow map, based on a hierarchical grid structure. The grid structure is constantly updated depending on the user's point of view, and it uses a caching system that only renders the uncovered part of the scene. The algorithm greatly reduces the shadow aliasing normally found in traditional shadow maps due to insufficient resolution. Pavlo Turchyn's paper from GPU Pro 2 added a further improvement by implementing multi-resolution filtering, a technique that approximates a larger PCF kernel using multiple mipmaps to achieve cheap soft shadows. He also describes how he integrated a Parallax Correction Cache into the Adaptive Shadow Map, an algorithm that approximates the moving sun's shadow on a static scene without rendering the tiles of the shadow map every frame. The algorithm is generally used in open-world games to approximate the day and night shadow cycle more realistically without too much CPU/GPU cost.
- Signed Distance Field Soft Shadow - this is based on Daniel Wright's Siggraph 2015 @EpicShaders presentation. To achieve real-time SDF shadows, we store the distance to the nearest surface for every unique mesh in a 3D volume texture atlas. The mesh SDF is generated offline using triangle ray tracing, and a half-precision float 3D volume texture atlas is accurate enough to represent 3D meshes with SDF. The current implementation only supports rigid meshes and uniform transformations (non-uniform scale is not supported). An approximate cone intersection can be achieved by measuring the closest distance of a passed ray to an occluder, which gives us a cheap soft shadow when using SDF.
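The exponential shadow test from the first item of the list above can be sketched in a few lines. This is the textbook form of the test written as plain C++ rather than shader code; the function and parameter names are made up for this example and are not taken from the unit test. Implementations commonly store an exponential of the occluder depth in the shadow map so that the map can be filtered before the test, which this scalar version leaves out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Exponential shadow test (illustrative names, not The Forge's shader code).
// occluderDepth: depth rendered from the light source (shadow map value).
// receiverDepth: depth of the point being shaded, in the same light space.
// exponent:      user-defined constant; smaller values give a softer edge.
// Returns 1 when fully lit and falls towards 0 as the receiver moves
// further behind the occluder.
inline float exponentialShadow(float occluderDepth, float receiverDepth, float exponent)
{
    assert(exponent > 0.0f);
    const float visibility = std::exp(-exponent * (receiverDepth - occluderDepth));
    return std::min(std::max(visibility, 0.0f), 1.0f);
}
```

Because the result falls off smoothly instead of switching between 0 and 1, the shadow edge is softened, and lowering the constant widens that transition.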
To achieve high performance, the playground runs on our signature rendering architecture called Triangle Visibility Buffer. The step that generates the SDF data also uses this architecture.
Click on the following screenshot to see a movie:
The following PC screenshots are taken on Windows 10 with an AMD RX550 GPU (driver 19.7.1) with a resolution of 1920x1080.
Exponential Shadow Maps:
Adaptive Shadow Map with Parallax Correction Cache
Signed Distance Field Soft Shadow:
Signed Distance Field Soft Shadows - Debug Visualization
The following shots show Signed Distance Field Soft Shadows running on an iMac with an AMD RADEON Pro 580
The following shots show Signed Distance Field Soft Shadows running on XBOX One:
Readme for Signed Distance Field Soft Shadow Maps:
To generate the SDF mesh data, select “Signed Distance Field” as the shadow type in the Light and Shadow Playground. There is a button called “Generate Missing SDF”; once it is clicked, a progress bar shows how many SDF mesh objects remain to be processed. This process is multithreaded, so the user can still move around the scene while waiting for the SDF generation to finish. It is a long process and can take 8 hours or more depending on your CPU specs. To check how many SDF objects are currently in the scene, mark the checkbox "Visualize SDF Geometry On The Scene".
This unit test was built by Kostas Anagnostou @KostasAAA to show how to ray trace shadows without using a ray tracing API like DXR / RTX. It should run on all GPUs (not just NVIDIA RTX GPUs) and the expectation is that it should perform comparably to a DXR / RTX based version even on an NVIDIA RTX GPU. That means the users of your game do not have to buy an NVIDIA RTX GPU to enjoy HRT shadows :-)
This unit test shows reflections that are ray traced. It is an implementation of the papers Optimized pixel-projected reflections for planar reflectors and IMPLEMENTATION OF OPTIMIZED PIXEL-PROJECTED REFLECTIONS FOR PLANAR REFLECTORS
This unit test shows a typical VR Multi-GPU configuration. One eye is rendered by one GPU and the other eye by the other one.
This unit test showcases a cross-platform FileSystem C API, supporting disk-based files, memory streams, and files in zip archives. The API can be viewed in IFileSystem.h, and all of the example code has been updated to use the new API.
- The API is based around `Path`s, where each `Path` represents an absolute, canonical path string on a particular file system. You can query information about the files at `Path`s, open files as `FileStream`s, and copy files between different `Path`s.
- The concept of `FileSystemRoot`s has been replaced by `ResourceDirectory`s. `ResourceDirectory`s are predefined directories where resources are expected to exist, and there are convenience functions to open files in resource directories. If your resources don’t exist within the default directory for a particular resource type, you can call `fsSetPathForResourceDirectory` to relocate the resource directory; see the unit tests for sample code on how to do this.
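The resource directory idea described above can be pictured with a small stand-alone model. Everything in this sketch (the `Sketch...` types, the directory names, the `resolve` helper) is hypothetical and exists only for illustration; it does not reproduce the declarations in IFileSystem.h, so please refer to that header and the unit tests for the actual API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for illustration only; NOT the IFileSystem.h API.
enum SketchResourceDirectory
{
    SKETCH_RD_TEXTURES,
    SKETCH_RD_MESHES
};

class SketchResourceDirectories
{
public:
    explicit SketchResourceDirectories(const std::string& root) : mRoot(root) {}

    // Relocate one resource directory (the role the text above assigns to
    // fsSetPathForResourceDirectory).
    void setPath(SketchResourceDirectory dir, const std::string& absolutePath)
    {
        assert(!absolutePath.empty());
        mOverrides[dir] = absolutePath;
    }

    // Build the full path of a file that lives in a resource directory.
    std::string resolve(SketchResourceDirectory dir, const std::string& fileName) const
    {
        std::string base = mRoot + "/" + defaultName(dir);
        std::map<SketchResourceDirectory, std::string>::const_iterator it = mOverrides.find(dir);
        if (it != mOverrides.end())
            base = it->second; // a relocated directory wins over the default
        return base + "/" + fileName;
    }

private:
    static const char* defaultName(SketchResourceDirectory dir)
    {
        switch (dir)
        {
            case SKETCH_RD_TEXTURES: return "Textures";
            case SKETCH_RD_MESHES:   return "Meshes";
        }
        return "";
    }

    std::string mRoot;
    std::map<SketchResourceDirectory, std::string> mOverrides;
};
```

The point of the model is that calling code only names a resource type and a file, so relocating a directory changes where files are looked up without touching the call sites.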
This unit test shows the integration of imGui with a wide range of functionality.
This unit test compares various Order-Independent Transparency methods. At the moment it shows:
- Alpha blended transparency
- Weighted blended Order Independent Transparency Morgan McGuire Blog Entry 2014 and Morgan McGuire Blog Entry 2015
- Weighted blended Order Independent Transparency by Volition GDC 2018 Talk
- Adaptive Order Independent Transparency with Raster Order Views paper by Intel, supports DirectX 11, 12 only, and a Primer
- Phenomenological Transparency - Diffusion, Refraction, Shadows by Morgan McGuire
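As a rough illustration of the weighted blended approach in the list above, the following sketch resolves one pixel on the CPU. It follows the general accumulate-and-normalize form of the technique; the type and function names are invented for this example, the weight is passed in directly because the choice of depth weight function is left open here, and the code is not taken from the unit test shaders.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Color3 { float r, g, b; };

struct Fragment
{
    Color3 color;  // non-premultiplied surface color
    float  alpha;  // coverage in [0, 1]
    float  weight; // depth-based weight, must be > 0
};

// Blend transparent fragments over a background without sorting them.
inline Color3 compositeWeightedBlended(const std::vector<Fragment>& fragments, const Color3& background)
{
    Color3 accum = { 0.0f, 0.0f, 0.0f }; // sum of color * alpha * weight
    float  accumWeight = 0.0f;           // sum of alpha * weight
    float  revealage = 1.0f;             // product of (1 - alpha)

    for (std::size_t i = 0; i < fragments.size(); ++i)
    {
        const Fragment& f = fragments[i];
        assert(f.alpha >= 0.0f && f.alpha <= 1.0f && f.weight > 0.0f);
        const float w = f.alpha * f.weight;
        accum.r += f.color.r * w;
        accum.g += f.color.g * w;
        accum.b += f.color.b * w;
        accumWeight += w;
        revealage *= (1.0f - f.alpha);
    }

    // Weighted average color, faded in by the total coverage.
    const float denom = std::max(accumWeight, 1e-5f);
    const float coverage = 1.0f - revealage;
    Color3 result;
    result.r = (accum.r / denom) * coverage + background.r * revealage;
    result.g = (accum.g / denom) * coverage + background.g * revealage;
    result.b = (accum.b / denom) * coverage + background.b * revealage;
    return result;
}
```

Only sums and products over the fragments are used, so the result does not depend on the order in which fragments arrive, which is what removes the need for sorting.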
This unit test shows how to use the new wave intrinsics. It supports Windows with DirectX 12 / Vulkan, Linux with Vulkan, and macOS / iOS.
The new 16_Raytracing unit test shows a simple cross-platform path tracer. On iOS this path tracer requires A11 or higher. It is meant to be used in tools in the future and doesn't run in real-time. To support the new path tracer, the Metal raytracing backend has been overhauled to use a sort-and-dispatch based approach, enabling efficient support for multiple hit groups and miss shaders. The most significant limitation for raytracing on Metal is that only tail recursion is supported. This can be worked around by using larger per-ray payloads and splitting shaders into sub-shaders after each TraceRay call; see the Metal shaders used for 16_Raytracing for an example of how this can be done.
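One way to picture a sort-and-dispatch scheme is the CPU-side sketch below: after intersection, rays are grouped by the shader they need, and each shader is then launched once over its contiguous range. This is only a reading of the general idea, with hypothetical type and function names; it is not the Metal backend's actual code, which runs on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct RayHit
{
    uint32_t rayIndex;    // which ray this result belongs to
    uint32_t shaderIndex; // hit group or miss shader chosen by intersection
};

struct DispatchRange
{
    uint32_t    shaderIndex;
    std::size_t first; // offset into the sorted list
    std::size_t count; // number of rays that run this shader
};

// Counting sort by shader index. Fills `sorted` and returns one contiguous
// range per shader that has at least one ray.
inline std::vector<DispatchRange> sortByShader(const std::vector<RayHit>& hits,
                                               uint32_t shaderCount,
                                               std::vector<RayHit>& sorted)
{
    std::vector<std::size_t> counts(shaderCount, 0);
    for (std::size_t i = 0; i < hits.size(); ++i)
    {
        assert(hits[i].shaderIndex < shaderCount);
        ++counts[hits[i].shaderIndex];
    }

    std::vector<std::size_t> offsets(shaderCount, 0);
    std::vector<DispatchRange> ranges;
    std::size_t running = 0;
    for (uint32_t s = 0; s < shaderCount; ++s)
    {
        offsets[s] = running;
        if (counts[s] > 0)
        {
            DispatchRange range = { s, running, counts[s] };
            ranges.push_back(range);
        }
        running += counts[s];
    }

    // Scatter each hit into its shader's slot, keeping the input order.
    sorted.resize(hits.size());
    for (std::size_t i = 0; i < hits.size(); ++i)
        sorted[offsets[hits[i].shaderIndex]++] = hits[i];

    return ranges;
}
```

Each returned range would then correspond to one dispatch of a single shader over `count` rays, so rays that need different hit groups or miss shaders never share a dispatch.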
macOS 1920x1080 AMD Pro Vega 64
iOS iPhone X 812x375
Windows 10 1080p NVIDIA RTX 2080 with DXR Driver version 441.12
Windows 10 1080p NVIDIA RTX 2080 with RTX Driver version 441.12
Linux 1080p NVIDIA RTX 2060 with RTX Driver version 435
This unit test was originally posted on ShaderToy by Inigo Quilez and Sopyer. It shows how a scene is ray marched with shadows, reflections and AO
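For readers new to ray marching, the core loop can be sketched as follows. This is a generic sphere-tracing example against a single sphere, written as plain C++ instead of shader code; it is not the ShaderToy scene used by the unit test, and all names are chosen for this illustration.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float vecLength(const Vec3& v)
{
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Signed distance from point p to a sphere centered at the origin.
inline float sphereDistance(const Vec3& p, float radius)
{
    return vecLength(p) - radius;
}

// March along origin + t * direction (direction must be normalized).
// Returns the hit distance t, or -1 if nothing is hit.
inline float sphereTrace(const Vec3& origin, const Vec3& direction, float radius,
                         int maxSteps, float maxDistance)
{
    assert(maxSteps > 0 && maxDistance > 0.0f);
    float t = 0.0f;
    for (int i = 0; i < maxSteps; ++i)
    {
        const Vec3 p = { origin.x + direction.x * t,
                         origin.y + direction.y * t,
                         origin.z + direction.z * t };
        const float d = sphereDistance(p, radius);
        if (d < 1e-4f)
            return t; // close enough to the surface
        t += d;       // the distance field guarantees this step is safe
        if (t > maxDistance)
            break;
    }
    return -1.0f;
}
```

Shadows, reflections and AO are typically built on the same loop by marching further rays from the hit point.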
This unit test shows how to use the high-performance entity component system in The Forge. This unit test is based on an ECS that we developed internally for tools.
The Forge now supports Sparse Virtual Textures on Windows and Linux with DirectX 12 / Vulkan. A sparse texture (also known as a "virtual texture", "tiled texture", or "mega-texture") is a technique for working with very large textures (such as 16k x 16k or more) without keeping all of their data in GPU memory. It breaks the original texture down into small square or rectangular tiles and loads only the tiles that are visible.
The unit test 18_Virtual_Texture uses seven sparse textures:
- Mercury: 8192 x 4096
- Venus: 8192 x 4096
- Earth: 8192 x 4096
- Moon: 16384 x 8192
- Mars: 8192 x 4096
- Jupiter: 4096 x 2048
- Saturn: 4096 x 4096
The unit test shows a solar system in which you can approach planets that have Sparse Virtual Textures attached; the resolution of the texture increases as you get closer.
Linux 1080p NVIDIA RTX 2060 with RTX Driver version 435
Windows 10 1080p NVIDIA 1080 DirectX 12
Windows 10 1080p NVIDIA 1080 Vulkan
This unit test shows how to playback a clip on a rig.
This unit test shows how to blend multiple clips and play them back on a rig.
This unit test shows how to attach an object to a rig which is being posed by an animation.
This unit test shows how to blend clips so that each one affects only a certain portion of the joints.
This unit test shows how to introduce an additive clip onto another clip and play the result on a rig.
This unit test shows how to use a scene of a physics interaction that has been baked into an animation and play it back on a rig.
This unit test shows how to animate multiple rigs simultaneously while using multi-threading for the animation updates.
This unit test shows how to use skinning with Ozz.
This unit test shows how to use the Aim and Two Bone IK solvers.
We integrated SoLoud. Here is a unit test that allows you to make noise ...
There is an example implementation of the Triangle Visibility Buffer as covered in various conference talks. Here is a blog entry that details the implementation in The Forge.
Below are screenshots and descriptions of some of the tools we integrated.
We integrated the Micro Profiler into our code base by replacing the proprietary UI with imGui and simplifying its usage. It is now much more tightly and consistently integrated into our code base.
Here are screenshots of the Microprofiler running the Visibility Buffer on PC:
Here are screenshots of the Microprofiler running a unit test on iOS:
Check out the Wikipage for an explanation on how to use it.
We provide a shader translator that translates one shader language, a superset of HLSL called Forge Shader Language (FSL), to the target shader language of all our target platforms, including the console and mobile platforms. We expect this shader translator to be an easier-to-maintain solution for smaller game teams because it allows adding additional data to the shader source file with less effort. Such data could be, for example, a bucket classification or different shaders for different capability levels of the underlying platform, descriptor memory requirements or resource memory requirements in general, material info, or just information that makes it easier to pre-compile pipelines. The actual shader compilation is done by the native compiler of the target platform.
How to use the Shader Translator
Confetti will prepare releases when all the platforms are stable and running and push them to this GitHub repository. Up until a release, development will happen on internal servers. This is to sync up the console, mobile, macOS and PC versions of the source code.
We would appreciate it if you could send us a link in case your product uses The Forge. Here are the ones we received so far:
The Forge is used to build the StarVR One SDK:
The Forge is used as the rendering framework in Torque 3D:
SWB is an editor for the 2003 game 'Star Wars Galaxies' that can edit terrains, scenes, particles and import/export models via FBX. The editor uses an engine called 'atlas' that will be made open source in the future. It focuses on making efficient use of the new graphics APIs (with help from The-Forge!), ease-of-use and terrain rendering.
For contributions to The Forge we apply the following writing guidelines:
- We limit all code to C++ 11 by setting the Clang and other compiler flags
- We follow the [Orthodox C++ guidelines](https://gist.github.com/bkaradzic/2e39896bc7d8c34e042b) minus C++ 14 support (see above)
There will be a user group meeting during GDC. In case you want to organize a user group meeting in your country / town at any other point in time, we would like to support this. We could send an engineer for a talk.
In case your School / College / University uses The Forge for education, we would like to support this as well. We could send an engineer or help create material. So far the following schools use The Forge for teaching:
Breda University of Applied Sciences
Contact:
Jeremiah van Oosten
Monseigneur Hopmansstraat 1
4817 JT Breda
Contact:
Andrew Hogue
Ontario Tech University
SIRC 4th floor
2000 Simcoe St N
Oshawa, ON, L1H 7K4
The Forge utilizes the following Open-Source libraries:
- Assimp
- Fontstash
- Vectormath
- Nothings single file libs
- shaderc
- SPIRV_Cross
- TinyEXR
- Vulkan Memory Allocator
- GeometryFX
- WinPixEventRuntime
- Fluid Studios Memory Manager
- volk Metaloader for Vulkan
- gainput
- hlslparser
- imGui
- DirectX Shader Compiler
- Ozz Animation System
- Lua Scripting System
- TressFX
- Micro Profiler
- MTuner
- EASTL
- SoLoud
- meshoptimizer
- Basis Universal Texture Support
- TinyImageFormat