Unexpected Memory Usage in sf::VertexBuffer
EichenseerMarco opened this issue · 3 comments
Prerequisite Checklist
- I searched for existing issues to prevent duplicates
- I searched for existing discussions on the forum to prevent duplicates
- I am here to report an issue and not to just ask a question or look for help (use the forum or Discord instead)
Describe your issue here
Hi everyone,
I've been experimenting with sf::VertexBuffer and noticed some unexpected memory allocations.
I'm not sure if these actually are bugs. If not, I would appreciate some insight into why the code below behaves like this.
I haven't found another issue regarding memory management in sf::VertexBuffer.
I haven't found a relevant forum discussion regarding the potential problems I've observed.
I have found some general info about OpenGL VBOs which says that an additional RAM copy is always required (on Windows at least) but I'm not sure if this is (still) applicable: https://community.khronos.org/t/vbo-memory-usage/16223
Best regards
Marco
Your Environment
- OS / distro / window manager: Windows 10 22H2
- SFML version: 2.6.1 (SFML-2.6.1-windows-vc17-64-bit)
- Compiler / toolchain: MSVC 17.8.3 Compiler Version 19.38.33133
- Special compiler / CMake flags: n/a
Steps to reproduce
When stepping through the code below with a debugger, observe the memory usage.
The points a), b), c) and d) could indicate problems with the memory management.
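As a sanity check on the 2 GB figure used in the reproduction code, the arithmetic can be sketched in plain C++. `VertexLayout` and `expectedBytes` are hypothetical names introduced here; the struct only mirrors the assumed layout of sf::Vertex in SFML 2.6 (float position, 8-bit RGBA color, float texture coordinates) and is not the SFML type itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in that mirrors the assumed layout of sf::Vertex:
// 2 floats for position, 4 bytes for color, 2 floats for texture coordinates.
struct VertexLayout
{
    float        position[2];  // 8 bytes
    std::uint8_t color[4];     // 4 bytes
    float        texCoords[2]; // 8 bytes
};

static_assert(sizeof(VertexLayout) == 20, "expected a tightly packed 20-byte vertex");

// Number of bytes needed to store vertexCount vertices of the layout above.
inline std::uint64_t expectedBytes(std::uint64_t vertexCount)
{
    // Guard against overflow of the 64-bit product.
    assert(vertexCount <= UINT64_MAX / sizeof(VertexLayout));
    return vertexCount * sizeof(VertexLayout);
}
```

For 100,000,000 vertices this gives 2,000,000,000 bytes, i.e. 2 GB in decimal units (roughly 1.86 GiB), so Task Manager readings slightly below 2 GB would still be consistent with the estimate.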
#include <SFML/Graphics.hpp>
#include <vector>

// RAM and VRAM usage numbers are rough estimates according to Task Manager
int main()
{
    { // scope to control lifetime of sf::VertexBuffer
        // current RAM usage: 1 MB
        sf::VertexBuffer vb(sf::VertexBuffer::Usage::Static);
        // expected RAM usage: 1 MB
        // actual RAM usage: 24 MB
        /*
         * a) is it expected that an empty VertexBuffer allocates a significant amount of RAM?
         */
        { // scope to control lifetime of std::vector
            size_t testSize = 100e6; // testSize large enough to make observing allocations easy
            std::vector<sf::Vertex> data;
            data.resize(testSize);
            // expected RAM allocation: testSize * sizeof(sf::Vertex) == 100e6 * 20 Byte == 2 GB
            // actual RAM usage: 2 GB
            vb.create(data.size());
            // expected VRAM allocation: 2 GB (reason: documentation says "[...] allocates enough graphics memory to hold vertexCount vertices")
            // actual VRAM usage: 0 GB
            /*
             * b) is it expected to not see any RAM nor VRAM allocations when calling sf::VertexBuffer::create()?
             */
            vb.update(data.data());
            // expected RAM & VRAM allocations: 0 GB (reason: copy to previously allocated buffer should not increase memory footprint)
            // actual RAM usage: 5 GB
            // actual VRAM usage: 2 GB
            /*
             * c) is it expected to see a +3 GB RAM and a +2 GB VRAM allocation when calling sf::VertexBuffer::update()?
             * Seeing 2 GB VRAM eventually is expected (2 GB of vertex data in VRAM) but I would have expected this after create()
             * Seeing 3 GB RAM is unexpected for two reasons: Seemingly no copy to RAM is performed, data size is only 2 GB
             */
        } // vector goes out of scope, RAM usage drops to 3 GB which matches additional RAM usage from point c)
    } // vertex buffer goes out of scope, RAM usage drops to 1 GB
    /*
     * d) is it expected that 1 GB of RAM remains after cleaning up sf::VertexBuffer?
     */
    return 0;
}
Note that the example above uses a std::vector<sf::Vertex> because the API does not seem to support updating a sf::VertexBuffer using a sf::VertexArray.
Is updating a sf::VertexBuffer using a sf::VertexArray indeed not supported?
void usage()
{
    sf::VertexArray va;
    // fill...
    sf::VertexBuffer vb;
    vb.create(va.getVertexCount());
    // how to fill vb using va? There seems to be no .data() or equivalent to access underlying std::vector
    // vb.update(va.?);
    vb.update(&va[0]);
    // this should work because &vector[0] == vector.data() but it looks like the API does not support this use case
}
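The `&va[0]` idea above relies on the array storing its vertices contiguously. A minimal sketch of that reasoning, without SFML, is shown below. `Vertex`, `VertexArrayLike`, `firstVertexOrNull` and `sumX` are hypothetical stand-ins; the sketch assumes, but does not confirm, that sf::VertexArray wraps a std::vector<sf::Vertex> and that its operator[] returns a reference into that vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for sf::Vertex (the real type has more members).
struct Vertex
{
    float x;
    float y;
};

// Hypothetical stand-in for sf::VertexArray: wraps a std::vector and exposes
// only a count and operator[], with no data() accessor.
class VertexArrayLike
{
public:
    explicit VertexArrayLike(std::size_t count) : m_vertices(count) {}

    std::size_t getVertexCount() const { return m_vertices.size(); }

    Vertex&       operator[](std::size_t index) { return m_vertices[index]; }
    const Vertex& operator[](std::size_t index) const { return m_vertices[index]; }

private:
    std::vector<Vertex> m_vertices;
};

// Pointer to the first vertex, or nullptr for an empty array
// (indexing element 0 of an empty vector would be undefined behaviour).
inline const Vertex* firstVertexOrNull(const VertexArrayLike& va)
{
    return va.getVertexCount() > 0 ? &va[0] : nullptr;
}

// Stand-in for a consumer that takes a raw pointer plus a count, in the same
// spirit as an update(const Vertex*) style function.
inline float sumX(const Vertex* vertices, std::size_t count)
{
    assert(vertices != nullptr || count == 0);
    float total = 0.f;
    for (std::size_t i = 0; i < count; ++i)
        total += vertices[i].x;
    return total;
}
```

The point worth noting is the empty case: taking `&va[0]` is only safe when the array holds at least one vertex, which is one reason a plain std::vector with data() is the more comfortable source.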
Expected behavior
I commented in the example where exactly the unexpected memory allocation can be observed.
Summary of expected behavior:
- a) Constructing empty sf::VertexBuffer does not allocate RAM.
- b) sf::VertexBuffer::create() allocates requested amount of VRAM (only VRAM)
- c) sf::VertexBuffer::update() copies data to VRAM, does not allocate any additional RAM nor VRAM.
- d) Destroying sf::VertexBuffer and sf::Vertex data used to fill it reduces RAM and VRAM footprint to initial size.
Actual behavior
I commented in the example where exactly the unexpected memory allocation can be observed.
Summary of actual behavior:
- a) Constructing empty sf::VertexBuffer allocates a significant amount of RAM.
- b) sf::VertexBuffer::create() does not seem to immediately allocate VRAM.
- c) sf::VertexBuffer::update() allocates expected amount of VRAM but also RAM and the RAM allocation is significantly larger than data being copied.
- d) Destructing sf::VertexBuffer does not reduce RAM footprint to initial size.
I'm aware that freeing / deleting / destroying objects may not (immediately) return the memory to the OS, making the values from Task Manager unreliable (references below), but both containers going out of scope in the example above show an immediate reduction in RAM/VRAM usage. The remaining RAM allocation, which is significantly larger than the data, seems strange.
Ref: https://stackoverflow.com/a/52417370
Ref: https://stackoverflow.com/q/17008180
Is updating a sf::VertexBuffer using a sf::VertexArray indeed not supported?
I would hazard a guess that this is unsupported because a sf::VertexArray exists as its own type of drawable and is an alternative to a sf::VertexBuffer. It can be used with window.draw() calls, as can sf::VertexBuffer. I don't think it's expected to be able to update one from the other in this case. There have been discussions in the past about removing sf::VertexArray entirely in favour of a std::vector<sf::Vertex>.
I would guess that the reason the GPU memory usage doesn't increase until the update() call is that, upon create(), SFML passes NULL as the argument for the data parameter when calling glBufferData. As noted in the docs:
If data is not NULL, the data store is initialized with data from this pointer.
If data is NULL, a data store of the specified size is still created, but its contents remain uninitialized and thus undefined.
But upon the call to update() a valid set of data is uploaded to the GPU and thus the allocation finally appears. Maybe the driver or OS treats allocated but uninitialised memory as unused until an actual use is encountered.
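The guess above can be pictured with a toy model in which creating a buffer only records the requested size and the backing storage appears on the first upload. This is purely illustrative and says nothing about how any real driver or OS is implemented; `LazyBuffer`, `reservedBytes` and `committedBytes` are hypothetical names invented for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy model of "reserve now, commit on first use". Not a driver implementation.
class LazyBuffer
{
public:
    // Mirrors a create() step with no data: remember the size, allocate nothing.
    void create(std::size_t bytes)
    {
        m_reserved = bytes;
        m_storage.clear();
        m_storage.shrink_to_fit();
    }

    // Mirrors an update() step: the first real data triggers the allocation of
    // the full reserved size, then the bytes are copied in.
    bool update(const void* data, std::size_t bytes)
    {
        if (data == nullptr || bytes > m_reserved)
            return false;
        if (m_storage.size() != m_reserved)
            m_storage.resize(m_reserved);
        assert(m_storage.size() == m_reserved);
        if (bytes > 0)
            std::memcpy(m_storage.data(), data, bytes);
        return true;
    }

    std::size_t reservedBytes() const { return m_reserved; }
    std::size_t committedBytes() const { return m_storage.size(); }

private:
    std::size_t                m_reserved = 0;
    std::vector<unsigned char> m_storage;
};
```

In this model a monitoring tool that only sees committed storage would report nothing after create() and the full size after the first update(), which is the pattern described in the issue.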
Regarding the memory consumption, it could be at the OS level, or it could be graphics driver related (leaky drivers have been a regular concern of SFML users who spot increased consumption or notice reports from the CRT that a memory leak occurred). I took a look using the heap profiling tools available within VS2022 and noticed nothing too out of the ordinary. Below is a screenshot of the heap allocations graphed:
I took various snapshots through the code:
1. At the start prior to creating anything
2. After creating the sf::VertexBuffer
3. After creating the std::vector
4. At the end after both local scopes have been exited.
The heap profiler reports the following contents remaining at stage 4:
When analysing the char[] contents we see:
These entries are similar to what the CRT memory leak detection reports, which looks like a leaky driver to me.
The only SFML-related thing I can definitely notice is in the results for std::basic_string, which appear to tie back to https://github.com/SFML/SFML/blob/2.6.x/src/SFML/Window/GlContext.cpp#L265
The container proxy results also tie back to the same places as std::basic_string, likely because of the use of an STL container within that code. I'm not confirming that this extensionString case is a leak, but it may be worth investigating for any interested parties.
Edit: Having looked into the extensionString logic, it's pushed back into a static vector, which won't be destroyed until exit. This explains the std::basic_string and container_proxy results, as they will be released at the point of exit. I think what you've reported is likely expected behaviour.
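The lifetime argument in the edit above can be illustrated with a small standalone sketch. `cachedExtensions` and `countAfterScopedUse` are hypothetical names and the extension strings are made up; the code only demonstrates that a function-local static container outlives every local scope and is released at program exit, so a heap snapshot taken before exit still lists its contents.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical cache in the spirit of a static extension list: the vector is
// created on first call and destroyed only during program shutdown.
inline const std::vector<std::string>& cachedExtensions()
{
    static std::vector<std::string> extensions;
    if (extensions.empty())
    {
        extensions.push_back("GL_EXAMPLE_extension_a"); // made-up name
        extensions.push_back("GL_EXAMPLE_extension_b"); // made-up name
    }
    return extensions;
}

// Uses the cache inside a nested scope, then reads it again after the scope
// has ended to show that the strings are still alive.
inline std::size_t countAfterScopedUse()
{
    {
        const std::vector<std::string>& inScope = cachedExtensions();
        assert(!inScope.empty());
    }
    return cachedExtensions().size();
}
```

A profiler snapshot taken after all local scopes have closed would still show these std::string allocations, without them being a leak in the usual sense.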
You need to understand that SFML handles a lot of hidden states when dealing with objects in the graphics modules. The first such object instance will create a shared context and a bunch of other things, so it's expected to see some more RAM usage when creating the first graphics object.
Additionally, OpenGL is a queued/"async" protocol: just because SFML has made a specific call doesn't guarantee that it's executed right at that instant; it can also be executed sometime later. It's hard to say whether this effect plays any role in your measurements.
- b) sf::VertexBuffer::create() does not seem to immediately allocate VRAM.
As Bambo already noticed, I guess the GPU is still informed of the upcoming usage, but the data isn't actually initialized yet. Maybe that's more of a performance optimization, and maybe the documentation could be adjusted slightly.
How did you measure VRAM and RAM usage?
Is updating a sf::VertexBuffer using a sf::VertexArray indeed not supported?
No, it's not supported, as they serve different purposes. If you just want an "array" of vertices, you should use a std::vector<sf::Vertex>.
- c) sf::VertexBuffer::update() allocates expected amount of VRAM but also RAM and the RAM allocation is significantly larger than data being copied.
We're just passing the pointer to OpenGL; whether the driver creates a copy or not isn't really in our power.
Hi,
thanks a lot for looking into this.
@Bambo-Borris
I can't seem to get such detailed info from Visual Studio's memory profiler, but it looks like the heap size actually does shrink to expected levels at "stage 4". The Heap Size column shows 17 MB but the graph still shows 1 GB. Those 17 MB are very roughly what sf::VertexBuffer allocated initially, which eXpl0it3r explained is part of the hidden state.
@eXpl0it3r
Thanks for clarifying the point about SFML managing hidden state when encountering something from the graphics module for the first time.
How did you measure VRAM and RAM usage?
Mainly via Task Manager, the large test size made it easy to spot if/when memory is allocated / data is moved to the GPU.
I also tried Visual Studio's Memory and GPU profiler but I'm not very familiar with the Memory Profiler and the GPU profiler doesn't seem to support this use case.
We're just passing the pointer to OpenGL, whether the driver creates a copy or not, isn't really in our power.
That matches what I've found: you may not be able to force OpenGL to not keep a RAM copy as well.
If the posted code doesn't show any suspicious memory allocations SFML can control, I'd say this can be closed as "not an issue".
PS.:
Also thank you guys for clarifying whether updating a sf::VertexBuffer with a sf::VertexArray is supposed to work.
I guess my idea that it should work comes from my limited experience with Qt, where it's generally a good idea to use their containers (QString, QVector, ...) if possible.
Best regards & happy new year
Marco