Nominom/BCnEncoder.NET

Avoid allocating all RawBlock4X4Rgba32 blocks to reduce memory overhead

ptasev opened this issue · 1 comment

It seems the code allocates all the blocks it needs during decoding, even though each block is only used temporarily. It would be ideal if the decoded data were written directly to the output rather than into a separately allocated temporary block for every encoded block.

One approach would be to divide the blocks to process evenly among the threads. Each thread allocates a single RawBlock4X4Rgba32 and decodes its share of the blocks serially, reusing that one temporary block. After each block is decoded, the contents of the raw block are written directly to the output image at the proper location.
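To make the idea concrete, here is a rough, language-neutral sketch in Python rather than the library's C#. Everything in it is an assumption for illustration: `decode_block_into` is a hypothetical stand-in for a real BCn block decoder (here an "encoded block" is just a single RGBA color), and `decode_range` and `decode_image` are made-up names, not BCnEncoder.NET APIs. It only shows the shape of the change: one scratch block per worker, with each decoded block copied straight into the output buffer.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4  # BCn formats encode 4x4 pixel blocks


def decode_block_into(encoded, scratch):
    """Hypothetical stand-in for a real BCn block decoder.

    An "encoded block" here is one RGBA tuple; decoding fills all
    16 pixels of the scratch buffer with that color.
    """
    for i in range(BLOCK * BLOCK):
        scratch[i * 4:i * 4 + 4] = bytes(encoded)


def decode_range(encoded_blocks, start, end, width, height, output):
    """Decode blocks [start, end) using a single reusable scratch block."""
    blocks_per_row = (width + BLOCK - 1) // BLOCK
    scratch = bytearray(BLOCK * BLOCK * 4)  # the only block this worker allocates
    for index in range(start, end):
        decode_block_into(encoded_blocks[index], scratch)
        bx = (index % blocks_per_row) * BLOCK
        by = (index // blocks_per_row) * BLOCK
        # Clip blocks that hang over the right or bottom image edge.
        cols = min(BLOCK, width - bx)
        rows = min(BLOCK, height - by)
        for row in range(rows):
            src = row * BLOCK * 4
            dst = ((by + row) * width + bx) * 4
            output[dst:dst + cols * 4] = scratch[src:src + cols * 4]


def decode_image(encoded_blocks, width, height, workers=4):
    """Split the blocks into contiguous chunks, one chunk per worker."""
    output = bytearray(width * height * 4)  # RGBA, row-major
    total = len(encoded_blocks)
    chunk = max(1, (total + workers - 1) // workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(decode_range, encoded_blocks, start,
                        min(start + chunk, total), width, height, output)
            for start in range(0, total, chunk)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return output
```

Each worker writes to a disjoint region of the output, so no locking is needed, and the number of temporary blocks alive at any time equals the number of workers instead of the number of blocks in the image. The thread pool here only illustrates the partitioning; it is not meant as a performance claim about Python threads.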

Not sure if this is worth it, but interested to hear your thoughts.

Thanks for suggesting this.

There's a lot that could be improved in terms of memory usage and general performance, and this would probably be one of the easier things to fix. That said, I think there's always some tradeoff between writing performant code and writing readable code.

I'm not going to add this change in 2.0 yet, but if someone wants to create a PR with a fix that doesn't hurt readability too much, that would be greatly appreciated.

I'll probably also look into more ways to improve performance and memory usage without sacrificing readability sometime after the 2.0 release.