microsoft/graphics-driver-samples

VideoCore IV shader compiler optimization

hideyukn88 opened this issue · 1 comments

This is a bucket work item that has to be broken down into smaller, concrete items once the direction is finalized. A forked "optimized_compiler" branch is to be used for the work.

As many may have noticed, today’s shader compiler is not really a “compiler” but more of a “translator” from HLSL to QPU instructions, built for quick ramp-up and a limited scope. As a result, its output is not really optimized for the VC4/QPU architecture.

On the "optimized_compiler" branch, we can look into …

• High-level IR, or possibly LLVM
• Architecture-specific IR
• Dynamic register (accumulator/register file) assignment
• Dependency-based dead code removal and reordering
• Optimization for the simultaneous add/mul pipelines
• Reducing duplicated constant access
• and more.
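
To make the "dependency-based dead code removal" item a bit more concrete, here is a minimal sketch in Python of a backward liveness pass over a straight-line instruction list. Everything in it is an assumption for illustration: the `Instr` type, the `eliminate_dead_code` function, and the op and register names are hypothetical and do not come from the existing translator or from real QPU encodings. It also assumes instructions have no side effects beyond writing `dst`, which would not hold for things like TMU or VPM writes.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Toy IR instruction (hypothetical): one destination, zero or more sources."""
    op: str
    dst: str
    srcs: Tuple[str, ...] = ()


def eliminate_dead_code(instrs: List[Instr], live_out: Set[str]) -> List[Instr]:
    """Drop instructions whose result is never used.

    Walks the straight-line program backwards, tracking which values are
    still needed. Assumes no control flow and no side-effecting instructions.
    """
    live = set(live_out)
    kept: List[Instr] = []
    for ins in reversed(instrs):
        if ins.dst in live:
            # This definition is needed: its sources become live in turn.
            live.discard(ins.dst)
            live.update(ins.srcs)
            kept.append(ins)
        # Otherwise nothing later reads ins.dst, so the instruction is dropped.
    kept.reverse()
    return kept


# Usage: "t1" and "t3" never reach the output, so both are removed.
program = [
    Instr("load_uniform", "t0"),
    Instr("load_uniform", "t1"),
    Instr("fmul", "t2", ("t0", "t0")),
    Instr("fadd", "t3", ("t0", "t1")),
    Instr("mov", "out", ("t2",)),
]
optimized = eliminate_dead_code(program, live_out={"out"})
```

The same dependency information (which instruction reads which value) is what a reordering or add/mul pairing pass would need, so a real design would probably build it once on the architecture-specific IR rather than recompute it per pass.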

All contributors are welcome to discuss and propose ideas here.

The VC4 QPU ISA is hard as hell to optimize