YashasSamaga/pawn-array-view

allow LICM optimization for multi-dimensional arrays


int main() {
    pawn::array_view<int, 3> array((cell*)0x10000);
    for(int i = 0; i < 100; i++) {
        for(int j = 0; j < 100; j++) {
            for(int k = 0; k < 128; k++) {
                array[i][j][k] = 10;
            }
        }
    }
    return 0;
}

Output from GCC 8.3 with -O3 -m32:

main:
        push    ebp
        mov     ebp, 65536
        push    edi
        push    esi
        push    ebx
        sub     esp, 4
        mov     DWORD PTR [esp], 0
.L4:
        xor     esi, esi
.L3:
        lea     edi, [0+esi*4]
        xor     edx, edx
.L2:
        mov     eax, DWORD PTR [ebp+0]
        lea     ecx, [edx+esi]
        add     edx, 1
        shr     eax, 2
        add     eax, DWORD PTR [esp]
        sal     eax, 2
        mov     ebx, DWORD PTR [edi+65536+eax]
        shr     ebx, 2
        add     ecx, ebx
        mov     DWORD PTR [eax+65536+ecx*4], 10
        cmp     edx, 128
        jne     .L2
        add     esi, 1
        cmp     esi, 100
        jne     .L3
        add     DWORD PTR [esp], 1
        mov     eax, DWORD PTR [esp]
        add     ebp, 4
        cmp     eax, 100
        jne     .L4
        add     esp, 4
        xor     eax, eax
        pop     ebx
        pop     esi
        pop     edi
        pop     ebp
        ret

The compiler does not hoist the array[i] and (array[i])[j] address computations out of their corresponding loops. It cannot do so safely: the stores to array could modify the indirection table, which could change the addresses of array[i] and array[i][j], so the addresses have to be recomputed on every iteration.

However, modifications to the indirection table are rare. The common case leaves the indirection table untouched, so the subarray addresses never change. In that case, the subarray address computation should be moved out of the loop.
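To illustrate the difference, here is a minimal self-contained sketch on a 2D array. It does not use pawn::array_view; `make_2d`, `row_ptr`, `fill_naive` and `fill_hoisted` are hypothetical helpers, and the layout (each indirection cell stores the byte offset from itself to its subarray, with 32-bit cells) is an assumption inferred from the generated code above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using cell = std::int32_t;  // assumption: 32-bit cells, as in the -m32 build

// Build a rows x cols array: `rows` indirection cells followed by the data.
// Assumed layout: each indirection cell holds the byte offset from itself
// to the first cell of its row.
std::vector<cell> make_2d(std::size_t rows, std::size_t cols) {
    std::vector<cell> mem(rows + rows * cols, 0);
    for (std::size_t i = 0; i < rows; ++i) {
        std::size_t row_start = rows + i * cols;  // cell index of row i
        mem[i] = static_cast<cell>((row_start - i) * sizeof(cell));
    }
    return mem;
}

// Resolve the address of row i by reading its indirection cell.
inline cell* row_ptr(cell* base, std::size_t i) {
    return base + i + base[i] / static_cast<cell>(sizeof(cell));
}

// Recomputes the row address for every element. The store goes through a
// cell*, so the compiler generally cannot prove it leaves base[i] untouched
// and has to reload the indirection cell on each iteration.
void fill_naive(cell* base, std::size_t rows, std::size_t cols, cell value) {
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            row_ptr(base, i)[j] = value;
}

// Row address hoisted by hand. This is only valid while the loop body does
// not write to the indirection table.
void fill_hoisted(cell* base, std::size_t rows, std::size_t cols, cell value) {
    for (std::size_t i = 0; i < rows; ++i) {
        cell* row = row_ptr(base, i);
        for (std::size_t j = 0; j < cols; ++j)
            row[j] = value;
    }
}
```

Both functions produce the same result as long as the indirection table is left alone; the hoisted version is what the library would ideally let the compiler generate in the common case.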

Note: The optimization does happen when the array is not modified, irrespective of whether the array is const-qualified.