rosco-m68k/rosco_m68k

`calloc` and `realloc` need implementing properly

Closed · 1 comment

With the latest changes, `calloc` has a super-simple implementation that works but is inefficient and slow.

`realloc` is implemented "bare-minimum to spec" (i.e. it always fails, returning `NULL`).

Both of these need implementing properly (in libs/src/shmall/rosco_m68k.cpp).
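For reference, here is a minimal sketch of what a straightforward version of the two functions might look like when layered on `malloc`/`free`. This is not the project's code: the names `sketch_calloc` and `sketch_realloc` are hypothetical, and `sketch_realloc` takes the old block size as an extra parameter because the issue does not show how the allocator tracks block sizes. A real `realloc` would read that size from the allocator's own bookkeeping.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names, chosen so they do not clash with the libc versions. */

/* Zeroed allocation of nmemb * size bytes, built on malloc + memset. */
void *sketch_calloc(size_t nmemb, size_t size)
{
    /* Reject requests whose total byte count would overflow size_t. */
    if (size != 0 && nmemb > SIZE_MAX / size) {
        return NULL;
    }

    size_t total = nmemb * size;
    void *p = malloc(total);
    if (p != NULL) {
        memset(p, 0, total);
    }
    return p;
}

/*
 * Allocate-copy-free realloc.
 * ASSUMPTION: the caller supplies old_size; a real allocator would look
 * the size up in its own block metadata instead.
 */
void *sketch_realloc(void *ptr, size_t old_size, size_t new_size)
{
    if (ptr == NULL) {
        return malloc(new_size);        /* behaves like malloc */
    }
    if (new_size == 0) {
        free(ptr);                      /* one common choice for size 0 */
        return NULL;
    }

    void *fresh = malloc(new_size);
    if (fresh == NULL) {
        return NULL;                    /* original block stays valid */
    }

    /* Copy only as many bytes as both blocks can hold. */
    memcpy(fresh, ptr, old_size < new_size ? old_size : new_size);
    free(ptr);
    return fresh;
}
```

The zeroing and copying steps here go through `memset` and `memcpy`, so the speed of both functions depends on how fast those are on the target.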

0xTJ commented

Annoyingly, it's really hard to get current GCC to emit the DBcc instruction and so take advantage of the MC68010 loop mode. I haven't been able to, even following instructions on how to convince GCC to do that (which I assume worked in an older version). I've been working on this and have had to use inline assembly to get good performance on the memory clear and copy for these two functions. The assembly is just the 2 instructions that form the loop, and the code defaults to a simple pure C implementation when `__m68k__` isn't defined.
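As an illustration of the approach described above, here is a rough sketch of a longword clear with a two-instruction inline-assembly loop and a pure C fallback. It is not the code from this comment: the function name `clear_longs`, the choice of `clr.l`, and the chunking are all assumptions, and the m68k path has not been verified on hardware. Because `dbra` (the always-false form of DBcc) only counts with the low 16 bits of its register, the sketch splits the work into chunks of at most 65536 longwords from C.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: clear `count` 32-bit words starting at `dst`.
 * ASSUMPTION: dst is suitably aligned for 32-bit access.
 */
void clear_longs(uint32_t *dst, size_t count)
{
#if defined(__m68k__)
    while (count > 0) {
        /* dbra uses a 16-bit counter, so do at most 65536 words per pass. */
        size_t chunk = count > 65536UL ? 65536UL : count;
        count -= chunk;

        /* dbra runs the body (counter + 1) times, hence chunk - 1. */
        uint16_t n = (uint16_t)(chunk - 1);

        __asm__ volatile (
            "1: clr.l (%0)+\n\t"   /* clear one longword, advance pointer */
            "dbra %1, 1b"          /* decrement and loop until n == -1    */
            : "+a" (dst), "+d" (n)
            :
            : "memory", "cc");
    }
#else
    /* Portable fallback for any other target. */
    while (count > 0) {
        *dst++ = 0;
        count--;
    }
#endif
}
```

A copy loop would follow the same shape with a `move.l (%1)+, (%0)+` style body in place of `clr.l`. Keeping the assembly to the loop itself leaves chunking and tail handling in C, where the compiler can still optimize around it.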

When I use the `__asm__` version, I get a ~7x performance increase for `calloc` in a benchmark added to malloc-test (~45 seconds down to ~7 seconds for 1024 `calloc` calls with a size of 0x8000).

EDIT: Actually, those timing numbers might be significantly wrong (I've played around with some of the numbers). I'll have to measure again, but the speedup is of a similar magnitude.