`calloc` and `realloc` need implementing properly
With the latest changes, `calloc` is a super-simple implementation that works, but is inefficient and slow. `realloc` is implemented "bare-minimum to spec" (i.e. it always fails, returning `NULL`).
Both of these need implementing properly (in `libs/src/shmall/rosco_m68k.cpp`).
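For reference, here is a minimal sketch of what "properly" could mean for the two functions: an overflow-checked `calloc` and a `realloc` that grows by allocate-copy-free. It is not the shmall code. All `sketch_*` names are hypothetical, and the size header in front of each block is an assumption made only so the example is self-contained on top of the standard `malloc`; the real allocator would read the block size from its own bookkeeping.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header recording the payload size in front of each block.
   The union with max_align_t keeps the payload suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} sketch_header;

static void *sketch_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(sketch_header)) return NULL;
    sketch_header *h = malloc(sizeof(sketch_header) + size);
    if (h == NULL) return NULL;
    h->size = size;
    return h + 1;               /* payload starts right after the header */
}

static void sketch_free(void *ptr) {
    if (ptr != NULL) free((sketch_header *)ptr - 1);
}

static void *sketch_calloc(size_t nmemb, size_t size) {
    /* Reject nmemb * size overflow before multiplying. */
    if (size != 0 && nmemb > SIZE_MAX / size) return NULL;
    size_t total = nmemb * size;
    void *p = sketch_malloc(total);
    if (p != NULL) memset(p, 0, total);
    return p;
}

static void *sketch_realloc(void *ptr, size_t new_size) {
    if (ptr == NULL) return sketch_malloc(new_size);
    size_t old_size = ((sketch_header *)ptr - 1)->size;
    if (new_size <= old_size) return ptr;   /* shrink in place */
    void *p = sketch_malloc(new_size);
    if (p == NULL) return NULL;             /* original block stays valid */
    memcpy(p, ptr, old_size);
    sketch_free(ptr);
    return p;
}
```

The `memset` and `memcpy` calls are the clear and copy steps whose speed matters here.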
Annoyingly, it's really hard to get current GCC to emit the `DBcc` instruction and so take advantage of the MC68010 loop mode. I haven't managed it, even following instructions on how to convince GCC to do so, which I assume worked in an older version. I've been working on this and have had to use inline assembly (just the 2 instructions that form the loop, falling back to a simple pure C implementation when `__m68k__` isn't defined) to get good performance on the memory clear and copy for these two functions.
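A rough sketch of the shape such a clear routine might take is below. This is not the code from the branch: the name `sketch_clear`, the operand constraints and the chunking are my assumptions, and the m68k path is untested on real hardware. `dbra` only uses the low 16 bits of the counter and runs count + 1 iterations, so the length is split into chunks of at most 65536 bytes.

```c
#include <stddef.h>

/* Hypothetical helper: zero `len` bytes starting at `dst`. */
static void sketch_clear(void *dst, size_t len) {
    unsigned char *p = dst;
#if defined(__m68k__)
    while (len > 0) {
        /* dbra counts down a 16-bit register, so cap each pass. */
        size_t chunk = len > 65536 ? 65536 : len;
        unsigned short count = (unsigned short)(chunk - 1);
        /* Two-instruction loop: a one-word instruction followed by
           DBcc branching back to it, which is the pattern the
           MC68010 loop mode applies to. */
        __asm__ volatile(
            "1: clr.b (%0)+\n\t"
            "   dbra  %1, 1b"
            : "+a"(p), "+d"(count)
            :
            : "memory", "cc");
        len -= chunk;
    }
#else
    /* Plain C fallback for any other target. */
    while (len--) *p++ = 0;
#endif
}
```

A copy loop would presumably look the same with `move.b (%1)+,(%0)+` as the looped instruction and a second address-register operand for the source.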
When I use the `__asm__`, I get a ~7x performance increase for `calloc` in a benchmark added to `malloc-test` (~45 seconds down to ~7 seconds for 1024 `calloc` calls with a size of 0x8000).
EDIT: Actually, those timing numbers might be significantly wrong (I've played around with some of the numbers). I'll have to measure again, but the ratio is similar and still significant.
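For re-measuring, a timing loop along these lines could be used. It is only a sketch and not the benchmark in `malloc-test`: `bench_calloc` is a hypothetical name, and it assumes a working `clock()` on the target, which may not hold on the rosco_m68k itself. `size` must be non-zero.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Calls calloc `calls` times with `size` bytes each, freeing every block
   straight away. Returns the number of successful allocations and stores
   the elapsed CPU time in *seconds. */
static unsigned bench_calloc(unsigned calls, size_t size, double *seconds) {
    unsigned ok = 0;
    clock_t start = clock();
    for (unsigned i = 0; i < calls; i++) {
        volatile unsigned char *p = calloc(1, size);
        if (p != NULL) {
            (void)p[size - 1];  /* volatile read so the call is not optimised away */
            ok++;
            free((void *)p);
        }
    }
    *seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
    return ok;
}
```

Calling `bench_calloc(1024, 0x8000, &secs)` once with the assembly path and once with the C fallback would give the two figures to compare.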