Links explaining and justifying the highlights are missing; register allocation description is missing
Closed this issue · 3 comments
At least, every good work that I am aware of links to justifications for the author's opinions directly from the abstract.
I do understand that the work is probably still very much unfinished. Nevertheless, I'd like to ask for clarification on what look to me like the main points (the highlights).
Array loops are implemented
These two sentences could be squashed into one: "There is no need to compile the code separately for different microprocessor versions with different vector lengths." + "No recompilation or update of software is needed when a new microprocessor with a different vector register length becomes available." making the followup paragraph superfluous.
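To make the quoted claim concrete, here is a minimal Python sketch of a loop that does not depend on the vector length. It is not ForwardCom code: the function name `add_arrays` and the parameter `max_vector_length` are hypothetical, and the sketch simply assumes the hardware processes at most `max_vector_length` elements per iteration.

```python
def add_arrays(a, b, max_vector_length):
    """Add two equal-length lists in chunks of at most max_vector_length.

    max_vector_length stands in for the hardware vector register length
    (a hypothetical parameter for illustration only).
    """
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    result = []
    pos = 0
    remaining = len(a)
    while remaining > 0:
        # Each iteration handles as many elements as the "hardware" allows.
        chunk = min(remaining, max_vector_length)
        result.extend(x + y for x, y in zip(a[pos:pos + chunk], b[pos:pos + chunk]))
        pos += chunk
        remaining -= chunk
    return result


a = list(range(10))
b = [10] * 10
# The same code gives the same result for different vector lengths.
print(add_arrays(a, b, 4) == add_arrays(a, b, 16))  # True
```

The loop body never mentions a fixed vector length, which is the property the two quoted sentences describe.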
Memory management ..
"in most cases to avoid memory paging" is too inaccurate for an abstract. I believe it depends on the kind of system, since any system that allows an unbounded number of clone calls (due to external hardware input) is uncheckable.
For additional inspiration on the potential future design space, and for somewhat realistic performance measurements, take a look at 'Efficient virtual cache coherency for multicore systems and accelerators' by Xuan Guo.
This can prevent stack overflow in most cases
This is also too inaccurate for an abstract, and not quite correct, because it does not include stack space used by the kernel during interrupts etc. Take a look at the algorithm we came up with here, including its limitations. Also, it does not mention that the result is an over-approximation over all possible control flows, since an exact bound requires the possible input ranges and result approximations to know which function trace might be executed.
A mechanism for optimal register allocation across program modules and function libraries is provided.
- As far as I know, the problem of register allocation (graph coloring) is NP-hard. This should be reformulated, and I cannot find anything on "register allocation".
This is a manual, not a report. There is no abstract, you are looking at the highlights. The highlights are intended to give a quick overview of important features, not a complete description.
Current computer systems sometimes cause heavy memory fragmentation. Much of this fragmentation can be avoided by a more careful design of software and hardware. There is a big advantage to gain if you can minimize memory fragmentation and avoid the TLB and multi-level page tables. However, there are still cases where memory fragmentation cannot be avoided. We have discussed this at length at the forum (https://www.forwardcom.info/forum/viewtopic.php?f=1&t=150). The highlights cannot cover the long discussion of cases where memory fragmentation can be limited and cases where it cannot.
The maximum stack size can be calculated by the linker by finding the longest branch on the tree of possible function calls. However, this is possible only if there are no recursive function calls. The kernel may have its own stack.
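The calculation described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the ForwardCom linker's implementation: the call graph and per-function frame sizes are assumed to be given as plain dictionaries, and the names `max_stack_usage`, `call_graph` and `frame_size` are hypothetical. The result is an upper bound over all call paths, whether or not a given path is ever executed, and it ignores any stack space used by interrupts or the kernel.

```python
def max_stack_usage(call_graph, frame_size, root):
    """Return the worst-case stack bytes for calls starting at root.

    call_graph maps a function name to the functions it may call.
    frame_size maps a function name to its own stack frame size in bytes.
    Raises ValueError if a recursive call chain is reachable from root.
    """
    memo = {}
    in_progress = set()

    def visit(func):
        if func in memo:
            return memo[func]
        if func in in_progress:
            # A cycle means recursion: no static bound exists.
            raise ValueError(f"recursive call involving {func!r}")
        in_progress.add(func)
        deepest_callee = max((visit(c) for c in call_graph.get(func, ())), default=0)
        in_progress.discard(func)
        memo[func] = frame_size[func] + deepest_callee
        return memo[func]

    return visit(root)


call_graph = {
    "main": ["parse", "render"],
    "parse": ["read"],
    "render": ["read", "draw"],
    "read": [],
    "draw": [],
}
frame_size = {"main": 64, "parse": 128, "render": 32, "read": 16, "draw": 256}
print(max_stack_usage(call_graph, frame_size, "main"))  # 352 (main -> render -> draw)
```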
Register allocation is a standard task for all modern compilers. Two variables can use the same register if their live ranges do not overlap. ForwardCom provides information about register use in object files and function libraries. This is described in the manual in chapter 12.5. This makes it possible for the compiler to know which registers are unchanged across a call to an external function.
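As a rough illustration of the two points above, here is a small Python sketch. It is not how any particular compiler or the ForwardCom toolchain works: live ranges are simplified to half-open intervals `(start, end)`, the allocator is a simple greedy one, and the set of registers modified by a callee is a hypothetical stand-in for the register use information stored in object files. All names are invented for the example.

```python
def assign_registers(live_ranges, num_registers):
    """Greedily assign registers to variables with half-open live ranges.

    Two variables share a register only if their live ranges do not overlap.
    A variable that cannot be placed is mapped to None (it would be spilled).
    """
    assignment = {}
    free_at = [0] * num_registers  # end of the last range placed in each register
    for name, (start, end) in sorted(live_ranges.items(), key=lambda kv: kv[1][0]):
        for reg in range(num_registers):
            if free_at[reg] <= start:  # previous occupant is dead by now
                assignment[name] = reg
                free_at[reg] = end
                break
        else:
            assignment[name] = None
    return assignment


def registers_to_save(live_across_call, modified_by_callee):
    """Registers the caller must preserve around a call.

    modified_by_callee is a hypothetical set of register numbers that the
    callee is known to change; everything else is unchanged across the call.
    """
    return sorted(set(live_across_call) & set(modified_by_callee))


live_ranges = {"x": (0, 4), "y": (2, 6), "z": (4, 8)}
print(assign_registers(live_ranges, 2))         # {'x': 0, 'y': 1, 'z': 0}
print(registers_to_save({0, 1, 5}, {1, 2, 3}))  # [1]
```

Here `x` and `z` share register 0 because `x` is dead when `z` becomes live, and only register 1 needs saving because the callee is known not to touch registers 0 and 5.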
The highlights cannot cover the long discussion of cases where memory fragmentation can be limited and cases where it cannot.
Understandable, thanks.
The kernel may have its own stack.
It may also reuse the user stack, which is typical for kernels/OSes on embedded systems, see https://www.freertos.org/FreeRTOS_Support_Forum_Archive/November_2018/freertos_Which_stack_is_used_for_ISR_ARM_Cortex_M4_port_e866ad9c8cj.html. However, I think typical OSes with an MMU have separate stacks.
This makes it possible for the compiler to know which registers are unchanged across a call to an external function.
Sweet. It is 11.5 Register usage convention. Thanks.
Feel free to close, as I am not sure whether you think it's worthwhile to mention the influence of an embedded kernel on stack size.
Thanks for your elaborate answers and pointers on where to dig further. I think I will try to look a bit into microcode (the motivation is this use-after-free hardware security vulnerability: https://lock.cmpxchg8b.com/zenbleed.html) to see if I can find design flaws.
The AMD vulnerability you are linking to is the result of a fundamentally flawed ISA design. The SSE ISA was designed with 128-bit vector registers without taking future extensions into account. ForwardCom is designed to avoid such flaws by learning from past mistakes. ForwardCom avoids microcode because it is inefficient.