Fast/Approximate math function section
It would be good to have a section of fast/approximate math functions in PAL, designed specifically for running math kernels on Epiphany with selectable precision.
For example,
- Full precision / low performance (reference): e.g., expf() from newlib.
- Middle precision / middle performance (20 ~ 100 clocks, general use): e.g., e_approx_exp()
- Low precision / faster performance (~20 clocks, DSP use): e.g., e_fast_exp()
- Short vectorized versions for the middle and low precision tiers.
I think this design would be much more practical for actual application programmers.
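To make the low-precision tier and its short vectorized variant concrete, here is a minimal sketch. The names sketch_fast_expf and sketch_fast_expf_v4 are hypothetical (they are not existing PAL functions), and the exponent-bit trick used here is just one well-known way to get a very cheap exp, chosen for illustration. It trades accuracy for speed: the relative error can reach roughly 6%.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical low-precision exp (not an existing PAL function).
 * x / ln(2) is written straight into the float exponent field, so the
 * mantissa is only linearly interpolated between powers of two. */
float sketch_fast_expf(float x)
{
    /* Clamp so the result stays a normal, finite float. */
    if (x > 88.0f)  x = 88.0f;
    if (x < -87.0f) x = -87.0f;

    /* 12102203 ~= 2^23 / ln(2) scales x into exponent-field units;
     * 1065353216 = 127 * 2^23 is the bit pattern of 1.0f. */
    int32_t i = (int32_t)(x * 12102203.0f + 1065353216.0f);

    float y;
    memcpy(&y, &i, sizeof y);   /* reinterpret the bits as a float */
    return y;
}

/* Hypothetical short-vector variant: four elements per call. */
void sketch_fast_expf_v4(const float in[4], float out[4])
{
    for (int k = 0; k < 4; ++k)
        out[k] = sketch_fast_expf(in[k]);
}
```

A caller would pick the tier per kernel: the reference expf() where accuracy matters, and something like sketch_fast_expf_v4(in, out) in inner loops that tolerate a few percent of error.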
And here's a validated approximate exp() implementation for Epiphany:
https://github.com/syoyo/parallella-playground/blob/master/math_exp/e_fast_exp.c
It takes 25 ~ 74 clocks for each exp(x) evaluation, within 1.0e-5 relative error.
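For readers unfamiliar with how a middle-precision exp is typically built, the following is a minimal sketch of the usual range-reduction-plus-polynomial approach. It is not the code from the link above; the name sketch_approx_expf, the degree-5 Taylor polynomial, and the clamping bounds are all assumptions made for illustration. The truncation term of the polynomial is about r^6/720, which is on the order of 1e-6 at the edge of the reduced range; rounding in the single-constant range reduction adds to that for large |x|.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical middle-precision exp (illustrative only). */
float sketch_approx_expf(float x)
{
    /* Clamp so that 2^k below stays a normal, finite float. */
    if (x > 88.0f)  x = 88.0f;
    if (x < -87.0f) x = -87.0f;

    /* Range reduction: x = k * ln(2) + r, with |r| <= ln(2) / 2.
     * k is round(x / ln(2)), done by truncating a value offset by 0.5. */
    float t = x * 1.44269504f;                      /* x * log2(e) */
    int32_t k = (int32_t)(t + (t >= 0.0f ? 0.5f : -0.5f));
    float r = x - (float)k * 0.693147181f;          /* x - k * ln(2) */

    /* Degree-5 Taylor polynomial for exp(r), in Horner form. */
    float p = 1.0f + r * (1.0f + r * (0.5f + r * (1.0f / 6.0f
              + r * (1.0f / 24.0f + r * (1.0f / 120.0f)))));

    /* Build 2^k directly from the exponent bits, then scale. */
    uint32_t bits = (uint32_t)(k + 127) << 23;
    float scale;
    memcpy(&scale, &bits, sizeof scale);
    return p * scale;
}
```

On hardware with FMA, each Horner step maps naturally onto one fused multiply-add, which is part of why this shape is attractive on Epiphany.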
I appreciate what you are saying here. My concern would be that the library would become too big too quickly.
- What is the difference in precision between the low, medium, and high precision examples?
- What is the difference in code size for your exp for low, medium, and high?
- The newlib implementation is far too large to be practical for most embedded devices. Can it be reduced to something practical in terms of code size while keeping the precision?
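One way to answer the precision question is to measure the worst-case relative error of each tier against a reference over a range of inputs. The sketch below shows the idea; every name in it is hypothetical. To stay self-contained it uses a double-precision Taylor series as a stand-in reference (adequate only for small |x|) and a degree-3 Taylor polynomial as the candidate. On a real system the reference would wrap newlib's expf() or exp(), and the candidate would be the function under test.

```c
typedef float  (*approx_fn)(float);
typedef double (*ref_fn)(double);

/* Stand-in reference: 30-term Taylor series in double.
 * Assumption: only used for small |x|, where it converges quickly. */
double ref_exp_series(double x)
{
    double term = 1.0, sum = 1.0;
    for (int n = 1; n <= 30; ++n) {
        term *= x / n;      /* term is now x^n / n! */
        sum += term;
    }
    return sum;
}

/* Example candidate: degree-3 Taylor polynomial in single precision. */
float taylor3_expf(float x)
{
    return 1.0f + x * (1.0f + x * (0.5f + x * (1.0f / 6.0f)));
}

/* Largest relative error of approx against ref over n evenly spaced
 * points in [lo, hi], endpoints included. Requires n >= 2 and a
 * reference that is never zero on the interval (true for exp). */
double max_rel_error(approx_fn approx, ref_fn ref, float lo, float hi, int n)
{
    double worst = 0.0;
    for (int i = 0; i < n; ++i) {
        float x = lo + (hi - lo) * (float)i / (float)(n - 1);
        double want = ref((double)x);
        double err = ((double)approx(x) - want) / want;
        if (err < 0.0) err = -err;
        if (err > worst) worst = err;
    }
    return worst;
}
```

For example, max_rel_error(taylor3_expf, ref_exp_series, -0.5f, 0.5f, 101) reports the worst error of the candidate on [-0.5, 0.5]. Code size is easier to compare by building each tier separately and inspecting the object file sizes with the toolchain's size utility.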
There is R&D on faster (non SW-based), formally verified, full-precision, correctly rounded math libraries, for example:
- crlibm http://lipforge.ens-lyon.fr/www/crlibm/
- From CRLibm to Metalibm : assisting the production of high-performance proven floating-point code
- Accurate Math Functions on the Intel IA-32 Architecture: A Performance-Driven Design
But I'm not sure this could be implemented on Epiphany, since Epiphany lacks some IEEE 754 features (although FMA helps a lot). It may also require a lot of engineering and research resources to port such math functions to Epiphany.