Bug in UM/MOD?

Question

Bug in UM/MOD?

MrMarkB opened this issue 4 years ago · 16 comments

I've found an apparent bug in the UM/MOD word.

Test case producing error is "HEX 2400 F4 A2C3 UM/MOD .S" which gives result "22C2 2" where "A243 17F" is expected.

For context I'm developing a word that configures the Timer/Counter 2 to produce a 50% duty cycle PWM at a specified frequency 1-64k Hz. The error occurs in the calculation of the TIM2_ARR value as "16 MHz / request_frequency". It appears to be manifest for request frequencies greater than or equal to 41666 Hz (A2C3) and not for lower request frequencies.

Answer 1 · 2021-02-02T14:50:45.000Z

Neglected to add, error was seen on version 2.2.26 STM8S103F3 pre-built load running on a MinDev board.

Answer 2 · 2021-02-02T18:53:00.000Z

Context: definition of UM/MOD:

;       UM/MOD  ( udl udh un -- ur uq )
;       Unsigned divide of a double by a
;       single. Return mod and quotient.

the value of un is $A2C3 which, if n were the parameter, would be a negative number (b15=1). Using half this value (b15=0) gives a correct result:

HEX 2400 F4 5161 UM/MOD .S
 100 300 <sp  ok

I'm not sure if this is a limitation of the original eForth or a bug that I introduced. I will look into it.

Answer 3 · 2021-02-02T20:21:28.000Z

I checked it: that's a limitation (alias "bug") in the original STM8EF code.

Here is the minimal test that shows when the divisor gets too large:

HEX
0 2 7FFF UM/MOD . drop 4 ok
0 2 8000 UM/MOD . drop 4 ok
0 2 8001 UM/MOD . drop 0 ok

I'll check if I can fix it.

Answer 4 · 2021-02-04T14:14:17.000Z

I've attached a patched version of the UM/MOD code that I believe gets the expected results, albeit with limited testing.

The root issue is that the original algorithm does a comparison of the divisor (YTEMP) and the upper 16 bits of the shifted dividend (X) without considering the "17th bit" that was shifted out of X. I've added a check for carry out and rearranged the logic a bit with the view of minimizing compiled executable size.

This seems to be working on a MINDEV board and in sstm8 simulator with a limited number of test cases.

umMod.txt

Answer 5 · 2021-02-04T16:45:14.000Z

@MrMarkB thanks, that's of great value! I had started to analyze what's going wrong but couldn't find the time to apply a simulator, debugger or instrumentation. I'll include it in the source and run some tests cases.

Would you like to fork, patch and make a pull request? That way you can leave a "mark" in the code :-)

Answer 6 · 2021-02-06T17:40:29.000Z

replaced UM/MOD in Picatout/stm8_eForth and tested "HEX 2400 F4 A2C3 UM/MOD .S"
the result is: "2243 17F" The most significant bit of remainder is missing.

Answer 7 · 2021-02-07T01:44:04.000Z

bug corrected. In you solution you used SRAW X at MSMMb: but RRCW X must be used here to gain most significant bit of remainder.

Answer 8 · 2021-02-07T07:58:27.000Z

@Picatout merci mille fois!

I noticed that you used DIVW X,Y for the 16bit/16bit case. I've long had ideas about using DIVW X,Y for a 32bit/16bit UM/MOD using this approach. This has never been a priority. Would it in your opinion be worth it?

Answer 9 · 2021-02-07T08:16:59.000Z

In my opinion any 16, 32 or 64 bit unsigned math routines are gold. The few routines I wrote in high level Forth to calculate values to load in a DDS (Direct Digital Synthesizer) look so cumbersome it's embarrassing. Back when I lived in F-PC Mr Robert Smith wrote such a piece of code in assembler for the 8086 family. That and the references might help someone but I could never figure it out. Kind regards Richard

Answer 10 · 2021-02-07T14:02:04.000Z

@TG9541 ,
When I first adapted Mr thing code for NUCLEO-STM8S208 I changed the code to use divw x,y in the case where udh is 0. But for the other cases I left it has the original. After analyzing the problem I came to conclusion that there was no advantage of using divw x,y in case where udh is not 0. This would require to much transfer between memory and registers. After the first division of udh/un it would be required to count the leading zero in Y (the remainder) subtract that number from 17 then save the partial quotient, put the remainder in X put udl in Y shift left X:Y (17-lzb)
save Y again put un in Y again to do the next DIVW X,Y. The second part of quotient must be right appended to the first part. And if there is bits left in udl the process must be repeated. This require to much step to be advantageous.
The bigger the remainder the more loop would be required, because less bits are shifted at each loop.
update:
giving more thought and testing I come to the conclusion it is not possible using DIVW X,Y. For example if udh==0xfffe and un==0xffff DIVW X,Y would result in q=0 r=0xfffe then the next try would be to shift the dividend 1 bit left. But then the most significant bit of dividend would be in the carry flag and would not be considered by the next DIVW X,Y. So the only way to go is the actual bit by bit shift with subtraction.

Answer 11 · 2021-02-08T16:06:42.000Z

I'm a novice with github, so I apologize if I've made a mess of it, but I've tried to submit a pull request with the patches to UM/MOD as per posts by Picatout and myself above. This should resolve issues 400 and 401 with the UM/MOD word.

I've put together a Python script to exercise UM/MOD from the serial port with random numbers and check the result. With the patched build it hasn't called out any errors thus far.

Answer 12 · 2021-02-09T05:54:38.000Z

@MrMarkB, you did pretty well for a first time pull-request!

I'm very happy about the contribution from both you and @Picatout! In hindsight more than one user (including I) had evidence that something is not quite right with UM/MOD. Maybe it's because signed operations rescued some of the defects through stripping bit15 of the divisor, and therefore no one ever looked closely enough. Furthermore, edge cases were never properly identitfied or tested (e.g. #401). Quite obviously, the art of testing numerical core routines needs more practitioners, so thanks a lot for your contributions!

@MrMarkB would you mind publishing you test code, maybe in a GitHub Gist?

Answer 13 · 2021-02-13T06:54:29.000Z

@MrMarkB, @Picatout: this issue led to an interesting discussion in the Forth2020 FaceBook group: Bill Ragsdale remembered releasing a similar bug into the wild many years ago (which may well be the grandfather of these bugs here), and other stories about hunting down quite similar bugs emerged.

Answer 14 · 2021-02-15T01:37:50.000Z

@MrMarkB would you mind publishing you test code, maybe in a GitHub Gist?

@TG9541 If I've done this correctly, this is the Python3 code I used to exercise the um/mod word after making the patch.

The check is in the function "checkUmMod". It takes three 16 bit integers as input, formats them into string for the STM8 eForth console to push them on the stack and execute um/mod and pop the quotient and remainder off the stack. The console output is parsed to get the quotient and remainder the values of which are checked against Python's calculation of the same operation.

The main code simply generates random numbers in the range 0-FFFF for the parameters, calls the checkUmMod function, and prints the result if the result is not as expected.

https://gist.github.com/MrMarkB/da74e3d4f475ce60ccae4cff7c3bd2e9

Answer 15 · 2021-02-15T06:44:01.000Z

@MrMarkB thanks a lot!

The Gist URL somehow isn't valid - don't know why, since this one, a fresh copy, doesn't look any different:

https://gist.github.com/MrMarkB/da74e3d4f475ce60ccae4cff7c3bd2e9

Answer 16 · 2021-02-21T08:54:10.000Z

@Picatout you might find @JeeeK 's contribution interesting