buserror/simavr

Program Counter breaks after Interrupt Request

magdutra opened this issue · 9 comments

I'm writing a new code in assembly (gnu as compiler) for the ATmega644p and using gnu-avr and simavr for debugging.

A few instructions after initializing Timer0 and enabling interrupts with "sei", the program counter resets to the begining of the routine instead of proceeding to the next instruction (this same routine is called correctly 3 times before that).

The problem occurs exactly at the same point every time, but only if interrupts are enabled with "sei". It also doesn't matter the contents of the interrupt handler; I tried putting a "reti" directly in the interrupt vector, but the problem persisted.

Forgive me if this is an error of my own, but I spent quite a number of hours seeking a solution, to no avail.

That sounds very odd. Does it happen without using gdb? Can you post example code for reproduction?

Thank you for your reply! Unfortunately, I don't know how to run code in the simulator without using gdb.

However, I've noticed that the program counter is being loaded in the stack right before its value changes. I imagine this is the simulator triggering the interrupt, but pointing the PC to the wrong address, maybe?

I've tried the same test using simulavr for comparison, and I get a similar behavior, so I think the odds are that this is my mistake, instead of a bug.

If you'd like, I can post some example code. I can also close the case for now, until I get a better understanding of what's happening.

Thank you again.

Thank you for your suggestion. I will do a recompilation of simavr to be able to make this test with avr-run.

For now, here goes a code dump with comments modified by me for improved readability. This should be more convenient than providing a few source files, I suppose:

https://github.com/buserror/simavr/files/6289368/dump.txt

A few comments about what I'm trying to achieve: I'm implementing something similar to a __cdecl calling convention, i.e., instead of directly calling my functions with "rcall" or "call", I'm using a macro that pushes parameters in the stack and then calls a fixed address at 0x0040, where I have dispatcher function in place. When it returns, the macro fixes the stack by adding to it the number of bytes previously pushed.

The role of the dispatcher function is to look up a table containing the addresses of each public function and then call the one requested using the parameters provided on the stack (the first two parameters are the module index and function index to be invoked).

The first "module" that I implemented is the avr's Timer0, with two functions: one to initialize it (TmrInit) and another one that causes a delay between 1ns and 6,5s (TmrWait).

To test all this, I created a second module that makes a led blink every 0,5s using TmrWait and the dispatcher. Everything works, as long as interrupts are disabled. If the Global Interrupt Flag is enabled ("sei"), then a few moments later the program counter is redirected to the address 0x0040.

Well, I did it! And the instruction tracing is absolutely marvelous! Now I have a detailed description of what I've been trying to describe.

The corruption is occurring at the end of the "GetVector" routine (0x76 to 0x86). This block is called two times before the "sei" instruction, then two more times after it. However, in the second round (see line 242) the program counter jumps to the address 0x0040 and starts processing 'push' instructions from the beginning of the "SysCall" routine.

What could explain such behavior?

https://github.com/buserror/simavr/files/6289494/trace.txt

That made an entertaining puzzle. I am almost certain that it is a simavr bug, this line in simavr/simavr/cores/sim_megax4.h:

DEFAULT_CORE(4),

That defines a processor with 4-byte interrupt vectors, but the data sheet and your code say 2. So when the interrupt happens, the wrong vector address is used. It is interesting that you saw the same thing in another simulator. Has it the same bug?

One thing I noticed while looking at this: your code loads the SP with interrupts enabled. I believe that is not safe.

Just to give you confirmation, I applied your correction and now it is working like a charm.
Thank you very much for your help!

Sadly, that is the wrong fix. The right one is for you to insert a nop after each instruction in the interrupt vector area. I was caught out by the fact that simavr uses byte addressing for flash, but in the hardware the addresses are words.

You don't know what a relief is to learn that. The code was working in the simulator, but not in the actual device. I've undone the patch now and applied your new suggestion. It's working on both now!

The problem is that the datasheet is very confusing in this regard: the item 9.2, shows an interrupt table with "Program Address" going from 0x0000, 0x0002, 0x0004 etc. (i.e., 2 bytes) while at the same time showing a "jmp" instruction, which uses four bytes.

I'm used to working with Tiny AVRs, but it is the first time I'm using a Mega, so I trusted my gut and replaced the "jmp" for "rjmp". That caused all this confusion. Sorry about that.