rosco-m68k/rosco_m68k

CPU detection does not work in a "fully-loaded" system

roscopeco opened this issue · 6 comments

The current firmware relies on at least one bus error occurring during early initialisation in order to detect the CPU model, which is then used to determine the multiplier for speed calculation.

On r2 boards, there are three devices that have the potential to cause a bus error during init - the 68681 (which should always be present), Xosera and the V9958.

Assuming the 68681 will never cause a bus error, if a Xosera board is present (thus not causing a bus error) then the V9958 init is skipped completely (removing the last opportunity for a bus error). So in this configuration, the CPU model is never detected, and the firmware chooses a (rather optimistic) default:

[Image: IMG_0178]
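To make the failure mode concrete, here is a minimal C model of the probe order described above. It is only a sketch of the logic as stated in this issue, not the firmware's actual code: the `struct board` type and both function names are hypothetical, and it assumes each absent device that gets probed raises exactly one bus error.

```c
#include <stdbool.h>

/* Hypothetical model of the r2 init-time probe; not the real firmware code. */
struct board {
    bool has_duart;   /* 68681, expected to always be present */
    bool has_xosera;
    bool has_v9958;
};

/* Number of bus errors the probe sequence would raise for this board. */
static int count_probe_bus_errors(const struct board *b)
{
    int bus_errors = 0;

    if (!b->has_duart)
        bus_errors++;

    if (!b->has_xosera) {
        bus_errors++;
        /* The V9958 is only probed when Xosera was not found. */
        if (!b->has_v9958)
            bus_errors++;
    }
    return bus_errors;
}

/* CPU detection rides on the bus error handler, so it needs at least one. */
static bool cpu_detected(const struct board *b)
{
    return count_probe_bus_errors(b) > 0;
}
```

With a 68681 and Xosera both present, `count_probe_bus_errors` returns 0 whether or not a V9958 is fitted, so `cpu_detected` is false and the default is used.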

Suggestions for a fix (UPDATE: the Illegal Instruction trap was chosen as the way to go; see the discussion in the comments):

  • Always generate a bus error (perhaps we can reserve a word in IO space that should always be unused, for example)
  • Stop relying on bus errors for CPU detection (we do this because the CPU stacks enough useful info for the handler)
  • Something else? Yes - probe with CPU-specific instructions & illegal instruction trap

0xTJ commented

Looking at the M68k family PRM, and some of the individual chip UMs, I'm thinking that using bus errors isn't the best solution.

If CPU detection has to be done, I think doing it by hooking the Illegal Instruction exception would be best. The PRM has a handy table of which instructions are supported on which CPU (but doesn't include the M68060), and I don't think it would be too hard to pick some to identify the CPU.

The MC68030UM suggests that code shouldn't be written to expect a specific stack frame type from a specific exception, and that every exception handler should be able to handle any of the formats, all of which are described in the PRM.
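As a rough illustration of what "handle any of the formats" could look like, here is a C sketch that decodes the format/vector word found on '010-and-later exception frames. The helper names are hypothetical, and the frame sizes are listed from memory of the PRM's stack frame descriptions, so they should be checked against the manual before being relied on.

```c
#include <stdint.h>

/* Format/vector word: bits 15-12 = frame format, bits 11-0 = vector offset. */
static unsigned frame_format(uint16_t fv)        { return fv >> 12; }
static unsigned frame_vector_offset(uint16_t fv) { return fv & 0x0FFFu; }

/* Total frame size in 16-bit words, or 0 for a format not listed here. */
static unsigned frame_size_words(uint16_t fv)
{
    switch (frame_format(fv)) {
    case 0x0: return 4;   /* normal four-word frame */
    case 0x1: return 4;   /* throwaway four-word frame */
    case 0x2: return 6;   /* six-word frame */
    case 0x3: return 6;   /* '040 floating-point post-instruction */
    case 0x4: return 8;   /* eight-word frame */
    case 0x7: return 30;  /* '040 access error */
    case 0x8: return 29;  /* '010 bus/address error */
    case 0x9: return 10;  /* coprocessor mid-instruction */
    case 0xA: return 16;  /* '020/'030 short bus cycle fault */
    case 0xB: return 46;  /* '020/'030 long bus cycle fault */
    default:  return 0;   /* unknown: caller must not guess */
    }
}
```

A handler built this way reads the size from the frame itself rather than assuming one from the exception type, so the same bus error handler can unwind a `$8` frame on a '010 and a `$A` or `$B` frame on a '020/'030.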

One other option would be to drop CPU detection entirely, and make all of the core code CPU-agnostic. As far as I understand, the execution environments at boot should be equivalent between CPUs, so we don't need to know the CPU model right away. Any extra initialization (cache, MMU, FPU) can be done based on some configuration information (like a block in system flash), after the first stage of boot-up (so that you can't brick your system by telling it that it's an '040 when it's a '010).

Regardless of whether CPU detection is kept, I think it would be best to have the core of the firmware be built for '000 (no MOVEC, vector table at address 0), but able to handle exception stack frames from any CPU. To me, this feels like the most elegant solution for system firmware.

roscopeco commented

Firstly, I would like to keep CPU detection. Even though we don't specifically need it, I like that we can display it, and that the information is made available to user code in the system data block (along with computed speed).

I like the idea of using the illegal instruction trap to detect the CPU type; that does feel much cleaner than the current scheme. We still need to handle the various formats for bus error (so we can probe hardware) - but removing the current way of detecting the CPU (as a side-effect of that bus error) feels like the right way to go to me.

I'm not sure whether targeting the '000 is useful, and it complicates things a bit WRT the bus errors during probing. FWIW I don't think the current firmware works with anything below the '010 (though I haven't specifically tested that in a while). I'd be happy to have the '010 as the baseline, but I could be swayed if we think '000 support is useful enough.

0xTJ commented

Very fair wanting to keep CPU detection, it is nice to see that pop up.

Also, I had forgotten (even though I had it in front of me while writing) that the '000 doesn't use the format-code stack frame, so I absolutely agree with keeping the '010 as the least common denominator.

roscopeco commented

Cool, I think that's reasonable (and FWIW I just did a quick test and it doesn't currently work with a 68000, at least not the random one I just pulled out of a drawer - it hangs during the hardware probe).

I will have a look at switching to illegal instruction traps as you suggest; it's a much cleaner idea than the bus-error-side-effect thing (and we have plenty of room now) 👍

0xTJ commented

Looking at the instructions that exist, it looks like EXTB (among several other choices) could be used to rule out the '010, a CALLM/RETM pair (with a type $00 module descriptor to avoid any external state impact) to check for the '020, and MOVE16 to check for the '040 or above.

0xTJ commented

And for the '060 (the M68060 user manual has a table like the one in the PRM, but with the '060 added) MOVEP has been removed, so it can be used to rule out the '060.

There are some other options, but I like this set because they're all non-supervisor instructions, which feels cleaner, and they also don't depend on features that are removed in cost-reduced versions.
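The probe set proposed in the last two comments boils down to a small decision tree. The C sketch below shows only that classification step, assuming something else (for example a temporary trap handler) has already recorded whether each probe instruction executed or trapped. The type and function names are hypothetical, and treating "EXTB works, CALLM and MOVE16 trap" as a '030 is an inference by elimination rather than something stated above.

```c
#include <stdbool.h>

enum cpu_model { CPU_68010, CPU_68020, CPU_68030, CPU_68040, CPU_68060 };

/* true = the probe instruction executed, false = it trapped. */
struct probe_result {
    bool extb_ok;    /* EXTB:       traps on the '010 baseline */
    bool callm_ok;   /* CALLM/RETM: executes only on the '020 */
    bool move16_ok;  /* MOVE16:     executes on the '040 and above */
    bool movep_ok;   /* MOVEP:      removed on the '060 */
};

/* Classify with '010 as the lowest supported model, as agreed in the thread. */
static enum cpu_model classify_cpu(const struct probe_result *r)
{
    if (!r->extb_ok)
        return CPU_68010;
    if (r->callm_ok)
        return CPU_68020;
    if (!r->move16_ok)
        return CPU_68030;   /* by elimination: assumption, see lead-in */
    return r->movep_ok ? CPU_68040 : CPU_68060;
}
```

Ordering the checks this way means a later probe is only consulted when the earlier ones have already narrowed the range, so a real implementation could stop issuing probe instructions as soon as the model is known.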