SUGGESTION: Improved (?) Support for (More) Modern Video Options
Closed this issue · 5 comments
Came here from Adrian's Digital Basement, specifically the episode on the second channel discussing an update on his hardcards and on the XTRAMTEST utility. Whilst watching, I had a couple ideas that I thought merited mentioning, in the hopes of eventually serving to improve XTRAMTEST usability. Forgive me, please, a few paragraphs of background. I'm naturally long-winded, but I do my best...
In the video, Adrian provides a quite accessible, but also quite technical rundown of how VGA cards interface with BIOS-based legacy/"retro" PC/XT/AT clones and compatibles. (As an aside, significant credit to him for this, Adrian has a truly remarkable talent for such explanations, and it is very much a non-trivial task.) The key thing from this explanation is that the PC requires the stack pointer to function, and that requires at least the base 16K of RAM. There are also two additional key facts about those cards that are important here, and it's kind of curious that Adrian omitted this information from his explanation; however, he may not have realized their relevance at the time.
One fact is that all standard PC VGA cards (and almost all much later cards, up to modern "gaming" and "workstation" cards that are involved with modern, "bleeding edge" current-tech systems with UEFI, long past anything XTRAMTEST will ever be used with) support a set of very basic, very standard video output modes, with a relatively standardized interface (comparatively speaking) for dealing with them -- the so-called VESA graphics modes.
The other fact, which is somewhat more crucial, is that the initial boot screen of all BIOS-based PCs (UEFI introduced some... rather funky stuff, but, again, we need not worry about that) is NOT in any sort of VGA mode, at least, not technically. It's actually EGA. At worst, the video card is emulating EGA via VGA hardware; however, the important fact is, at that point in time, it is still acting as if it were an EGA card. That legacy mode, by the way, is conveniently one of the VESA modes :3 and this is extremely handy.
One final piece of useful information, which does not directly concern video cards, but rather the PC/XT/AT system proper, but is likewise of value. Those systems were all meant to run MS-DOS or some close variant thereof, and MS-DOS (along with all extant variants that are compatible with the sorts of systems we care about) accesses hardware directly -- unlike certain later operating systems, which require a virtualization layer to turn tricks which salesmen get quite giddy about ;3 but that all too often really mostly result in additional points of failure.
That reliance on direct hardware access, in such older systems, means that all of those systems have certain hardware at certain known addresses. For example, the "beep speaker", which is such a ubiquitous piece of computer hardware that, to this day, many if not most people of at least mild technical inclination will instantly recognize a 2-1/4in diameter round speaker component rated for eight ohms' voice coil impedance and a quarter-watt maximum output as a "PC speaker".
All of this, of course, is preaching to the choir, at the least ;3 but there's often a surprising amount of value in pointing out the obvious.
So, finally, the suggestion. Suppose that XTRAMTEST were to have some sort of ability to test for a video interface that wasn't MDA/CGA (presumably, trigger on failure to relocate the stack pointer to a useful portion of VRAM in the expected area for such cards), and that upon being triggered, this would produce some sort of audio cue through the PC Speaker, and then run a set of tests on the first 16K of RAM that were at least "good enough", for the sake of expediency, to determine whether that area of memory was stable enough to attempt further progress. Upon failure, some sort of alarm tone through the speaker. Upon success, attempt to bring up the video interface in legacy EGA mode, just like the PC would if attempting to boot with a vaguely modern BIOS interface -- and which, it's safe to assume, most VGA cards even of the relevant era would be compatible with. If that failed, attempt bringing up the card by jumping to $C800 and beginning VGA BIOS execution; if succeeded, prompt for optional VGA BIOS execution with a timeout and then continue on.
Of course, all of this assumes there's space on the ROM for this. There might not be. I'm not a programmer -- I'm an artist, a budding writer, an electronics hobbyist, a retrotech nerd, a PC nerd, and many other things :3 but I am NOT a programmer, by nature.
Hope this helps :3 and hey, thanks for doing something really cool for the retrotech community and making this a freely accessible tool for anyone who might need it and can afford a ROM programmer.
What @starhawk64 is suggesting is actually how Landmark and other full-system diag roms and diag cards of the 80s actually worked. They tested the first 16k and aborted with beep codes if it wasn't reliable, or put the stack at 0:3FFE if it passed and then kept going.
However, the point of xtramtest is to test all of system memory using march-u, which is a "whole ram" test, so there can't be exceptions, and this strategy of walling off the first 16K won't work (a banking problem could mirror the first 16k somewhere higher, and then you'd trash the stack at 16k when you tested at 80k or something). So, the use of video memory is a requirement for this specific use case.
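To make the mirroring scenario concrete, here is a minimal sketch in Python (not XTRAMTEST code). It assumes a hypothetical fault in which one address line never reaches the RAM and so behaves as if it were always 0. The function name and the choice of A16 are illustrative only.

```python
def aliased_address(addr: int, dead_line: int) -> int:
    """Physical address actually selected when one address line is ignored.

    Hypothetical fault model: address line `dead_line` never reaches the RAM,
    so it behaves as if it were always 0.
    """
    return addr & ~(1 << dead_line)


STACK_TOP = 0x3FFE   # stack placed at 0:3FFE, just under 16K
TEST_ADDR = 0x13FFE  # a location just under 80K that the RAM test writes

# With A16 dead, the write just under 80K lands on the stack just under 16K.
print(hex(aliased_address(TEST_ADDR, 16)))
```

Under this fault model, any test pass over the 64K-128K region silently overwrites the walled-off region below 64K, which is why the stack cannot safely live there.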
That said, xtramtest should be updated to attempt support of VGA cards. Many (most?) VGA cards won't work until they are set up by the card's BIOS at init time, but as Adrian mentioned in the video, xtramtest will be executed first and those cards' BIOSes won't have the opportunity to init the modes.
To solve this chicken-and-egg problem, we can borrow the behavior of old diag ROMs to init the card:
- Perform a reliability test on the first 2k (doesn't have to be full march-u). Play error beeps if that fails, otherwise continue:
- Set the stack at 0:7FE. CALL C000:3 to initialize the video BIOS (99% of them are located at C000). Execute INT 10h to set 80x25 color.
- Now you have your video mode with RAM at B800:0, and you can continue with normal function of xtramtest with the "CGA card installed" code path.
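The segment:offset pairs in the steps above map to 20-bit physical addresses as follows. This is a small Python sketch of the standard 8086 real-mode calculation. The helper name `physical` is made up for illustration.

```python
def physical(segment: int, offset: int) -> int:
    """8086 real-mode address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


for name, seg, off in [
    ("stack top", 0x0000, 0x07FE),
    ("video BIOS init entry", 0xC000, 0x0003),
    ("color text RAM", 0xB800, 0x0000),
]:
    print(f"{name}: {seg:04X}:{off:04X} -> {physical(seg, off):05X}h")
```

The stack at 0:7FE sits at the top of the lowest 2K, which is why only that much RAM needs to be trusted before calling the video BIOS.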
Hope this helps. If someone implements this, I wouldn't mind a tiny credit in the source code ;-)
@MobyGamer I wasn't actually thinking of "walling off" the first 16k. I apologize if it sounded that way. I'm autistic (Asperger's), and unintentional ambiguous context is one of my biggest enemies, and simultaneously one of the hardest to dodge, let alone eliminate!
The idea was, test the first 16K, then bring up the system as specified. Once the VGA card is initialized, "wall off" a portion of VRAM and force the VGA stack pointer to run in the card's own memory; I'd think the VGA BIOS would contain the necessary addressing info within, somewhere! Then march merrily along, testing the whole system.
Alternately, run enough of a check on the entire system RAM to ensure enough sanity that the stack pointer "probably" won't clobber later on, then proceed with the first 16k "walled off". But it would be better to force the VGA card to use its own internal memory once initialized.
I wonder... since there are those VESA modes, and at least enough standardization that 99+% of cards support legacy EGA mode via that... would it be possible to init a VGA card as if it were EGA and then use its internal VRAM? Kind of a universal driver kind of a thing.
@starhawk64 You can't consider VESA modes, because a lot of the time people will be testing with ISA cards made before 1994 which have no VESA in BIOS. My proposed solution will work with all VGA cards.
Thank you for the discussion.
As I see it, the objective of the proposals is to allow testing when the user only has EGA/VGA or other more modern video card, and to do it as reliably and in as many situations as possible.
There are two primary challenges to this, as I understand the situation.
Code space by the way is probably no issue here... even keeping to an 8K rom there is over 5K free at the moment.
Specific benefits and constraints of the March algorithm
First, I want to explain why we are using the March algorithm, and what it needs to do its job, because it's the requirements of March that we have to meet for XTRAMTEST to be relevant.
Our enthusiasm for March has nothing to do with my ego. I did not invent March or any of its variants, nor did I do any of the academic work to prove its benefits. Adrian came to me at one point asking for help writing a better RAM test for the Z80 and in particular the TRS-80 models 1 and 3, and I found March and implemented it. Since then I've also written 6502 and now 8086/8088 versions.
The reason Adrian keeps talking about the March algorithm, and why it's so much better at catching common RAM problems that other (especially older) RAM tests miss, is that it takes into account the internal structure of RAM chips, specifically the way address decoding logic in the chips can fail. Simple bit pattern tests, even the so-called "walking ones" tests, do not.
("Walking ones" tests, incidentally, have absolutely nothing to do with the March algorithm, despite the similarity in the name. The former test "walks" a single 1 bit around inside the byte cell, while the March algorithm "marches" up and down the entire memory bank looking for memory corruption.)
Adrian has explained this in a few different videos, but it's inherently complex and unintuitive, so I don't blame people for not understanding.
A very common (possibly the very most common) failure mode for RAM chips is what is generally called page faults. There are many kinds, and the academic literature on March testing explains them all, but the short version is that writes to (or even reads from) one location in a RAM chip can cause corruption to one or more other locations in the same chip. The different locations may be logically related or may not, and both whether and where it happens may depend on the actual values read or written.
(An aside: in most RAM chips of the XT era, which are 1 bit wide, the value of the byte read or written is not relevant... it's the value of the bit at each location. With 1-bit wide RAM, you're testing one bit of RAM at a time, but in 8 different RAM chips at the same time. So all of these AAh and 55h bit patterns being written over and over to a single location without checking other locations in RAM are nearly pointless. What March does differently is write to and read from RAM while "marching" around the address space in a pattern that is designed to catch all of the known kinds of page errors that don't depend on more than one sequence of reads/writes to trigger the fault. Fortunately, faults that require multiple sequences are rare "enough".)
For a March test to do its job, it must have uninterrupted access to all addresses in a bank of RAM. It must be able to "march" from the very bottom of each bank of RAM to the very top, and back down again, destroying the contents and without anything else touching the RAM chip(s) during its test. In the XT, where we may not know the exact size of each RAM bank, this means we must march from physical address zero all the way to the top of RAM, and back down again.
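The difference between a per-location pattern test and a March test can be shown on a simulated RAM. The Python sketch below is not the XTRAMTEST implementation: it uses March C-, a well-known published variant, rather than the March-U variant mentioned earlier in this thread. The fault model is a hypothetical one in which one address line is ignored. All class and function names are invented for illustration.

```python
class AliasedRAM:
    """Simulated 1-bit-wide RAM whose address line `dead_line` is ignored."""

    def __init__(self, size, dead_line=None):
        self.cells = [0] * size
        self.mask = ~(1 << dead_line) if dead_line is not None else -1

    def write(self, addr, bit):
        self.cells[addr & self.mask] = bit

    def read(self, addr):
        return self.cells[addr & self.mask]


def pattern_test(ram, size):
    """Write-then-read-back at each address, never looking at other cells."""
    for addr in range(size):
        for bit in (1, 0):
            ram.write(addr, bit)
            if ram.read(addr) != bit:
                return addr
    return None


def march_c_minus(ram, size):
    """March C-: return the first failing address, or None if every read matches."""
    up, down = range(size), range(size - 1, -1, -1)
    for addr in up:                      # element 0: write 0 everywhere
        ram.write(addr, 0)
    # Elements 1-4: (address order, value expected on read, value then written)
    for order, expect, new in ((up, 0, 1), (up, 1, 0), (down, 0, 1), (down, 1, 0)):
        for addr in order:
            if ram.read(addr) != expect:
                return addr
            ram.write(addr, new)
    for addr in up:                      # element 5: final read of 0 everywhere
        if ram.read(addr) != 0:
            return addr
    return None


print(pattern_test(AliasedRAM(16, dead_line=2), 16))
print(march_c_minus(AliasedRAM(16, dead_line=2), 16))
```

The pattern test passes because each write and its read-back hit the same aliased cell. The March pass fails as soon as it reaches an address whose alias it has already rewritten. The failure is only visible because the whole range is traversed without gaps.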
If we can't include the addresses at the very start of RAM, the March algorithm can't prove the RAM chip is OK. It can neither find errors where reading/writing to those first addresses causes corruption there or elsewhere, nor that reading/writing elsewhere causes corruption at the very start of RAM. Unfortunately, these two are very common failure modes, and a couple of the most important ones for the RAM test to find and report on, so I consider it a non-starter to take any approach that doesn't test all of conventional RAM, from address zero up to as much (conventional) RAM as the machine contains.
The Ganssle test
Unlike earlier March-based RAM tests I've written, I also added a bit-pattern test to the XTRAMTEST code. Very briefly, the intent of the test is to act on comments made by Jack Ganssle (not to me, but to the industry in general) about how marginal RAM chips can sometimes fail only under certain challenging sequences of accesses, specifically when address and/or data lines rapidly switch between high and low. So when XTRAMTEST runs the bit pattern test I've called the "Ganssle" test, it's not so much testing whether an AAh, 55h, FFh, or 00h causes failure at a particular address, but whether switching rapidly back and forth between complementary address and/or data on the bus will cause detectable failures in weak or intermittent chips.
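As a rough illustration of "complementary address and/or data", the Python sketch below generates an access sequence in which each access is immediately followed by one at the bitwise-complement address carrying the complement data byte. This is only an assumed shape for such a sequence. The actual access order used by XTRAMTEST is not described here, and `complement_pairs` is a made-up name.

```python
def complement_pairs(addr_bits, count):
    """Yield (address, data) accesses in pairs.

    Hypothetical sequence: within each pair, every address line and every
    data line changes state between the first access and the second.
    """
    addr_mask = (1 << addr_bits) - 1
    for addr in range(count):
        data = 0xAA if addr % 2 == 0 else 0x55
        yield addr, data
        yield addr ^ addr_mask, data ^ 0xFF


for addr, data in complement_pairs(16, 2):
    print(f"{addr:04X}h <- {data:02X}h")
```

The point of such a sequence is the transitions on the bus between consecutive accesses, not the particular values stored.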
Boot time video mode
@starhawk64 mentioned EGA and video boot modes. I don't know if what you say is accurate or not, but I have skepticism. Conventional fixed-frequency VGA monitors simply cannot sync to EGA frequencies. That doesn't necessarily mean you can't have 350-line modes on VGA monitors (I suspect you can), but those are not running at EGA timings.
But for the purposes of XTRAMTEST, I don't think it matters: all we care about is if we can get control of the machine with the video hardware initialized to a mode where we can draw text just by writing to the video memory. If we can, the same code can draw text in MDA, CGA, EGA, and VGA.
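For readers unfamiliar with text-mode memory, here is a small Python sketch of the standard 80x25 layout, where each screen cell is a character byte followed by an attribute byte. A `bytearray` stands in for the memory at B000:0 or B800:0, and `put_text` is a hypothetical helper, not an XTRAMTEST routine.

```python
COLS, ROWS = 80, 25


def put_text(vram: bytearray, row: int, col: int, text: str, attr: int = 0x07) -> None:
    """Store `text` into an 80x25 text buffer as character/attribute byte pairs."""
    offset = (row * COLS + col) * 2
    for ch in text.encode("ascii"):
        vram[offset] = ch          # even byte: character code
        vram[offset + 1] = attr    # odd byte: attribute (07h = normal video)
        offset += 2


# Stand-in for the 4000 bytes of visible text memory.
vram = bytearray(COLS * ROWS * 2)
put_text(vram, 1, 0, "RAM OK")
```

Because the cell layout is the same across these adapters in text mode, only the base segment differs between the monochrome and color cases.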
Problem 1: EGA/VGA hardware initialization
So on to trying to make the test run reliably on XT machines with EGA/VGA video.
The first issue is the cards themselves, or more specifically how to initialize the cards so they show a text screen. While EGA and VGA cards (and "100% compatibles") relevant to an XT or clone are standardized and register-level compatible with each other (meaning that you can write to the display by going direct to the hardware and not using BIOS routines), the initialization of these cards, even by software that doesn't use the BIOS for drawing text or graphics, is pretty much always handled by the BIOS. The very concept of "mode numbers" comes from the number you give the BIOS when asking to switch modes. The VESA BIOS Extensions were "extensions" in the sense that they standardized modes beyond what IBM defined... the "Super" in Super VGA.
Different cards from different (or even the same) manufacturer could use different underlying hardware registers to set up the graphics and text modes. Once configured for a mode, however, the software just wrote directly to video RAM (or in some cases, standardized registers for things like palettes, soft character sets, etc). But to my understanding, the setup of the clocks, RAMDACs, etc. could be very different and not documented to the public. The VESA BIOSes provided with VGA cards in particular were specific to each type of card and would set up modes as requested by the BIOS calls.
Moreover, even before anything requests a mode change, the card's hardware must be initialized, and this is again specific to the card and undocumented. There is more than just setting frequencies and resolutions. For instance, EGA and VGA cards which can be connected to monochrome monitors (either MDA or mono VGA) actually remap where their video memory appears in the address space of the machine. To my knowledge, the way this is done is specific to each card, and the code to do it is part of the BIOS. It's not just a table somewhere that we could look into the BIOS either, but code.
Said again, more succinctly: When DOS programs "write directly to the video card" this doesn't include setting the video mode. My understanding is that's always done by the BIOS for DOS-mode programs, even on the most direct-to-hardware games. I know of no available code that can initialize or set screen modes on any (or even most) EGA or VGA cards without using the BIOS, and to my understanding, such code cannot exist. I'd love to find out otherwise, but I'm not expecting to.
Problem 2: the BIOS Data Area
So, why not just tell the BIOS to initialize?
Adrian said that the problem is the stack. That's sort of true, but even if we could find somewhere else for the stack, there's another more serious problem. The defined way for a video BIOS to store its configuration is in the BIOS Data Area (BDA), starting at location 0400h (1KB from the very bottom of RAM). Also, the standard and only reliable way of calling the BIOS is to call its boot-time initialization routine (located at offset 03h from the start of the video BIOS), which itself sets up the interrupt vectors for INT 10h etc. in the lowest 1K of RAM. After that, the way you call into the video BIOS is to issue software interrupts, which use these vectors to find the entry points into the video BIOS.
(Just as a clarification, interrupt vectors are used for two things in the PC architecture. Some of the vectors are called when there is a hardware interrupt, like a hard drive controller card requesting attention. Others are triggered by software. This second form is how DOS and BIOS functions are called. This is done so that software doesn't have to know the exact location of the DOS, BIOS, or driver routines that need to run to service their requests.)
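The vector mechanism can be sketched in Python against a simulated lowest 2K of RAM. Each vector occupies 4 bytes at address `number * 4`, stored as a little-endian offset word followed by a segment word. The entry point C000:0123 below is a made-up value for illustration.

```python
import struct


def set_vector(mem: bytearray, number: int, segment: int, offset: int) -> None:
    """Store a far pointer in the interrupt vector table (4 bytes per vector)."""
    struct.pack_into("<HH", mem, number * 4, offset, segment)


def get_vector(mem: bytearray, number: int):
    """Return (segment, offset) for interrupt `number`."""
    offset, segment = struct.unpack_from("<HH", mem, number * 4)
    return segment, offset


low_ram = bytearray(0x800)                 # simulated lowest 2K of RAM
set_vector(low_ram, 0x10, 0xC000, 0x0123)  # hypothetical video BIOS entry point
print(get_vector(low_ram, 0x10))
```

The INT 10h vector lives at 0040h-0043h, inside the first 1K, so a video BIOS that hooks it necessarily writes to the very bottom of RAM.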
Another concern I have is that the video BIOS can assume that the standard system BIOS is available, and can, I believe, call on its services. I don't know what services the video BIOS might need, but it would at least assume that the BDA is set up and usable.
Further on that line of thought, I've found no references to what environment a video BIOS or option ROM author can assume is present and available from the system BIOS at initialization time. I have run some experiments: on MDA/CGA, and possibly on EGA/VGA if your option ROM is initialized after the video BIOS, you can make INT 10h calls to write to the screen, but you cannot reliably read from the keyboard using BIOS calls. To do that, you have to wait until INT 19h warm boot time, which of course assumes the presence of a complete system BIOS.
These are the reasons why we can't avoid using the lowest portion of RAM when initializing EGA/VGA cards. In reality it is only the lowest 2K that is really needed, but we generally talk about 16K because that is the smallest bank size seen on an XT-class machine.
The current solution: option ROM mode
The approach I've taken so far, and the one that Adrian was announcing as the significant upgrade in the recent video, is for the XTRAMTEST itself to become an option ROM just like the video BIOS. Here's what this means:
During boot, after setting up the BDA but not much else, the system BIOS searches the ROM area for option ROMs by looking for a signature and a checksum. For each one it finds, from low addresses to high, it calls their initialization routine. This is the time when the video BIOS sets up interrupt vectors and so on. When the XTRAMTEST init routine is run, it installs an INT 19h warm boot vector that gives it a chance to "boot" the machine, just like XT-IDE and just like, say, a network card could network boot the machine.
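The signature-and-checksum check can be sketched in Python. The layout assumed here is the conventional one: bytes 55h AAh at offset 0, the length in 512-byte blocks at offset 2, the init entry point at offset 3, and all bytes of the image summing to zero modulo 256. The helper names are invented, and the one-byte body (a far return, CBh) is just a placeholder.

```python
def make_option_rom(body: bytes, blocks: int = 1) -> bytes:
    """Build a ROM image: 55h AAh, length in 512-byte blocks, code, checksum."""
    size = blocks * 512
    if len(body) > size - 4:
        raise ValueError("body does not fit in the requested number of blocks")
    image = bytearray(size)
    image[0:2] = b"\x55\xAA"
    image[2] = blocks
    image[3:3 + len(body)] = body        # init entry point is at offset 3
    image[-1] = (-sum(image)) & 0xFF     # make all bytes sum to 0 mod 256
    return bytes(image)


def is_option_rom(image: bytes) -> bool:
    """Check signature and checksum roughly the way a system BIOS scan would."""
    if len(image) < 3 or image[0:2] != b"\x55\xAA":
        return False
    size = image[2] * 512
    return 0 < size <= len(image) and sum(image[:size]) % 256 == 0


rom = make_option_rom(b"\xCB")           # placeholder init routine: far return
print(is_option_rom(rom))
```

A real scan also has to step through the ROM address range at fixed increments and far-call offset 3 of each valid image, which is omitted here.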
By doing this, we ensure that an EGA or VGA card can initialize itself using its own BIOS routines. Once that has happened, whatever mode it chooses (color or monochrome, with its video RAM located at either 0B0000h or 0B8000h), the XTRAMTEST can find the display RAM, write to the screen, and borrow the 96 off-screen bytes of RAM it needs.
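The 96-byte figure is consistent with simple arithmetic on an 80x25 text screen, assuming the text buffer occupies one 4K page. That page size is an assumption made here to explain the number. It is not stated in the thread.

```python
COLS, ROWS, BYTES_PER_CELL = 80, 25, 2
PAGE_SIZE = 4096                      # assumed: one 4K page of text memory

visible = COLS * ROWS * BYTES_PER_CELL
spare = PAGE_SIZE - visible
print(visible, spare)
```

Under that assumption, the bytes past the last visible cell never appear on screen, so they can hold working state without disturbing the display.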
This solves the problem of running on EGA/VGA cards. It does not require "walling off" the first 2K or 16K of RAM, because after the XTRAMTEST starts, we don't need to preserve anything in the BDA or for that matter, interrupt vectors (we disable hardware interrupts and don't make any software interrupt calls).
But we do need to get to that point, which points out the only real disadvantage I'm currently aware of to the way I'm doing things at the moment. We rely on the BIOS to call us, but we don't want the BIOS to do its own RAM testing, and in particular, we don't want it to refuse to boot the machine if it detects RAM errors (which I believe all of them currently do).
So at the moment, you have to have a BIOS that will allow you to at least try to boot to DOS, either because it doesn't detect RAM errors, or because it lets you try to boot anyway.
Solving the "last" problem more generally
What would be nice would be if we could get the BIOS to initialize the video BIOS so long as the first 2K of RAM works well enough to get us that far, even if there are other RAM errors in the same or higher banks.
There are two or three possible ways to approach this. One would be to extend the XTRAMTEST ROM just enough that it can be a system BIOS for the purposes of calling the video BIOS to initialize the EGA/VGA card, and then jumping to the full RAM test. This may be easy enough for video cards, but if there are any other option ROMs (especially for instance XT-IDE or the super floppy BIOS), they may require a more complete set of BIOS services to operate.
I don't want this RAM test ROM to have to expand into a whole complete BIOS. More to the point, I can't... I just can't dedicate that amount of time (if someone else does, they're encouraged to fork this code and run with it). Even if a minimal set of services can be found, declaring it a sound solution would mean a lot of testing using a lot of different video cards. Adrian can only do so much, and while he's got quite a few EGA and VGA cards, there are plenty of common ones he doesn't have.
A second approach may be to specify that the XTRAMTEST should be loaded as an option ROM immediately above the video BIOS (or at least, there shouldn't be another option ROM like the XT-IDE between them). If this is the case, and the XTRAMTEST can determine that the video BIOS is initialized, maybe by checking that the INT 10h vector is initialized to be somewhere other than into the 0F0000h+ system BIOS area, then perhaps XTRAMTEST can assume that there is a video BIOS that has already been initialized and it can take over and run its RAM tests. Again, since the keyboard routines are not necessarily available, it gets tricky to have this be a boot-time option. Just blithely accessing the keyboard hardware directly is probably unwise, as I can tell (when running under the MAME debugger) that the keyboard has been at least partially initialized, and anything we do there must not prevent the keyboard from working properly when the user does not choose to run the XTRAMTEST at boot time. This has promise and simplicity, but it may be difficult in the general case to tell users how to get the ROM loaded at the right address, which could make it hard for a lot of potential users to use it.
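The vector check described above might look like the following Python sketch, run against a simulated copy of low RAM. It is only a heuristic under the stated assumption that a system BIOS handler lives at or above 0F0000h. The function name and the example vectors are illustrative.

```python
import struct

SYSTEM_BIOS_START = 0xF0000


def int10_points_outside_system_bios(low_ram: bytes) -> bool:
    """True if the INT 10h vector targets an address below the system BIOS area."""
    offset, segment = struct.unpack_from("<HH", low_ram, 0x10 * 4)
    physical = ((segment << 4) + offset) & 0xFFFFF
    return physical < SYSTEM_BIOS_START


def with_int10(segment: int, offset: int) -> bytes:
    """Build a 1K vector table image with only INT 10h filled in (test helper)."""
    table = bytearray(0x400)
    struct.pack_into("<HH", table, 0x10 * 4, offset, segment)
    return bytes(table)


print(int10_points_outside_system_bios(with_int10(0xC000, 0x0123)))
print(int10_points_outside_system_bios(with_int10(0xF000, 0xF065)))
```

One caveat: a vector that was never written (0000:0000, or garbage from faulty RAM) also falls below the system BIOS area, so a real check would probably need to confirm the target lies inside a validated option ROM.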
A third approach could be to extend a full system BIOS with XTRAMTEST capabilities in some way. There already exist several available open-source BIOS projects. I'm aware of at least three: GlaBIOS, Super XT/Turbo PC, and Sergey Kiselev's BIOS (by the creator of the XT-IDE and super floppy BIOS, and very interesting but possibly not general enough for just any PC/XT/Clone board).
We can't just add XTRAMTEST code to a BIOS, at least for most users. For many (most? all?) XT-class boards, the system BIOS is an 8KB part, and all XT BIOSes are already at or near 8KB in size. Many boards support socketing in a ROM chip where the ROM BASIC would go, and get the XTRAMTEST at a known address, but not all do.
If we start with one (or more) of the existing open BIOS projects, we could either minimally modify the system BIOS to work in a way more suitable to our needs (just get over the line of initializing the video with minimal RAM testing and then hand control to XTRAMTEST) or more extensively modify the BIOS to integrate the March test in one form or another.
Either of these approaches may work. I should point out that there are wrinkles. First, as far as the BIOS is concerned, all option ROMs are equivalent, whether video, XT-IDE, or XTRAMTEST. We may be able to assume that the video BIOS is located at 0C0000h, and try only initializing that before giving the XTRAMTEST its chance. But these are implementation details.
Conclusion: don't reinvent the wheel, but don't give up either
My conclusion is that we shouldn't try to reinvent the way the boot process works. It would be worthwhile, however, to see if there is a way to tweak one or more of the available BIOSes or see if there is a creative way to install the option ROM to allow a more reliable way to get the XTRAMTEST running on machines with RAM problems and EGA/VGA cards.
While doing this, I do want to keep an eye on what will work for AT architectures too, because I do want at some point to make an ATRAMTEST ROM. While XTRAMTEST will run on the AT, the March test has to know the actual width of RAM (8-bit on XT, 16-bit on AT and 386sx). So the march algorithm will have to be changed to be properly effective on AT-class machines, and it would be nice to find a solution to the EGA/VGA problem that will work on AT machines. I'm not aware of any open-source AT-class BIOS, which will restrict our options there.
One comment @MobyGamer made does make sense: performing a March test on just the lowest 2K or 16K of RAM during option ROM init time could be useful. It would not prove that the first bank of RAM works, but it would prove that the bottom of RAM is usable so long as nothing higher is used. My worry is that if XTRAMTEST runs before the video BIOS init, we can't display errors on the screen (or, in good conscience, allow the BIOS to continue running initialization including the video BIOS). But if the XTRAMTEST runs after the video BIOS is initialized, it will destroy the BIOS data area and stack, and it will not be safe (or, really, possible) to return to the BIOS and allow it to run. Perhaps in that case XTRAMTEST could temporarily copy the BDA and stack into video RAM, but that's starting to get complex and could lead to reliability problems.
One of the nice side benefits to running as an option ROM is allowing it to stay installed even when you just want to boot and use the machine. It would be nice not to paint ourselves into a corner where that ability is lost.
I'd like to move this to the "Discussion" area, since it's not an "issue" per se. It could lead to creating an "issue" for an enhancement in the future.