dfloer/SC2k-docs

Can't extract sprites from LARGE.DAT using sprite docs

JPLeBreton opened this issue · 3 comments

Thank you for creating this resource. I'm in the very early stages of a project to create large screenshot map images from .SC2 city files and the original game data with a bit of Python code, and am having trouble with the "read sprite tiles from LARGE.DAT" step of that process.

Using the info from this page, I was able to unpack the master SC2000.DAT file into 399 smaller files, including LARGE.DAT. I'm reasonably confident that my code for extracting everything is correct; all the other data like TXT and VOC sounds is perfectly readable and doesn't seem to start at weird offsets or have junk data.

However, it's when I try to parse the tiles in LARGE.DAT using your info in the Sprite File Specification that I run into problems. When I scan the first 16 or so bytes of the header, the numbers I get don't make sense:

10 07 04 0F | 0C 02 6B 6B | 10 0B 04 0E
1808 3844 | 1802174988 | 2832 3588

According to the docs, this would mean that there are 1808 entries in the file, and that the first one starts at offset 1802174988 (far beyond the end of the file) and is 3588x2832 pixels in size. When I use big-endian order, many of these numbers are even bigger.
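For reference, the little-endian reading above can be reproduced with Python's `struct` module. This is only a sketch: the field layout (count, id, offset, then two 16-bit dimensions) is my reading of the Sprite File Specification, and the variable names are made up for illustration.

```python
import struct

# The 12 header bytes from my DOS LARGE.DAT, as listed above.
raw = bytes.fromhex("1007040F0C026B6B100B040E")

# Assumed layout: u16 count, u16 id, u32 offset, then two u16 dimensions.
# "<" selects little-endian with no padding; ">" would select big-endian.
little = struct.unpack("<HHIHH", raw)
big = struct.unpack(">HHIHH", raw)

print("little-endian:", little)  # (1808, 3844, 1802174988, 2832, 3588)
print("big-endian:   ", big)
print("offset fits in a 638,200-byte file:", little[2] < 638200)
```

Either way, the decoded offset cannot be a valid position in a 638,200-byte file, which is what made me doubt the data.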

This led me to believe that my LARGE.DAT data was different from the version from which these docs were written, or maybe even that my SC2000.DAT file is different or that I extracted the data from it incorrectly. Just in case, some info on my game data:

My LARGE.DAT is 638,200 bytes. My SC2000.DAT is from the DOS version of the game, 2,629,981 bytes, with MD5sum 2f982a1f4e6041203d8460a289193766 (though yours could easily vary because there is user-specific data stored in that file, as USER.DAT). Date stamp is 16 December 1994.

I also notice that I have no SMALLMED.DAT or SPECIAL.DAT, as mentioned in the docs, just LARGE.DAT, SMALL.DAT, and OTHER.DAT.
I do, however, have a .HED file that corresponds to each: LARGE.HED, SMALL.HED, and OTHER.HED.
However, the first several bytes of these .HED files are all 0xff and don't seem to fit any of the data as described.

Any idea what I could be doing wrong?

Thank you again. If providing this kind of support falls outside the scope of your intent for this project, feel free to close this issue.

I see that my docs are a bit lacking. They're written for the Windows 95 Special Edition release of the game, as that's the one I play and have been targeting my efforts towards, versus the DOS version. I've updated them to make this clear.

That said, I suspect the file structure is pretty similar. For the DOS version, you'll most likely need to parse the .HED file, which should contain the offsets you're looking for. I can't commit to looking at this anytime soon, but if you'd like help, I can probably provide some.

My first suggestion would be to open LARGE.HED in a hex editor and look for anything that resembles the header described in the docs, or for repeating patterns like the ones in LARGE.DAT.

The first 16 bytes of my file are 01 F5 04 E8 00 00 13 94 00 11 00 20 04 E9 00 00 which, read as big-endian, decode to 501 total entries; the first entry has id 1256, offset 5012, width 32 and height 17. Then the second entry starts, with id 1257 and the first half of its offset.
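As a rough illustration, those 16 bytes can be decoded with Python's `struct` module. The layout used here (a 16-bit count, then 10-byte entries of id, offset, height, width, all big-endian) is inferred from the values quoted above rather than copied from the spec, and the variable names are hypothetical.

```python
import struct

# The 16 bytes quoted above.
header = bytes.fromhex("01F504E8000013940011002004E90000")

# Assumed layout, inferred from the decoded values: big-endian u16 entry
# count, then 10-byte entries of u16 id, u32 offset, u16 height, u16 width.
(count,) = struct.unpack_from(">H", header, 0)
sprite_id, offset, height, width = struct.unpack_from(">HIHH", header, 2)

# The second entry begins at byte 12; only its id and the upper half of
# its offset fall inside these 16 bytes.
next_id, next_offset_hi = struct.unpack_from(">HH", header, 12)

print(count, sprite_id, offset, width, height)  # 501 1256 5012 32 17
print(next_id)                                  # 1257
```

One consistency check on this reading: 2 + 501 × 10 = 5012, which matches the first entry's offset, so the sprite data appears to start immediately after the entry table.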

Ah, this all makes sense now, thanks very much. It looks like the Windows 95 version of LARGE.DAT is freely and widely available via the SC2000 Win95 demo version, along with the color palette file. So I think I can do everything I need with that data and your documentation. Thanks again. If I ever delve into decoding the differences with the DOS data formats, I'll write it up and let you know.

If you need any tips on coding the renderer, I'd definitely like to help. The first thing I can suggest is: don't take a tile-based approach to rendering; do it based on the sprite.

What do I mean by this? Well, I started writing it based on tiles, where I'd split up the large sprites into one-tile-wide slices and then composite them in. Not only is that slower, it also doesn't handle objects that conceptually occupy only one tile but whose sprite is larger than one tile (there are only a few examples, notably the cargo ships and airplanes). I got a decent way through it and then hit that roadblock.
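To make the idea concrete, here is a minimal sketch of sprite-based compositing in plain Python. Everything in it is an assumption for illustration: the isometric projection, the 32x16 tile size, the bottom-edge anchoring, the back-to-front sort key and the function names are not taken from the game or the docs.

```python
def tile_to_screen(tx, ty, tile_w=32, tile_h=16):
    """Hypothetical isometric projection from tile to screen coordinates."""
    return (tx - ty) * (tile_w // 2), (tx + ty) * (tile_h // 2)


def blit(canvas, sprite, left, top):
    """Copy a whole sprite onto the canvas; None marks a transparent pixel."""
    for row, line in enumerate(sprite):
        for col, px in enumerate(line):
            y, x = top + row, left + col
            if px is not None and 0 <= y < len(canvas) and 0 <= x < len(canvas[0]):
                canvas[y][x] = px


def render(canvas, placements, origin_x, tile_h=16):
    """Draw (tx, ty, sprite) placements back to front, one whole sprite each."""
    for tx, ty, sprite in sorted(placements, key=lambda p: p[0] + p[1]):
        sx, sy = tile_to_screen(tx, ty, tile_h=tile_h)
        h, w = len(sprite), len(sprite[0])
        # Anchor the sprite's bottom edge to its tile. A sprite larger than
        # one tile simply overhangs its neighbours instead of being sliced.
        blit(canvas, sprite, origin_x + sx - w // 2, sy + tile_h - h)


canvas = [[0] * 40 for _ in range(40)]
tall_back = [[1, 1] for _ in range(20)]
tall_front = [[2, 2] for _ in range(20)]
# Deliberately listed front-first; render() sorts them back to front.
render(canvas, [(1, 1, tall_front), (0, 0, tall_back)], origin_x=20)
```

Because each sprite is drawn in one piece, an oversized sprite needs no special casing: it just covers whatever was drawn behind it.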