Palm-Studios/sh3redux

pic.arc weird texture headers

Quaker762 opened this issue · 29 comments

So I was trying to load a few textures to bind onto a quad when I noticed I was getting strange reads from sh3_texture. As it turns out, pic.arc contains a weird header (which I've taken to calling '40 bytes of bullshit') before the actual header, as seen below.

[Image: Bullshit]

It would seem not all textures have this, and the only way to check would be to dump every texture file from pic.arc, which is possible.

How do you suggest we tackle this? I can always e-mail Mike or some other SH3 modders if we get too stuck. I thought it may have been the elusive colour palette for 8-bit textures, but it seems it's the same 40-bytes, even for 24-bit textures...

Some of the textures have this extra header, but it appears to be random (there are files with the same size and color depth both with and without the header).
The only way to be sure is to put the same file back into the arc after removing the header and see whether the original game treats it differently.

@piratesephiroth

I just tried this by removing the header from data/pic/sy/sys_title.tex, which is the main menu's SILENT HILL 3 background. This was the result....

[Image: wew]

It also caused all files stored after data/pic/sy/sys_title.tex to not draw, but I think that is because of the difference in arc offsets. Still, this is really strange, I'll have to investigate this. I have no clue as to how they work out which textures have this extra header. Perhaps it's for static background images? (Meaning we'll need ANOTHER class for that as well....)

Also, here's what happened when I removed the header from sys_warning.tex (which comes AFTER the title in pic.arc, explaining why that still draws correctly)

http://sendvid.com/xre0amt9

EDIT: I just did a quick count, out of 301 .tex files in pic.arc, 252 contain the A7A7A7A7 header.

z33ky commented

Even if I replace the whole header with `FF` it still works fine. Perhaps this header was used in another game on the same engine, or just for the PS2 version?
However if I modify the arc.arc to let sys_title_tv.tex (that is the title image for the demo) point to a file that does not have the additional header (I used tpGB.tex, section 7, index 7 in the demo) the game crashes for me. That would suggest some hard-coded shenanigans, unless tpGB.tex is just broken anyways. But it seems to be the only texture without that A7 header in the demo.

To make things work for now we could just check for the presence of the A7A7A7A7 byte sequence and, if it's there, skip the 40 bytes.
We could dump all images as bitmaps, sorted into different directories depending on whether they have this additional header, and check if there is a pattern. Perhaps your assumption that it's for background images is correct.
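A minimal sketch of that stop-gap check, assuming the A7A7A7A7 marker sits in the first four bytes of the .tex data (the thread does not state the exact position). The names `has_extra_header` and `real_header_offset` are made up for illustration and are not part of sh3redux; the number of bytes to skip is left to the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Marker reported in the thread. ASSUMPTION: it occupies bytes 0..3 of the file.
constexpr std::uint8_t kExtraHeaderMagic[4] = {0xA7, 0xA7, 0xA7, 0xA7};

// Returns true when the buffer starts with the A7A7A7A7 marker.
bool has_extra_header(const std::vector<std::uint8_t>& tex)
{
    if (tex.size() < 4)
        return false;
    for (std::size_t i = 0; i < 4; ++i)
    {
        if (tex[i] != kExtraHeaderMagic[i])
            return false;
    }
    return true;
}

// Offset at which the real texture header is expected to start:
// skip_bytes when the marker is present, 0 otherwise.
// The caller still has to check that the offset lies inside the buffer.
std::size_t real_header_offset(const std::vector<std::uint8_t>& tex,
                               std::size_t skip_bytes)
{
    return has_extra_header(tex) ? skip_bytes : 0;
}
```

The same `has_extra_header` predicate could also decide which directory a dumped bitmap goes into when sorting the images to look for a pattern.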

> Even if I replace the whole header with `FF` it still works fine.

Ahh, I didn't realise I was upsetting the offsets in arc.arc... You're right, it still draws correctly.
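To illustrate why stripping bytes from one file broke everything stored after it, here is a small sketch under a hypothetical layout: entries packed back to back and described by an offset/size table. `ArcEntry` and `shrink_entry` are invented for this example and do not reflect the real arc.arc format.

```cpp
#include <cstddef>
#include <vector>

// HYPOTHETICAL table entry: where a file starts in the archive and how long it is.
struct ArcEntry
{
    std::size_t offset;
    std::size_t size;
};

// Removes `removed` bytes from entry `index` and shifts every later entry
// so that the table still matches the (now shorter) archive data.
// Returns false if the request does not make sense.
bool shrink_entry(std::vector<ArcEntry>& entries, std::size_t index,
                  std::size_t removed)
{
    if (index >= entries.size() || removed > entries[index].size)
        return false;

    entries[index].size -= removed;
    for (std::size_t i = index + 1; i < entries.size(); ++i)
        entries[i].offset -= removed;  // skipping this step is what breaks later files
    return true;
}
```

Overwriting the header bytes in place (for example with `FF`) avoids the problem entirely because no sizes or offsets change.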

> That would suggest some hard-coded shenanigans

I reckon it's this. You're probably right in saying it's something for the PS2 version. The dev team for the PC version was separate from the PS2 one, so they probably got a whole lot of assets dumped on them and told to make it work.

> Just to make things work for now we could just check for the presence of the A7A7A7A7 byte sequence and if it's there, skip the 40 bytes.

Yeah, I just put this in. I'm redoing the texture class (again), because I managed to find some code where someone had actually worked out how to decode 16- and 8-bit textures. Turns out the palette comes at the END of the file and is stored in these weird colour blocks. Super bizarre... Oh, and some of them are compressed and have a weird form of 'encryption'. They must have REALLY not wanted people to try and mod the game.

> We could dump all images as bitmaps, sorted in different directories depending on if they have this additional header

I'll get on this. I'll write a few tools to dump .arc stuff to actual files (so we don't have to clutter up main.cpp) over the next few days.

> Perhaps your assumption that it's for background images is correct.

Perhaps. I noticed that it seems to exist on all the textures where the image is drawn to the entire screen. Also, it would seem embedded textures (in the .mdl files) don't contain the header.

Either way, it seems bizarre, as DWORD 2 contains the header size (0x40). I'll just do a check for it and set an offset to the real header + data
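A sketch of reading the declared size instead of hard-coding it. The thread speaks of 40 bytes while the size field is quoted as 0x40, so this example does not pick either value and simply trusts whatever the field says. Assumptions, none confirmed by the thread: the marker is the first DWORD, values are little-endian, and the caller knows the byte offset of "DWORD 2" (passed in as `size_field_offset`). Both function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a little-endian 32-bit value; returns 0 if it would run past the buffer.
std::uint32_t read_u32_le(const std::vector<std::uint8_t>& buf, std::size_t offset)
{
    if (buf.size() < 4 || offset > buf.size() - 4)
        return 0;
    return static_cast<std::uint32_t>(buf[offset])
         | (static_cast<std::uint32_t>(buf[offset + 1]) << 8)
         | (static_cast<std::uint32_t>(buf[offset + 2]) << 16)
         | (static_cast<std::uint32_t>(buf[offset + 3]) << 24);
}

// Returns the offset of the real header + data, taken from the size field.
// Returns 0 when the marker is missing or the declared size is implausible,
// so a caller cannot tell those two cases apart in this simple sketch.
std::size_t offset_from_size_field(const std::vector<std::uint8_t>& tex,
                                   std::size_t size_field_offset)
{
    if (read_u32_le(tex, 0) != 0xA7A7A7A7u)
        return 0;
    const std::uint32_t declared = read_u32_le(tex, size_field_offset);
    if (declared > tex.size())
        return 0;
    return declared;
}
```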

Also note that the file size actually takes this header into account, though this could be because of the tools they used to pack the arc files

As far as I remember the PC version uses the same exact files as the PS2 one (even the unused stuff is still there). They only added support for loading their raw BMPs and encrypted the videos.
I guess that extra header is what remains from a different implementation of the textures that's completely ignored and was kept just for compatibility (it's probably the same on the PS2 game as well).

> As far as I remember the PC version uses the same exact files as the PS2 one (even the unused stuff is still there)

I noticed this as well and thought it was hilarious seeing _work directories and test.tex files haha. It wouldn't surprise me if this was for some necessity for the PS2 version.

Do you know of a way I can view/rip the assets of the PS2 version from my disc?

In the PS2 version the assets are packed in the mfa files. They're simpler than the arc files from the PC version.
I made a QuickBMS script years ago to unpack them, here

> I made a QuickBMS script years ago to unpack them, here

Cheers I'll try it out later.

Hi
Those 'encrypted' textures are actually swizzled. Here is some info about it: http://ps2linux.no-ip.info/playstation2-linux.com/download/ezswizzle/

z33ky commented

Thanks @belek666. We still have a 16-pixel wide column on the left that we do not decode correctly with our implementation. Hopefully this information will shed some light on it.

Hi @belek666

Interesting, so it's not a form of quasi-encryption, but rather swizzled textures decoded on the GPU (on the PS2 version)?

The more we dig, the more we see how much of a pure port of the PS2 version this actually is.

Thanks for the heads up, and as @z33ky said, this should help us with the horizontal 16-pixel shift we currently have with the decoding of the 8-bit paletted textures.

They used it to speed up uploading textures to PS2 VRAM. There is source code for a GS memory simulator in GSTextureConvert-1.1.zip which can be used to swizzle or unswizzle. I've used it to write a converter for textures from Silent Hill 2 & 3 with these functions:

/* Undo the swizzle in place: write the data as a PSMCT32 block, then read it back as indexed pixels. */
void unswizzle(IMAGE_INFO *image)
{
    if (image->bpp == 8)
    {
        /* An 8-bit image has the same byte count as a 32-bit block of half the width and half the height. */
        int rrw = image->width / 2;
        int rrh = image->height / 2;

        writeTexPSMCT32(0, rrw / 0x40, 0, 0, rrw, rrh, image->image_data);
        readTexPSMT8(0, image->width / 0x40, 0, 0, image->width, image->height, image->image_data);
    }
    else if (image->bpp == 4)
    {
        /* A 4-bit image has the same byte count as a 32-bit block of half the width and a quarter of the height. */
        int rrw = image->width / 2;
        int rrh = image->height / 4;

        writeTexPSMCT32(0, rrw / 0x40, 0, 0, rrw, rrh, image->image_data);
        readTexPSMT4(0, image->width / 0x40, 0, 0, image->width, image->height, image->image_data);
    }
}

/* Reverse of unswizzle: write the indexed pixels, then read them back as a PSMCT32 block. */
void swizzle(IMAGE_INFO *image)
{
    if (image->bpp == 8)
    {
        int rrw = image->width / 2;
        int rrh = image->height / 2;

        writeTexPSMT8(0, image->width / 0x40, 0, 0, image->width, image->height, image->image_data);
        readTexPSMCT32(0, rrw / 0x40, 0, 0, rrw, rrh, image->image_data);
    }
    else if (image->bpp == 4)
    {
        int rrw = image->width / 2;
        int rrh = image->height / 4;

        writeTexPSMT4(0, image->width / 0x40, 0, 0, image->width, image->height, image->image_data);
        readTexPSMCT32(0, rrw / 0x40, 0, 0, rrw, rrh, image->image_data);
    }
}

Do you plan to use the game's fonts? Their compression algorithm is the same as the SH2 fonts. I can send you unpacker code adapted to the SH3 file format.

Hi @belek666

> Do you plan to use the game's fonts? Their compression algorithm is the same as the SH2 fonts. I can send you unpacker code adapted to the SH3 file format.

Yes we do!! Any information/help you give would be greatly appreciated! Perhaps it would be easier for you to open a Pull Request (if you have time). If not, just send the files through, it'll be a great help! :)

As a side note, where exactly are the fonts located? I could never seem to find them in any of the .arc sections.

The fonts are in msg.arc, in these files:

  • fontdata_c.bin, which has English and Chinese characters
  • fontdata_j.bin, which has English and Japanese characters
  • fontdata_k.bin, which has English and Korean characters

Each file contains two sets of fonts with different heights. There is also some data at the end, but I don't know what it is. The character encoding corresponds to the text in the msg files.

fontsh3.zip

Thanks very much for the contribution, I'll have a look at this later.

@belek666

So is each character/glyph stored as a texture that can be sent to the GPU? (From the looks of it, each pixel is grabbed from the palette when the character is decoded.)

Yes. I'm not sure if the palette is correct, because I didn't find it in the SH3 files and took it from SH2. After decoding you get raw RGBA32 data.
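For eyeballing a decoded glyph without a raw-image viewer, one option is to wrap the RGBA32 buffer in a binary PPM, which most image viewers open. This is only a sketch: `rgba32_to_ppm` is a made-up helper, it assumes the bytes are ordered R, G, B, A (the thread does not say), and it drops the alpha channel.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Builds a binary PPM (P6) image from an RGBA32 buffer.
// ASSUMPTION: 4 bytes per pixel in R, G, B, A order; alpha is discarded.
// Returns an empty string if the buffer size does not match width * height * 4.
std::string rgba32_to_ppm(const std::vector<std::uint8_t>& rgba,
                          std::size_t width, std::size_t height)
{
    if (width == 0 || height == 0 || rgba.size() != width * height * 4)
        return std::string();

    std::string ppm = "P6\n" + std::to_string(width) + " "
                    + std::to_string(height) + "\n255\n";
    for (std::size_t i = 0; i < rgba.size(); i += 4)
    {
        ppm.push_back(static_cast<char>(rgba[i]));      // red
        ppm.push_back(static_cast<char>(rgba[i + 1]));  // green
        ppm.push_back(static_cast<char>(rgba[i + 2]));  // blue
    }
    return ppm;
}
```

The returned string should be written to disk through a stream opened in binary mode. If the width or height passed in is wrong, the glyph will look sheared or scrambled even though the data itself is fine.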

Okay cool, when I've got some time that isn't being consumed by uni I'll have a go at implementing this, plus a shader to draw them.

Could I get your full name so I can add it to the file comment (the @author tag)?

@belek666

Hi,

I'm having a bit of trouble testing the code provided. If I'm correct, font.cpp dumps a single glyph into font.dat, yes? However, when I try and draw character 32 here set to RGB32, ignore alpha off, I get some strange output (image below, I'm using fontdata_j.bin as the input file). I'm using the glyph width and height, so maybe that's the issue?

Could you possibly elaborate on the format a bit more? Where are the English characters stored in relation to the Asian ones? Never mind, I just realised you stated that the indexes are relative to the message file character indexes/values.

[Image: wew]

So it's OK now? As for credits, the unpacking algorithm was given to me by Dencraft, author of "Silent Hill Font Editor" (a program for SH2 fonts, never finished). I adapted it to the SH3 format. If you want, you can credit me by nickname.

> As for credits, the unpacking algorithm was given to me by Dencraft, author of "Silent Hill Font Editor" (a program for SH2 fonts, never finished). I adapted it to the SH3 format. If you want, you can credit me by nickname.

Done

> So it's OK now?

I'm still having issues drawing the raw data. Is there any specific way it needs to be rendered? I can see that it at least has the same amount of colours that a single glyph would have, but it's still all garbage.

[Image: wew]

The function "DecodeChar" unpacks a single glyph, selected by the character encoding number passed as its argument. In my code, the main function has a "numChar" variable which defines which char will be decoded. It's set too high, and that's why you are getting garbage data. For example, for "a" set "int numChar = 'a' - 0x20;"
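The `'a' - 0x20` example suggests that glyph numbers for plain ASCII start at the space character. A tiny sketch of that mapping, with a hypothetical helper name and no claim about the upper limit or about how the non-English glyphs are numbered:

```cpp
// Maps an ASCII character to the glyph number used in the example above.
// ASSUMPTION: glyph 0 is character code 0x20 (space), as implied by 'a' - 0x20.
// Returns -1 for control characters below 0x20, which have no glyph under
// this assumption. The valid upper bound is not known from the thread.
int glyph_index_for_ascii(char c)
{
    const int code = static_cast<unsigned char>(c);
    if (code < 0x20)
        return -1;
    return code - 0x20;
}
```

With this mapping `glyph_index_for_ascii('a')` gives 65, which matches the "char 65" mentioned later in the thread.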

@belek666

Hi, sorry for the late reply.

I'm still getting garbage, even with int numchar = 'a' - 0x20. Even stranger, when I use fontdata_j.bin, I get a segmentation violation (on Windows).

The following image was taken with fontdata_k.bin with numchar set to 'a' - 0x20 (char 65)

[Image: wew]

The following was taken using fontdata_j.bin (ignore the background, it's the previous character)

[Image: wew]

The main thing I'm surprised about is the character glyph corruption.

Are you developing on Windows or Linux? I don't really have much time to debug, but I'll give it a crack on Linux in the next few days.

I'm using Windows 7 64-bit. If you didn't change the program code, it's probably broken font files. For "a" from fontdata_j.bin you should get:

[Image: font]

If you want, I can check your font files. What program did you use to extract the files from msg.arc?

I'm using our own code found in SH3/arc/*.cpp... Paging @z33ky, we might have some issues in the arc loading section again....

Anyway, here are the font files and their MD5 checksums (for good measure). P.S. Sorry for using a file sharing site, but apparently ZIPs aren't supported on GitHub, even though they are... rollseyes

http://s000.tinyupload.com/index.php?file_id=05835222736118477031

fontdata_j.bin MD5: E14FF96C7AD917A25EF6BB23C5A83A40
fontdata_k.bin MD5: 1BFEAFB03FD63E508E498F3BB36323E1

You have something wrong with your arc file management. It looks like it adds extra bytes; even the file size is bigger.

@belek666

I got it fixed. It's now fully implemented in the engine! :) Turns out the code to actually dump the file to the disk was faulty.
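The thread does not say what the fault in the dump code was. One common way a dumped file gains extra bytes on Windows is writing through a text-mode stream, which turns every `\n` byte into `\r\n`. A minimal sketch of a binary-safe dump and read-back, with hypothetical helper names:

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Writes the buffer byte-for-byte. std::ios::binary disables newline translation.
bool dump_binary(const std::string& path, const std::vector<std::uint8_t>& data)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out)
        return false;
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size()));
    out.flush();
    return static_cast<bool>(out);
}

// Reads a whole file back, again in binary mode; returns an empty vector on failure.
std::vector<std::uint8_t> read_binary(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return std::vector<std::uint8_t>();
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}
```

Comparing the size of `read_binary(path)` against the size recorded for the file in the archive is a quick sanity check before reaching for MD5 sums.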
[Image: font_test]

@Quaker762 Nice work! 👍