isledecomp/LEGOIslandRebuilder

Consider switching to binary signature searching

Closed · 1 comment

Right now, every offset to be patched in the input file is hardcoded. Around 200 lines of code exist only to work out where to write and which game version has been loaded. Switching to signature searching would make this unnecessary.
Since Issue #18 proposes porting to C++ for better compatibility, this would also remove the dependency on SHA-1 hashing, which is currently used to determine the game version.

The basic idea is this: you take a few bytes (enough to be unique) from around the code you want to patch, create a mask that allows some of those bytes to vary (if necessary), and search the entire binary for them. If you're patching rdata, you create a signature for code that references the memory you want to patch.
A naive implementation in C++ would look like this:

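#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Returns every address in [start, end) at which the signature matches.
// An empty optional in the signature is a wildcard that matches any byte.
std::vector<uintptr_t> find_in_memory(std::vector<std::optional<uint8_t>> const &signature, uintptr_t const start, uintptr_t const end)
{
	std::vector<uintptr_t> results;
	auto const size = signature.size();

	// Nothing to match, or the range is too small to hold the signature.
	if(size == 0 || end < start || end - start < size)
		return results;

	// Stop early enough that cur + rel never reads past end.
	for(uintptr_t cur = start; cur <= end - size; cur++)
	{
		bool matched = true;
		for(size_t rel = 0; rel < size; ++rel)
		{
			auto const &b = signature[rel];
			// Wildcards are skipped; any other byte must match exactly.
			if(b.has_value() && *reinterpret_cast<uint8_t const *>(cur + rel) != *b)
			{
				matched = false;
				break;
			}
		}

		// Recording the hit here also covers signatures that end in a wildcard.
		if(matched)
			results.emplace_back(cur);
	}

	return results;
}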
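
To make the rdata case more concrete, here is a rough sketch of how a signature could be written as a pattern string and used to recover an address operand from the matching instruction. Everything in it is an assumption for illustration: the helper names (parse_signature, find_in_buffer, read_u32_le), the "??" wildcard syntax, and the example bytes are made up and are not taken from the game binary or from the rebuilder. It searches a byte vector instead of raw memory so that it can be tried on a file loaded from disk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

using Signature = std::vector<std::optional<std::uint8_t>>;

// Hypothetical helper: turns "8B 0D ?? ?? ?? ?? 85 C9" into a signature.
// "??" marks a byte that may take any value; other tokens are hex bytes.
Signature parse_signature(std::string const &pattern)
{
	Signature sig;
	std::istringstream in(pattern);
	std::string token;
	while (in >> token)
	{
		if (token == "??")
			sig.push_back(std::nullopt);
		else
			sig.push_back(static_cast<std::uint8_t>(std::stoul(token, nullptr, 16)));
	}
	return sig;
}

// Returns every offset in data at which the signature matches.
std::vector<std::size_t> find_in_buffer(std::vector<std::uint8_t> const &data, Signature const &sig)
{
	std::vector<std::size_t> results;
	if (sig.empty() || sig.size() > data.size())
		return results;

	for (std::size_t cur = 0; cur + sig.size() <= data.size(); ++cur)
	{
		bool matched = true;
		for (std::size_t rel = 0; rel < sig.size(); ++rel)
		{
			// Wildcards match anything; fixed bytes must be equal.
			if (sig[rel].has_value() && data[cur + rel] != *sig[rel])
			{
				matched = false;
				break;
			}
		}
		if (matched)
			results.push_back(cur);
	}
	return results;
}

// Reads a little-endian 32-bit value, such as the absolute address
// operand of a 32-bit x86 instruction.
std::uint32_t read_u32_le(std::vector<std::uint8_t> const &data, std::size_t offset)
{
	assert(offset + 4 <= data.size());
	return static_cast<std::uint32_t>(data[offset])
	     | static_cast<std::uint32_t>(data[offset + 1]) << 8
	     | static_cast<std::uint32_t>(data[offset + 2]) << 16
	     | static_cast<std::uint32_t>(data[offset + 3]) << 24;
}
```

With a pattern like "8B 0D ?? ?? ?? ?? 85 C9" (a mov ecx, [address] followed by test ecx, ecx), the four wildcard bytes are the address, so read_u32_le(data, hit + 2) would give the location of the data being referenced. A patcher would probably want to treat anything other than exactly one hit as an error, since zero or several matches mean the signature is not unique for that binary.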

I would happily volunteer to implement this and create all the necessary patterns. Is there anywhere I can find all the game binaries used here?

Indeed, we're actually already doing this in the C++ rewrite in the mfc branch, which should supersede the old C# version fairly soon.