Vectorial1024/EliteBionicsFramework

Transpilers approach improvement suggestion

Closed this issue ยท 9 comments

The current approach to transpilers, targeting nth callvirt instruction, is incredibly prone to silent failures if the target method is ever changed. A much more robust alternative would be to use the following condition:

instruction.Calls(AccessTools.Method("BodyPartDef:GetMaxHealth", new Type[] { typeof(Pawn) }))

as well as introducing a failure check at the end, along the lines of:

if (!patchComplete)
{
	Log.Error("EBF: Failed to patch PawnCapacityUtility:CalculatePartEfficiency()");
}

I won't provide a pull request, since for myself I am implementing those directly in the compiled dll via dnSpy. However, these should be trivial to implement, and should provide significant benefits

I do not understand. From my understanding, I or users will know something is up if the method is changed. Consider:

  • some arg type is changed: invalid IL code error (reason: cannot pass type X to type Y)
  • arg count is changed: invalid IL code error (reason: too many items on the eval stack or not enough items on the eval stack)
  • method name is changed: Harmony failed to patch code (reason: method not found)
  • new method overload added: a variant of "arg count is changed"

In practice, because body part HP is so core in the game, code changes are very unlikely. The patches are likely to work with no action needed from my side even if a new game version is released. The problematic part would be mod integration breaking, but that happens rarely as well.

But still, due to recent severe irl problems, modding progress has stalled greatly (not just RimWorld modding, other games as well). I will not be addressing this in high priority. I will however leave this open as some sort of interesting technical research.

PS: if the method is changed as in the inside code is rewritten, then I will have to read the new code to remake my patches anyways. There are no guarantees there would be the same number of method calls, much less the max ldfld number. Being brittle is pretty much intended.

The patch will fail and users (preferrably with error log enabled; also, red errors will always force display the debug log) will see errors on the title screen. Your Log.Error approach has no intrinsic differences from the "delegate to Harmony to announce patch failed" method. Knowing that a patch somehow failed plus a relevant stacktrace (currently also contains the patcher's information) is already sufficient for me to start work, further info is unhelpful.

Does Harmony actually report invalid IL code fromthe transpiler? If so, this issue is indeed much less impactfu than it seemed. My understanding was that it would silently accept the patch but fail at the first execution of the patched function.

This is, in fact, something that seems to have happened to me, as I got a NullReferenceException every time I loaded a save. With the help of the wonderful WhatThePatch mod, I was eventually able to deduce that EBF was responsible for transpiling one of the functions in the traceback. Now that I've changed most of the transpilers to use amore robust method, the exception no longer occcurs, so I assume this was indeed the cause. I've verified the code of the original function, and the target call does seem to have the same callvirt index, so this is likely a conflict with another transpiler.

The problem with those is, you can't really read their resultjng code in advance. However, a transpiler is unlikely to change existing locals of a function, so a patch expressed as "replace every call to BodyPartDef.GetMaxHealth with a push of a BodyPart from this local, and then a call to my replacement function" should work regardless of other transpilers. The only thing that could break it is a core Rimworld update, but as you've said, it's rather unlikely to change those particular functions.

Aslo, since creating the issue, I've found an even better way to express those patches, both more elegant and more robust. In fact, I assume you're already aware of it, since I found it in your code ๐Ÿ˜„ :

return new CodeMatcher(instructions)
.MatchStartForward(
new CodeMatch(OpCodes.Callvirt, AccessTools.Method(typeof(BodyPartDef), nameof(BodyPartDef.GetMaxHealth)))
) // find the only occurence of .GetMaxHealth()
.InsertAndAdvance(
new CodeInstruction(OpCodes.Ldloc_3),
new CodeInstruction(OpCodes.Callvirt, typeof(Hediff).GetProperty("Part").GetGetMethod()),
new CodeInstruction(OpCodes.Call, typeof(VanillaExtender).GetMethod("GetMaxHealth"))
) // insert extra code so that we use VanillaExtender.GetMaxHealth(); we do this out of convenience
.Set(OpCodes.Nop, null)
// and ignore the original instruction
.InstructionEnumeration();

I ended up using similar CodeMatchers for the core transpilers, rather than the approach I originally proposed. While they are identical in function, this one comes with significantly less boilerplate. (If you wish to patch an unknown number of calls to GetMaxHealth this way, CodeMatcher also has a Repeat method, althogh I haven't used it myself yet).

Ah yeah the CodeMatcher, I basically forgot about it.

The plan was to eventually change every transpiler to at least use the CodeMatcher method, but irl got in the way and I basically forgot about the plan. Now it is half way complete, and there is currently no ETA of completion.

I do hope irl can finally unblock to let me finish things...


But still, do you happen to discover which other mod is conflicting with the transpiler? I got random separate reports of Pawnmorpher seemingly not working with EBF and I have not yet started any investigation.

I'm not using pawnmorpher, so it's something else, but I'm not sure what. I can't find the WTP log from the time of the issue, and I can't reproduce it anew since I've already fixed the dll. Sorry(

Ah yeah you can check more transpiler-related progress over here #63

For some added info, this time in the 1.5 compatibility update, one old-style transpiler patch did silently fail and caused a cursed reflection bug. Normally a failed patch will be immediately indicated by a "invalid IL code" exception, but this time it somehow passed through the IL code check and resulted in several variables having unpredictable contents and types in it.

I guess there is a reason to do the new-style transpilation after all.

With CodeMatcher, if the patch failed due to no pattern matched, then it gives "arg out-of-range" exception because the matcher would have already reached the end of the function (pointer now pointing at emptiness) and I would be trying to set the IL code at pointer to become nop. Then, I/user will know.

Still, for the CodeMatcher Repeat usage, you may check commit 73b02a2 ; it turns out a transpiler patch has to patch the same code several times.

This is finally complete.

Regarding the "silently fail" part, I noticed something.

Must have been a past RimWorld update, but if users do not enable development mode, then even if there are red errors, the debug log will never appear. This makes failed patches silent by virtue of having no one to see them.

Might need to somehow improve the communications.