GothicKit/ZenKit

Issues with TheChroniclesOfMyrtana mod

Try opened this issue · 23 comments

Try commented

I'm looking into KoM startup routine, to see if it's possible to have adequate support for this mod.
Before phoenix, I was able to react stating location and play tutorial segment on a ship.

Issues so far:


  • Crash in symbol::set_int
    C++ side:
void symbol::set_int(std::int32_t value, std::size_t index, const std::shared_ptr<instance>& context) {
	...
	if (is_member()) {
		...
	} else {
		// _m_value is null
		std::get<std::unique_ptr<std::int32_t[]>>(_m_value)[index] = value;
	}
}

Script call-stack:

"HOOKENGINEF"
"INIT_FEATURES_GAMESTART"
"INIT_GAMESTART"
"INIT_GLOBAL"

Apparently it's related to assignment to HOOKENGINEF.FUNCTION (function override?)


  • No error handling in opcode::div and opcode::mod

If script divides by zero - whole application crashes


  • Crash in allocate_instance, when called for "$INSTANCE_HELP"
    Most likely not intended by script - happens late in initialize routines, and everything is supper broke there, due to no DMA support.
auto* parent = find_symbol_by_index(sym->parent()); // parent ==nullptr here
while (parent->type() != datatype::class_) {
	parent = find_symbol_by_index(parent->parent());
	}

  • illegal mutable access of const symbol
[phoenix] vm: internal exception: illegal mutable access of const symbol WIN_GETLASTERROR.CALL
[phoenix] vm: internal exception: illegal mutable access of const symbol KMLIB_INITIALIZEGAMESTART._KMLIB_INITIALIZEGAMESTART_PTR
[phoenix] vm: internal exception: illegal mutable access of const symbol CALLINT_RETVALISFLOAT
[phoenix] vm: internal exception: illegal mutable access of const symbol CALLINT_PUTRETVALTO
[phoenix] vm: internal exception: illegal mutable access of const symbol CALLINT_NUMPARAMS

Proposed fix: introduce vm_allow_const_cast_access, flag similar to vm_allow_null_instance_access


  • Minor issue with de-compiled sources:
    Local variables were displayed as global. I'm aware, that Daedalus has no real stack-variables, so just readability issue.
// CATINV_NEXTNONACTIVEITEM.* should be local variables
var int CATINV_NEXTNONACTIVEITEM.I = 0;
instance CATINV_NEXTNONACTIVEITEM.L(ZCLISTSORT)
instance CATINV_NEXTNONACTIVEITEM.ITM(C_ITEM)
func int CATINV_LASTNONACTIVEITEM(var int LIST, var int MAX) {
    ...
}

  • Allow for receiving symbols directly
    This is required to implement control-flow functions in ikarus. Basically my plan is to:
    vm.override_function("repeat", [this](phoenix::symbol* var, int count){ repeat(var,count); })
    And access(and memorize) var as a reference to loop-counter.

Proposed solution: remove check for phoenix::symbol in check_external_params, rest seem to work as-is


  • Some stack introspection?
    Followup to repeat. All together makes workflow:
// script
repeat(i, 32)
...
end; // variable-token

// c++
void Ikarus::repeat(phoenix::symbol* var, int count) {
  auto end = vm.find_symbol_by_name("end");
  if(end==nullptr)
    return;
  auto& called_from = ...; // need to get parent context
  for(uint32_t i=called_from.pc(); i<vm.size(); ) {
    auto inst = vm.instruction_at(i);
    if(inst.op==phoenix::opcode::pushv && inst.symbol==end->address()) {
      // inject jump-back to have a loop
      break;
      }
    i += inst.size;
    }
  }

PS:
Probably it's better to have one ticket instead of million small ones.

Thanks, I'll look into it. Just real quick:

Minor issue with de-compiled sources:

I do have an updated version of the decompiler up on lmichaelis/phoenix-studio. I'll redo the decompilation for tcom-decompilation in just a sec.

Done. Files are now more helpfully labeled. The code you are referencing here can now be found in Source/CATINV_NEXTNONACTIVEITEM.d:

func int CATINV_NEXTNONACTIVEITEM(var int LIST, var int MAX) {
    var C_ITEM ITM;
    var ZCLISTSORT L;
    var int I;
    I = 0;
    WHILE((LIST) && ((I) < (MAX)));
    L = _^(LIST);
    if (HLP_IS_OCITEM(L.DATA)) {
        ITM = _^(L.DATA);
        if (((ITM.FLAGS) & (ITEM_ACTIVE)) != (ITEM_ACTIVE)) {
            BREAK;
        };
        I += 1;
    };
    LIST = L.NEXT;
    END;
    return I;
}

You can find the old version of the decompilation in the old branch.

Try commented

Done. Files are now more helpfully labeled.

Just checked out new code. Apparently LeGo.d is gone, and maybe other library files.

Yes they are now scattered throughout Source/ I have to classify them again. They should be easy to pull out though since the filenames are now actually helpful.

Try commented

I've added 2 more things for control flow.
Basically need to develop a way to execute at loops:

repeat(i, 32) // ikarus-function. Here need to have a way to memorize 'i' and scout byte-code for 'end'
<body>
end; // variable-token. Currently thinking of replacing it to `be` + `b`

Here I'm planning to inject native function in replace of 'end;'

Crash in symbol::set_int

This should be fixed now. Because the script will assign function parameters to local variables, this caused issues with higher-order functions since no memory was actually allocated for the local (here HOOKENGINEF.FUNCTION):

image

Crash in allocate_instance, when called for "$INSTANCE_HELP"

Yeah, that would be a null-pointer de-reference. I've replaced with an exception :)

illegal mutable access of const symbol

This will require quite a bit of rework in the script. Currently, it's not even possible to change the value of a const symbol manually which would probably be a good thing to have anyways. I'll have to move the const-ness check into the VM and probably create a new exception type for it.

Allow for receiving symbols directly

This already works for instances. It won't work for general variables right now since I can't verify that that external will actually be called with a variable (it could just be called with a literal).

If you'll scout the bytecode for end; anyways, couldn't you just loop inplace without touching the variable?

Some stack introspection?

Basically, you just need access to the call-stack and a way to set the program counter, right?

Try commented

Currently, it's not even possible to change the value of a const symbol

Are you sure? At least in place, where exception been hit, there is access:

void symbol::set_int(std::int32_t value, std::size_t index, const std::shared_ptr<instance>& context) {
    if (is_const()) {
	//throw illegal_const_access(this);
	}

It won't work for general variables right now

I've checked source code, seem to be OK: vm::pop_reference() is simply throwing exception, for immediate variables. This is more than fine for use-case. Loop counter can't be a literal anyway.

If you'll scout the bytecode for end; anyways, couldn't you just loop inplace without touching the variable?

I don't really see how... Yet, I'm welcome for ideas, because scouting byte-code is clearly not a great workflow.
For clarity plan is: implement conditional jump at repeat and increment+jump at end. end.

As a side idea: have a naked calls, with no stack entry, for override_function can be an alternative. Probably a better one even:
end can be implemented than as pushv %i + be[naked] to implement +1 and jump.

Are you sure? At least in place, where exception been hit, there is access:

Yes, that's what I mean. If you try to set the value of a const variable, it will throw. That would have to change, of course, as I mentioned.

Loop counter can't be a literal anyway.

In this case, yes. That's not safe everywhere though. Maybe there is some way around making function overrides unsafe like that though, not sure.

I don't really see how...

If you override repeat, then you'll get control at the address. You could then scout for end; and manually execute the instructions in between (of course, this would require access to vm::exec() or some way of running code until a given pc is reached). You just repeat that until i == 0. No bytecode tinkering necessary. That would be difficult to do anyways, considering that jumps reference a fixed byte offset from the beginning of the script's code section.

Try commented

If you override repeat, then you'll get control at the address

I need to think about it. So far not going smooth: repeat is still a function call, so pc state will be restored after call is complete.

You just repeat that until i == 0

Nit: other way around. repeat(i,32) is equivalent to for(; i<32; ++i).

So far naked call, with no stack frame seem to be a good versatile solution.. except they need access to vm-variables stack to be usefull...

Try commented

Testing latest version of phenix. Apparently phoenix::vm::init_instance is broke, always throwing exception:

1   libstdc++-6!.cxa_throw 
2   std::__throw_bad_variant_access                                                                                                                                                    variant              1305
3   std::__throw_bad_variant_access                                                                                                                                                    variant              1313
4   std::get<3ull, std::unique_ptr<...>>                                                                                                                                               
5   std::get<std::shared_ptr<phoenix::instance>, std::unique_ptr<...>>                                                                                                                 
6   phoenix::symbol::get_instance                                                                                                                                                      script.cc            415 
7   phoenix::vm::init_instance<phoenix::c_spell>                                                                                                                                       vm.hh                210 
8   phoenix::vm::init_instance<phoenix::c_spell>                                                                                                                                       vm.hh                194 
9   operator()                                                                                                                                                                         spelldefinitions.cpp 12  
10  std::__invoke_impl<void, SpellDefinitions::SpellDefinitions(phoenix::vm&)::<lambda(phoenix::symbol&)>&, phoenix::symbol&>(std::__invoke_other, struct {...} &)                     invoke.h             61  
11  std::__invoke_r<void, SpellDefinitions::SpellDefinitions(phoenix::vm&)::<lambda(phoenix::symbol&)>&, phoenix::symbol&>(struct {...} &)                                             invoke.h             111 
12  std::_Function_handler<void(phoenix::symbol&), SpellDefinitions::SpellDefinitions(phoenix::vm&)::<lambda(phoenix::symbol&)>>::_M_invoke(const std::_Any_data &, phoenix::symbol &) std_function.h       291 
13  std::function<void (phoenix::symbol&)>::operator()(phoenix::symbol&) const                                                                                                         std_function.h       560 
14  phoenix::script::enumerate_instances_by_class_name(std::basic_string_view<char, std::char_traits<char>>, std::function<void (phoenix::symbol&)> const&)                            script.cc            189 
15  SpellDefinitions::SpellDefinitions                                                                                                                                                 spelldefinitions.cpp 10  
16  std::make_unique<SpellDefinitions, phoenix::vm&>                                                                                                                                   unique_ptr.h         962 
17  GameScript::initCommon                                                                                                                                                             gamescript.cpp       283 
18  GameScript::GameScript                                                                                                                                                             gamescript.cpp       73  
19  GameSession::GameSession                                                                                                                                                           gamesession.cpp      124 
20  operator()                                                                                                                                                                         mainwindow.cpp       898 
...                                                                                                                                                                                                                                                                                                         

Seems symbol::_m_type somehow become inconsistent with symbol::_m_value

Offended commit: f0d6751

Whoops. And that's why I need script tests. It was an incorrect switch fall-through. Fixed in dc0e340.

You just repeat that until i == 0

Nit: other way around. repeat(i,32) is equivalent to for(; i<32; ++i).

So far naked call, with no stack frame seem to be a good versatile solution.. except they need access to vm-variables stack to be usefull...

So after thinking about the symbol* parameter thing for a bit, I think that there is no way around it. I'll implement it if you think that's the way to go. As for my idea on how to implement repeat even without naked calls, see below:

(Daedalus code)

func void do_something() {
   var int i = 0;
   
   repeat(i, 10);
   wld_settime(10, i); // or whatever
   end;

   return;
}

(C++ code)

void setup(phoenix::vm& vm) {
  // misc vm setup ...

  vm.override_function("repeat", [&vm](phoenix::symbol* sym, int n) {
    // introspect call stack
    auto caller = vm.caller();

    // scout bytecode for END; (you might want to cache the result)
    auto pc = caller.pc;
    auto instr_count = 0; 
    for (auto i = 0u; i < 50 /* arbitrary limit */; ++i) {
      auto instr = vm.instruction_at(pc);
      
      // either check for symbol index directly ...
      if (instr.op == phoenix::opcode::pushv && instr.symbol == 0x165b) {
          break;
      }

      // ... or for symbol name
      if (instr.op == phoenix::opcode::pushv) {
          auto* sym = vm.find_symbol_by_index(instr.symbol);
          if (sym != nullptr && sym.name == "END") {
              break;
          }
      }

      pc += instr.size();
      instr_count += 1;
    }

    // run the original instructions n - 1 times
    for (int i = sym->get_int(); i < n - 1; ++i) {
        sym->set_int(i);
        vm.unsafe_exec_from(caller.pc, instr_count);
    }

    sym->set_int(n - 1);
    // exit; the bytecode after repeat() will run one more time
  });
}

Obviously, the APIs vm::unsafe_exec_from() and vm::caller() don't currently exist but they would be easily implemented.

Try commented

As for my idea on how to implement repeat even without naked calls, see below:

Hm, can't say that I like it:

  • Not clear how-to handle break and continue, since vm.unsafe_exec_from(caller.pc, instr_count) is complete back-box.
  • Not clear how-to deal with zero-sized loop "the bytecode after repeat() will run one more time"
  • In principle, similar take can be done already, by inheritance, from phoenix::vm and accessing exec, push/pop and etc

Naked call also do suffer from inability to use push/pop directly.

You're right I totally forgot about continue; and break;. With those in might, I believe sub-classing the VM is actually a good idea. Here's my new take.

Things considered

  • repeat()
  • continue;
  • break;
  • end;

The idea

Let's subclass phoenix::vm and override push_reference. Also use override_function to override repeat() using a symbol* to reference the loop variable. Example (only supports a single nested loop):

class my_vm : phoenix::vm {
public:
    my_vm(phoenix::script&& s) : phoenix::vm(std::forward(s), /* flags ... */) {
        this->override_function("repeat", [this](phoenix::symbol* s, int n) {
            return this->repeat(s, n);
        });
    }
 
    void push_reference(phoenix::symbol* sym, uint8_t i) override final {
        if (sym->name() == "END" && _m_loop_begin) {
            this->_m_loop_end = this->pc();
            this->jump(*_m_loop_begin);
            return;
        }

        if (sym->name() == "CONTINUE" && _m_loop_begin) {
            this->jump(*_m_loop_begin);
            return;
        }

        if (sym->name() == "BREAK") {
            this->jump(this->find_loop_end());
            // reset
            _m_loop_begin = std::nullopt;
            _m_loop_end = std::nullopt;
            return;
        }
 
        phoenix::vm::push_reference(sym, i);
    }

private:
    uint32_t find_loop_end() {
        // discover `pushv END` like in my last comment and assign it to _m_loop_end
        return *_m_loop_end;
    }

    phoenix::naked_call repeat(phoenix::symbol* i, int count)
        if (!_m_loop_begin) {
             // save loop start
             _m_loop_begin = this->caller().pc;
        }

        if (i->get_int() >= count) {
            // jump to loop end
            this->jump(this->find_loop_end());

            // reset
            _m_loop_begin = std::nullopt;
            _m_loop_end = std::nullopt;
            
            return {};
        }

        // increment i and continue loop
        i->set_int(i->get_int() + 1);
        this->jump(this->caller().pc);
        return {};
    }

private:
    std::optional<uint32_t> _m_loop_begin, _m_loop_end;
};

Obviously there are some optimizations possible here (like looking up the symbol indices of CONTINUE, BREAK and END beforehand) and it only supports one nesting level but those things can be implemented fairly easily. This would also give you the power to change implementation details as you like.

I'm not sure how this process would work for other custom Union constructs but it seems plausible that something like this could work too. It's also quite fast since you only have to scout the bytecode for END if you never reach it during loop execution which should be relatively rare.

Changes needed in phoenix

  • Function overrides, conditionally without PC-restoration ("naked calls")
  • External functions and function overrides can use symbol* for any of its arguments
  • Some functions of the VM marked virtual
  • VM function caller()
Try commented

I've pushed prototype for loops support.

CONTINUE, BREAK and END are tracked in VM (guarded by flag), assuming that it's more efficient than virtual functions based approach.
Added unsafe_jump function and naked calls to assist with repeat function.

Already done some testing locally - application side is clean and simple loops do work. Testing of CONTINUE, BREAK is todo.

Try commented

One more big feature to discuss is dynamically allocated memory. Some ikarus code for context:

instance _HT_CREATEPTR.ARR(ZCARRAY)

func int MEM_ARRAYCREATE() {
    return MEM_ALLOC(SIZEOF_ZCARRAY);
}

func int _HT_CREATEPTR(var int SIZE) {
    PTR = MEM_ARRAYCREATE();
    ARR = _^(PTR);

    ARR.ARRAY      = MEM_ALLOC((SIZE) * (4));
    ARR.NUMALLOC   = (SIZE) * (4);
    ARR.NUMINARRAY = 0;
    return PTR;
}

OpenGothic can hook into MEM_Alloc/MEM_Free and emulate 32-bit allocator.
However _^(PTR) is non-trivial. PTR has backing in plain memory managed by OpenGothic, but has no std::shared_ptr<phoenix::instance> wrapper.

Another issue is class-declaration. Class ZCARRAY is defined by script and engine should not be aware of it. So need some solution to threat some classes as plain-structs, with no per-field bindings

Another issue is class-declaration. Class ZCARRAY is defined by script and engine should not be aware of it. So need some solution to threat some classes as plain-structs, with no per-field bindings

There would be a pretty simple way of achieving that which I'll implement later this week.

PTR has backing in plain memory managed by OpenGothic, but has no std::shared_ptrphoenix::instance wrapper.

How would you like to deal with that and/or what do you think you'll need from me? I don't have too much time to look into this at the moment, sorry :/

Try commented

How would you like to deal with that and/or what do you think you'll need from me?

Only raising for now, do not have great ideas on my end either. Currently experimenting with solution based on proxy-instance:

std::shared_ptr<phoenix::instance> Ikarus::mem_ptrtoinst(ptr32_t address) {
  // memory_instance is special case in VM
  return std::make_shared<memory_instance>(address);
  }
...
int Ikarus::get_mem_int(std::shared_ptr<phoenix::instance> base, symbol* sym, size_t index) {
  ptr = base->address + sym->offset + index*sym->sizeof_;
  return allocator.readInt(ptr); // map to C++ real memory and return value to VM
  }

There would be a pretty simple way of achieving that which I'll implement later this week.

No, rush. Keep in mind that both features should interact together, as script will probably allocate memory for those directly via MEM_Alloc

There would be a pretty simple way of achieving that which I'll implement later this week.

It ended up being not quite as simple as I thought. I've pushed a minimal working implementation to experiment/opaque-instances. Feel free to adjust it to your liking and add it to #74 if you want. Basic usage is:

phoenix::vm vm {...};
phoenix::register_all_script_classes(vm);

// Will actually run the initialization routine for ALL unregistered instances and assign them as an opaque object which is inaccessible to C++-code
vm.initialize_unregistered_instances_as_opaque();
Try commented

I've pushed a minimal working implementation

Cool, but please do not forget to review #74 first.
Currently I'm prototyping DMA-access to explicitly allocated classes, and need #74 as a base.

About DMA-access:

  • Ikarus/LeGo usually do allocate memory explicitly (mem_alloc) from 32bit-heap
  • Once memory-allocated it gets casted to instance via _^
  • Member access then goes as usual. This works in vanilla as member access is trivial ptr+offset pointer-load

In _HT_CREATEPTR example, I made this work by returning dummy instance:

struct Ikarus::memory_instance : public phoenix::instance {
  explicit memory_instance(ptr32_t address):address(address){}
  ptr32_t address;
  };

memory_instance has _m_symbol_index==unset, and _m_type==nullptr
In VM I've added extra logic to get/set to check for _m_symbol_index and call external callback.

Generally almost works... Doesn't work for:

  1. C_SVM - haven't look deep, but seems to be case, when data is not allocated by mem_alloc
  2. strings - have no goof solution how to manage them, as we want to work with opaque strings, but Ikarus sometimes writes directly to .data and .size, also assumes gothic2 ZString, while on that.

strings - have no goof solution how to manage them, as we want to work with opaque strings, but Ikarus sometimes writes directly to .data and .size, also assumes gothic2 ZString, while on that.

Oh that would be really bad. It would require a re-design of the entire script/vm implementation since I only support std::string which is hardcoded everywhere. I have been meaning to do that anyways but here it would be required.

In _HT_CREATEPTR example, I made this work by returning dummy instance:

I'm not sure if this is what you want. fc940fc introduces a storage_at virtual function for instances (fc940fc#diff-23d603ff3fcd5c9172ed67cd718fa24e44cd0c9a2507bbbeebde2d0c8b84400eR220-R222) to solve the problem of figuring out where the data is actually stored. Previously it would just take static_cast<char*>(instance) + offset which won't work if your instance doesn't have the correct size.

Try commented

introduces a storage_at virtual function for instances

I've evaluated this solution. Not quite something that can work for mods:
binding raw-pointer to instance seems to be dangerous. Specially I'm not sure, if it's possible to have MEM_Free, after setting instance. In that case - dangling reference inside instance.

For instances allocated by MEM_ framework storing ptr32 and calling engine for address translation is way more robust and easier.

Another(partially orthogonal case) is classes allocated in traditional way (init_instance). Example would be C_SVM in CoM:
C_SVM can be seen as opaque for engine - no interest in individual fields, only doing reflection from time-to-time.

Current idea:

  • Separate this 2 use-cases
  • provide support for MEM_-framework, similar to #79
  • provide support for plain-classes, similar to fc940fc, except no need to storage_at customization
  • Remove C_SVM, C_FightAI from builtin's fixing CoM and LHvier issues

Specially I'm not sure, if it's possible to have MEM_Free, after setting instance. In that case - dangling reference inside instance.

Hm but everything inside it is allocated using the MEM_ framework, right? So I'd think that the script is responsible for free-ing all of that memory anyways except when the engine shuts down in which case it can just be freed by the C++-implementation.

provide support for plain-classes, similar to fc940fc, except no need to storage_at customization

Maybe I'm missing something but that doesn't seem possible. Since there are many classes that might be opaque to C++ their size must be dynamic, which must lead to a heap allocation where we can't access a member of a given class by an offset from its this-pointer.

Remove C_SVM, C_FightAI from builtin's fixing CoM and LHvier issues

You can already do that by manually calling ::register_ on the classes you want (e.g. c_gil_values::register_(vm) instead of using phoenix::register_all_script_classes.

Try commented

So I'd think that the script is responsible for free-ing all

Yes, we just need to "not crash" on bad-access from script. Specially when those cases caused by lack of Ikarus features. (some objects are not mapped to script memory)

Maybe I'm missing something but that doesn't seem possible.

True. Need to look for good enough solution. My current understanding, that class-support can be broken into cases:

  1. Normal access, from script - works already
  2. Access to memory allocated by MEM_ - works via mem_traps
  3. Mapped memory.
    In G2 address of pointer oGame* seem to be fixed, and Ikarus uses so to traverse pointer-chain;
    Symbol table is also accessed (we probably don't care, yet here a few good use-case examples):
class ZCPAR_SYMBOL {
	var string NAME;
	var int NEXT;
	var int CONTENT;
	var int OFFSET;
	var int BITFIELD;
	var int FILENR;
	var int LINE;
	var int LINE_ANZ;
	var int POS_BEG;
	var int POS_ANZ;
	var int PARENT;
};

My plan was to extend mem-trap system to have non-trivial address mapping here - create illusion of g2-like strings. for plain data, also can be useful when class layout is not 1:1.
4. Access to structs allocated by MEM_.
Here we have good hook-point in mem_ptrtoinst - analog if reinterpret_cast<T*>.

BAR_HP = MEM_PTRTOINST(MEM_GAME.HPBAR);
BAR_HP.ZCVIEW_VPOSX = ...

While, writing this comment realize: BAR_HP still have to use what I call call-back memory, similar to case 3.
5. C_SVM, C_FightAI, C_GIL_VALUES - no pointer-like magic, only trivial re-declaration here.
Normally this case not interacting with pointer-magic - your opaque-class idea works well here.
6. [phoenix] vm: accessing member "C_ITEM.HP" without an instance set
Need to chaise that - apparently script allocated item, for the sake of reading it's properties

Hm, probably need to decouple it into ticket and close this one; for now I'll continue with case-study. Probably mem_trap + some extras will work good enough.