Issues with TheChroniclesOfMyrtana mod
Try opened this issue · 23 comments
I'm looking into KoM startup routine, to see if it's possible to have adequate support for this mod.
Before phoenix, I was able to react stating location and play tutorial segment on a ship.
Issues so far:
- Crash in
symbol::set_int
C++ side:
void symbol::set_int(std::int32_t value, std::size_t index, const std::shared_ptr<instance>& context) {
...
if (is_member()) {
...
} else {
// _m_value is null
std::get<std::unique_ptr<std::int32_t[]>>(_m_value)[index] = value;
}
}
Script call-stack:
"HOOKENGINEF"
"INIT_FEATURES_GAMESTART"
"INIT_GAMESTART"
"INIT_GLOBAL"
Apparently it's related to assignment to HOOKENGINEF.FUNCTION
(function override?)
- No error handling in
opcode::div
andopcode::mod
If script divides by zero - whole application crashes
- Crash in
allocate_instance
, when called for"$INSTANCE_HELP"
Most likely not intended by script - happens late in initialize routines, and everything is supper broke there, due to no DMA support.
auto* parent = find_symbol_by_index(sym->parent()); // parent ==nullptr here
while (parent->type() != datatype::class_) {
parent = find_symbol_by_index(parent->parent());
}
- illegal mutable access of const symbol
[phoenix] vm: internal exception: illegal mutable access of const symbol WIN_GETLASTERROR.CALL
[phoenix] vm: internal exception: illegal mutable access of const symbol KMLIB_INITIALIZEGAMESTART._KMLIB_INITIALIZEGAMESTART_PTR
[phoenix] vm: internal exception: illegal mutable access of const symbol CALLINT_RETVALISFLOAT
[phoenix] vm: internal exception: illegal mutable access of const symbol CALLINT_PUTRETVALTO
[phoenix] vm: internal exception: illegal mutable access of const symbol CALLINT_NUMPARAMS
Proposed fix: introduce vm_allow_const_cast_access
, flag similar to vm_allow_null_instance_access
- Minor issue with de-compiled sources:
Local variables were displayed as global. I'm aware, that Daedalus has no real stack-variables, so just readability issue.
// CATINV_NEXTNONACTIVEITEM.* should be local variables
var int CATINV_NEXTNONACTIVEITEM.I = 0;
instance CATINV_NEXTNONACTIVEITEM.L(ZCLISTSORT)
instance CATINV_NEXTNONACTIVEITEM.ITM(C_ITEM)
func int CATINV_LASTNONACTIVEITEM(var int LIST, var int MAX) {
...
}
- Allow for receiving symbols directly
This is required to implement control-flow functions in ikarus. Basically my plan is to:
vm.override_function("repeat", [this](phoenix::symbol* var, int count){ repeat(var,count); })
And access(and memorize)var
as a reference to loop-counter.
Proposed solution: remove check for phoenix::symbol
in check_external_params
, rest seem to work as-is
- Some stack introspection?
Followup torepeat
. All together makes workflow:
// script
repeat(i, 32)
...
end; // variable-token
// c++
void Ikarus::repeat(phoenix::symbol* var, int count) {
auto end = vm.find_symbol_by_name("end");
if(end==nullptr)
return;
auto& called_from = ...; // need to get parent context
for(uint32_t i=called_from.pc(); i<vm.size(); ) {
auto inst = vm.instruction_at(i);
if(inst.op==phoenix::opcode::pushv && inst.symbol==end->address()) {
// inject jump-back to have a loop
break;
}
i += inst.size;
}
}
PS:
Probably it's better to have one ticket instead of million small ones.
Thanks, I'll look into it. Just real quick:
Minor issue with de-compiled sources:
I do have an updated version of the decompiler up on lmichaelis/phoenix-studio. I'll redo the decompilation for tcom-decompilation in just a sec.
Done. Files are now more helpfully labeled. The code you are referencing here can now be found in Source/CATINV_NEXTNONACTIVEITEM.d:
func int CATINV_NEXTNONACTIVEITEM(var int LIST, var int MAX) {
var C_ITEM ITM;
var ZCLISTSORT L;
var int I;
I = 0;
WHILE((LIST) && ((I) < (MAX)));
L = _^(LIST);
if (HLP_IS_OCITEM(L.DATA)) {
ITM = _^(L.DATA);
if (((ITM.FLAGS) & (ITEM_ACTIVE)) != (ITEM_ACTIVE)) {
BREAK;
};
I += 1;
};
LIST = L.NEXT;
END;
return I;
}
You can find the old version of the decompilation in the old
branch.
Done. Files are now more helpfully labeled.
Just checked out new code. Apparently LeGo.d
is gone, and maybe other library files.
Yes they are now scattered throughout Source/
I have to classify them again. They should be easy to pull out though since the filenames are now actually helpful.
I've added 2 more things for control flow.
Basically need to develop a way to execute at loops:
repeat(i, 32) // ikarus-function. Here need to have a way to memorize 'i' and scout byte-code for 'end'
<body>
end; // variable-token. Currently thinking of replacing it to `be` + `b`
Here I'm planning to inject native function in replace of 'end;'
Crash in
symbol::set_int
This should be fixed now. Because the script will assign function parameters to local variables, this caused issues with higher-order functions since no memory was actually allocated for the local (here HOOKENGINEF.FUNCTION
):
Crash in allocate_instance, when called for "$INSTANCE_HELP"
Yeah, that would be a null-pointer de-reference. I've replaced with an exception :)
illegal mutable access of const symbol
This will require quite a bit of rework in the script. Currently, it's not even possible to change the value of a const
symbol manually which would probably be a good thing to have anyways. I'll have to move the const-ness check into the VM and probably create a new exception type for it.
Allow for receiving symbols directly
This already works for instances. It won't work for general variables right now since I can't verify that that external will actually be called with a variable (it could just be called with a literal).
If you'll scout the bytecode for end;
anyways, couldn't you just loop inplace without touching the variable?
Some stack introspection?
Basically, you just need access to the call-stack and a way to set the program counter, right?
Currently, it's not even possible to change the value of a const symbol
Are you sure? At least in place, where exception been hit, there is access:
void symbol::set_int(std::int32_t value, std::size_t index, const std::shared_ptr<instance>& context) {
if (is_const()) {
//throw illegal_const_access(this);
}
It won't work for general variables right now
I've checked source code, seem to be OK: vm::pop_reference()
is simply throwing exception, for immediate variables. This is more than fine for use-case. Loop counter can't be a literal anyway.
If you'll scout the bytecode for end; anyways, couldn't you just loop inplace without touching the variable?
I don't really see how... Yet, I'm welcome for ideas, because scouting byte-code is clearly not a great workflow.
For clarity plan is: implement conditional jump at repeat
and increment+jump at end. end
.
As a side idea: have a naked calls, with no stack entry, for override_function
can be an alternative. Probably a better one even:
end
can be implemented than as pushv %i
+ be[naked]
to implement +1 and jump.
Are you sure? At least in place, where exception been hit, there is access:
Yes, that's what I mean. If you try to set the value of a const variable, it will throw. That would have to change, of course, as I mentioned.
Loop counter can't be a literal anyway.
In this case, yes. That's not safe everywhere though. Maybe there is some way around making function overrides unsafe like that though, not sure.
I don't really see how...
If you override repeat
, then you'll get control at the address. You could then scout for end;
and manually execute the instructions in between (of course, this would require access to vm::exec()
or some way of running code until a given pc is reached). You just repeat that until i == 0
. No bytecode tinkering necessary. That would be difficult to do anyways, considering that jumps reference a fixed byte offset from the beginning of the script's code section.
If you override repeat, then you'll get control at the address
I need to think about it. So far not going smooth: repeat
is still a function call, so pc
state will be restored after call is complete.
You just repeat that until i == 0
Nit: other way around. repeat(i,32)
is equivalent to for(; i<32; ++i)
.
So far naked call, with no stack frame seem to be a good versatile solution.. except they need access to vm-variables stack to be usefull...
Testing latest version of phenix
. Apparently phoenix::vm::init_instance
is broke, always throwing exception:
1 libstdc++-6!.cxa_throw
2 std::__throw_bad_variant_access variant 1305
3 std::__throw_bad_variant_access variant 1313
4 std::get<3ull, std::unique_ptr<...>>
5 std::get<std::shared_ptr<phoenix::instance>, std::unique_ptr<...>>
6 phoenix::symbol::get_instance script.cc 415
7 phoenix::vm::init_instance<phoenix::c_spell> vm.hh 210
8 phoenix::vm::init_instance<phoenix::c_spell> vm.hh 194
9 operator() spelldefinitions.cpp 12
10 std::__invoke_impl<void, SpellDefinitions::SpellDefinitions(phoenix::vm&)::<lambda(phoenix::symbol&)>&, phoenix::symbol&>(std::__invoke_other, struct {...} &) invoke.h 61
11 std::__invoke_r<void, SpellDefinitions::SpellDefinitions(phoenix::vm&)::<lambda(phoenix::symbol&)>&, phoenix::symbol&>(struct {...} &) invoke.h 111
12 std::_Function_handler<void(phoenix::symbol&), SpellDefinitions::SpellDefinitions(phoenix::vm&)::<lambda(phoenix::symbol&)>>::_M_invoke(const std::_Any_data &, phoenix::symbol &) std_function.h 291
13 std::function<void (phoenix::symbol&)>::operator()(phoenix::symbol&) const std_function.h 560
14 phoenix::script::enumerate_instances_by_class_name(std::basic_string_view<char, std::char_traits<char>>, std::function<void (phoenix::symbol&)> const&) script.cc 189
15 SpellDefinitions::SpellDefinitions spelldefinitions.cpp 10
16 std::make_unique<SpellDefinitions, phoenix::vm&> unique_ptr.h 962
17 GameScript::initCommon gamescript.cpp 283
18 GameScript::GameScript gamescript.cpp 73
19 GameSession::GameSession gamesession.cpp 124
20 operator() mainwindow.cpp 898
...
Seems symbol::_m_type
somehow become inconsistent with symbol::_m_value
Offended commit: f0d6751
Whoops. And that's why I need script tests. It was an incorrect switch fall-through. Fixed in dc0e340.
You just repeat that until i == 0
Nit: other way around.
repeat(i,32)
is equivalent tofor(; i<32; ++i)
.So far naked call, with no stack frame seem to be a good versatile solution.. except they need access to vm-variables stack to be usefull...
So after thinking about the symbol*
parameter thing for a bit, I think that there is no way around it. I'll implement it if you think that's the way to go. As for my idea on how to implement repeat
even without naked calls, see below:
(Daedalus code)
func void do_something() {
var int i = 0;
repeat(i, 10);
wld_settime(10, i); // or whatever
end;
return;
}
(C++ code)
void setup(phoenix::vm& vm) {
// misc vm setup ...
vm.override_function("repeat", [&vm](phoenix::symbol* sym, int n) {
// introspect call stack
auto caller = vm.caller();
// scout bytecode for END; (you might want to cache the result)
auto pc = caller.pc;
auto instr_count = 0;
for (auto i = 0u; i < 50 /* arbitrary limit */; ++i) {
auto instr = vm.instruction_at(pc);
// either check for symbol index directly ...
if (instr.op == phoenix::opcode::pushv && instr.symbol == 0x165b) {
break;
}
// ... or for symbol name
if (instr.op == phoenix::opcode::pushv) {
auto* sym = vm.find_symbol_by_index(instr.symbol);
if (sym != nullptr && sym.name == "END") {
break;
}
}
pc += instr.size();
instr_count += 1;
}
// run the original instructions n - 1 times
for (int i = sym->get_int(); i < n - 1; ++i) {
sym->set_int(i);
vm.unsafe_exec_from(caller.pc, instr_count);
}
sym->set_int(n - 1);
// exit; the bytecode after repeat() will run one more time
});
}
Obviously, the APIs vm::unsafe_exec_from()
and vm::caller()
don't currently exist but they would be easily implemented.
As for my idea on how to implement repeat even without naked calls, see below:
Hm, can't say that I like it:
- Not clear how-to handle
break
andcontinue
, sincevm.unsafe_exec_from(caller.pc, instr_count)
is complete back-box. - Not clear how-to deal with zero-sized loop "the bytecode after repeat() will run one more time"
- In principle, similar take can be done already, by inheritance, from
phoenix::vm
and accessingexec
,push
/pop
and etc
Naked call also do suffer from inability to use push
/pop
directly.
You're right I totally forgot about continue;
and break;
. With those in might, I believe sub-classing the VM is actually a good idea. Here's my new take.
Things considered
repeat()
continue;
break;
end;
The idea
Let's subclass phoenix::vm
and override push_reference
. Also use override_function
to override repeat()
using a symbol*
to reference the loop variable. Example (only supports a single nested loop):
class my_vm : phoenix::vm {
public:
my_vm(phoenix::script&& s) : phoenix::vm(std::forward(s), /* flags ... */) {
this->override_function("repeat", [this](phoenix::symbol* s, int n) {
return this->repeat(s, n);
});
}
void push_reference(phoenix::symbol* sym, uint8_t i) override final {
if (sym->name() == "END" && _m_loop_begin) {
this->_m_loop_end = this->pc();
this->jump(*_m_loop_begin);
return;
}
if (sym->name() == "CONTINUE" && _m_loop_begin) {
this->jump(*_m_loop_begin);
return;
}
if (sym->name() == "BREAK") {
this->jump(this->find_loop_end());
// reset
_m_loop_begin = std::nullopt;
_m_loop_end = std::nullopt;
return;
}
phoenix::vm::push_reference(sym, i);
}
private:
uint32_t find_loop_end() {
// discover `pushv END` like in my last comment and assign it to _m_loop_end
return *_m_loop_end;
}
phoenix::naked_call repeat(phoenix::symbol* i, int count)
if (!_m_loop_begin) {
// save loop start
_m_loop_begin = this->caller().pc;
}
if (i->get_int() >= count) {
// jump to loop end
this->jump(this->find_loop_end());
// reset
_m_loop_begin = std::nullopt;
_m_loop_end = std::nullopt;
return {};
}
// increment i and continue loop
i->set_int(i->get_int() + 1);
this->jump(this->caller().pc);
return {};
}
private:
std::optional<uint32_t> _m_loop_begin, _m_loop_end;
};
Obviously there are some optimizations possible here (like looking up the symbol indices of CONTINUE
, BREAK
and END
beforehand) and it only supports one nesting level but those things can be implemented fairly easily. This would also give you the power to change implementation details as you like.
I'm not sure how this process would work for other custom Union constructs but it seems plausible that something like this could work too. It's also quite fast since you only have to scout the bytecode for END
if you never reach it during loop execution which should be relatively rare.
Changes needed in phoenix
- Function overrides, conditionally without PC-restoration ("naked calls")
- External functions and function overrides can use
symbol*
for any of its arguments - Some functions of the VM marked
virtual
- VM function
caller()
I've pushed prototype for loops support.
CONTINUE
, BREAK
and END
are tracked in VM (guarded by flag), assuming that it's more efficient than virtual
functions based approach.
Added unsafe_jump
function and naked calls to assist with repeat
function.
Already done some testing locally - application side is clean and simple loops do work. Testing of CONTINUE
, BREAK
is todo.
One more big feature to discuss is dynamically allocated memory. Some ikarus code for context:
instance _HT_CREATEPTR.ARR(ZCARRAY)
func int MEM_ARRAYCREATE() {
return MEM_ALLOC(SIZEOF_ZCARRAY);
}
func int _HT_CREATEPTR(var int SIZE) {
PTR = MEM_ARRAYCREATE();
ARR = _^(PTR);
ARR.ARRAY = MEM_ALLOC((SIZE) * (4));
ARR.NUMALLOC = (SIZE) * (4);
ARR.NUMINARRAY = 0;
return PTR;
}
OpenGothic can hook into MEM_Alloc
/MEM_Free
and emulate 32-bit allocator.
However _^(PTR)
is non-trivial. PTR
has backing in plain memory managed by OpenGothic, but has no std::shared_ptr<phoenix::instance>
wrapper.
Another issue is class-declaration. Class ZCARRAY
is defined by script and engine should not be aware of it. So need some solution to threat some classes as plain-structs, with no per-field bindings
Another issue is class-declaration. Class ZCARRAY is defined by script and engine should not be aware of it. So need some solution to threat some classes as plain-structs, with no per-field bindings
There would be a pretty simple way of achieving that which I'll implement later this week.
PTR has backing in plain memory managed by OpenGothic, but has no std::shared_ptrphoenix::instance wrapper.
How would you like to deal with that and/or what do you think you'll need from me? I don't have too much time to look into this at the moment, sorry :/
How would you like to deal with that and/or what do you think you'll need from me?
Only raising for now, do not have great ideas on my end either. Currently experimenting with solution based on proxy-instance:
std::shared_ptr<phoenix::instance> Ikarus::mem_ptrtoinst(ptr32_t address) {
// memory_instance is special case in VM
return std::make_shared<memory_instance>(address);
}
...
int Ikarus::get_mem_int(std::shared_ptr<phoenix::instance> base, symbol* sym, size_t index) {
ptr = base->address + sym->offset + index*sym->sizeof_;
return allocator.readInt(ptr); // map to C++ real memory and return value to VM
}
There would be a pretty simple way of achieving that which I'll implement later this week.
No, rush. Keep in mind that both features should interact together, as script will probably allocate memory for those directly via MEM_Alloc
There would be a pretty simple way of achieving that which I'll implement later this week.
It ended up being not quite as simple as I thought. I've pushed a minimal working implementation to experiment/opaque-instances. Feel free to adjust it to your liking and add it to #74 if you want. Basic usage is:
phoenix::vm vm {...};
phoenix::register_all_script_classes(vm);
// Will actually run the initialization routine for ALL unregistered instances and assign them as an opaque object which is inaccessible to C++-code
vm.initialize_unregistered_instances_as_opaque();
I've pushed a minimal working implementation
Cool, but please do not forget to review #74 first.
Currently I'm prototyping DMA-access to explicitly allocated classes, and need #74 as a base.
About DMA-access:
- Ikarus/LeGo usually do allocate memory explicitly (
mem_alloc
) from 32bit-heap - Once memory-allocated it gets casted to instance via
_^
- Member access then goes as usual. This works in vanilla as member access is trivial ptr+offset pointer-load
In _HT_CREATEPTR
example, I made this work by returning dummy instance:
struct Ikarus::memory_instance : public phoenix::instance {
explicit memory_instance(ptr32_t address):address(address){}
ptr32_t address;
};
memory_instance
has _m_symbol_index==unset
, and _m_type==nullptr
In VM I've added extra logic to get
/set
to check for _m_symbol_index
and call external callback.
Generally almost works... Doesn't work for:
C_SVM
- haven't look deep, but seems to be case, when data is not allocated bymem_alloc
- strings - have no goof solution how to manage them, as we want to work with opaque strings, but Ikarus sometimes writes directly to
.data
and.size
, also assumes gothic2 ZString, while on that.
strings - have no goof solution how to manage them, as we want to work with opaque strings, but Ikarus sometimes writes directly to .data and .size, also assumes gothic2 ZString, while on that.
Oh that would be really bad. It would require a re-design of the entire script/vm implementation since I only support std::string
which is hardcoded everywhere. I have been meaning to do that anyways but here it would be required.
In _HT_CREATEPTR example, I made this work by returning dummy instance:
I'm not sure if this is what you want. fc940fc introduces a storage_at
virtual function for instances (fc940fc#diff-23d603ff3fcd5c9172ed67cd718fa24e44cd0c9a2507bbbeebde2d0c8b84400eR220-R222) to solve the problem of figuring out where the data is actually stored. Previously it would just take static_cast<char*>(instance) + offset
which won't work if your instance doesn't have the correct size.
introduces a storage_at virtual function for instances
I've evaluated this solution. Not quite something that can work for mods:
binding raw-pointer to instance seems to be dangerous. Specially I'm not sure, if it's possible to have MEM_Free
, after setting instance. In that case - dangling reference inside instance.
For instances allocated by MEM_
framework storing ptr32
and calling engine for address translation is way more robust and easier.
Another(partially orthogonal case) is classes allocated in traditional way (init_instance
). Example would be C_SVM
in CoM:
C_SVM
can be seen as opaque for engine - no interest in individual fields, only doing reflection from time-to-time.
Current idea:
Specially I'm not sure, if it's possible to have MEM_Free, after setting instance. In that case - dangling reference inside instance.
Hm but everything inside it is allocated using the MEM_
framework, right? So I'd think that the script is responsible for free-ing all of that memory anyways except when the engine shuts down in which case it can just be freed by the C++-implementation.
provide support for plain-classes, similar to fc940fc, except no need to storage_at customization
Maybe I'm missing something but that doesn't seem possible. Since there are many classes that might be opaque to C++ their size must be dynamic, which must lead to a heap allocation where we can't access a member of a given class by an offset from its this
-pointer.
Remove C_SVM, C_FightAI from builtin's fixing CoM and LHvier issues
You can already do that by manually calling ::register_
on the classes you want (e.g. c_gil_values::register_(vm)
instead of using phoenix::register_all_script_classes
.
So I'd think that the script is responsible for free-ing all
Yes, we just need to "not crash" on bad-access from script. Specially when those cases caused by lack of Ikarus features. (some objects are not mapped to script memory)
Maybe I'm missing something but that doesn't seem possible.
True. Need to look for good enough solution. My current understanding, that class-support can be broken into cases:
- Normal access, from script - works already
- Access to memory allocated by
MEM_
- works viamem_traps
- Mapped memory.
In G2 address of pointeroGame*
seem to be fixed, and Ikarus uses so to traverse pointer-chain;
Symbol table is also accessed (we probably don't care, yet here a few good use-case examples):
class ZCPAR_SYMBOL {
var string NAME;
var int NEXT;
var int CONTENT;
var int OFFSET;
var int BITFIELD;
var int FILENR;
var int LINE;
var int LINE_ANZ;
var int POS_BEG;
var int POS_ANZ;
var int PARENT;
};
My plan was to extend mem-trap system to have non-trivial address mapping here - create illusion of g2-like strings. for plain data, also can be useful when class layout is not 1:1.
4. Access to structs allocated by MEM_
.
Here we have good hook-point in mem_ptrtoinst
- analog if reinterpret_cast<T*>
.
BAR_HP = MEM_PTRTOINST(MEM_GAME.HPBAR);
BAR_HP.ZCVIEW_VPOSX = ...
While, writing this comment realize: BAR_HP
still have to use what I call call-back memory, similar to case 3
.
5. C_SVM
, C_FightAI
, C_GIL_VALUES
- no pointer-like magic, only trivial re-declaration here.
Normally this case not interacting with pointer-magic - your opaque-class idea works well here.
6. [phoenix] vm: accessing member "C_ITEM.HP" without an instance set
Need to chaise that - apparently script allocated item, for the sake of reading it's properties
Hm, probably need to decouple it into ticket and close this one; for now I'll continue with case-study. Probably mem_trap + some extras will work good enough.