Ghidra processor module for Toshiba TLCS-900/H.
Tested with version 10.3.2.
Work in progress:
- The instruction set should be fully disassembled. If you notice anything missing, please open an issue / pull request!
- Note: Disassemblers usually disagree on presentation / ambiguity:
  - Disabled interrupts: di vs. ei 0x07;
  - Mnemonic qualifiers for word/long sized operands: sll vs. sllw;
  - Flag shown for conditional instructions: jp NZ,XWA vs. jp NE,XWA;
  - Destination register shown when result storage space is larger than loaded values: mul A,(XWA) vs. mul WA,(XWA);
  - Accepting decoding of extra cases that aren't documented: ldir and ldirw can be decoded with 5 other combinations of bits 0..2;
- Semantics are mostly done, and are now being thoroughly tested through pcode emulation.
GHIDRA_INSTALL_DIR=#FIXME
mkdir -p "$GHIDRA_INSTALL_DIR/Ghidra/Processors/TLCS900H"
cp -r . "$GHIDRA_INSTALL_DIR/Ghidra/Processors/TLCS900H"
See ghidra-neogeopocket-loader.
The following sections describe how to verify the processor module's correctness. These can be tested independently.
First, we need a dataset that guarantees "good enough" coverage. Two approaches can be combined to generate it:
Approach #1: Extracting instructions from Neo Geo Pocket ROMs, using an existing disassembler (e.g. MAME's unidasm):
MAME_BUILD_DIR=#FIXME
for i in roms/*; do
  # We only care for (valid) instructions that are 4 or more bytes long,
  # since we can feasibly bruteforce smaller ones
  "$MAME_BUILD_DIR/unidasm" "$i" -arch tlcs900 \
    | sed 's/  */ /g' \
    | cut -d':' -f2- \
    | sed 's/^[ \t]*//g' \
    | grep '^ *.. .. .. .* ' \
    | grep -v ' db$' \
    | sort -u \
    >> 4to7out_ngp
done
Approach #2: Generating candidates by bruteforcing instruction decoding. Instructions of up to 3 bytes are covered exhaustively in a few minutes; longer ones are generated from a subset of all possibilities. MAME's unidasm is particularly well suited for this, since it pads the parsed bytes with dummy bytes until an instruction with a given prefix can be decoded. We can take these cases into consideration when generating candidates that are 4 or more bytes long:
./scripts/dis/1n2.py \
| sed 's/  */ /g' \
| cut -d':' -f2- \
| grep -v '^ *.. .. db' \
| grep -v '^ *.. db' \
| sort -u \
> 1n2out
./scripts/dis/3.py \
| sed 's/  */ /g' \
| cut -d':' -f2- \
| grep -v '^ *.. .. .. db' \
| sort -u \
> 3out
cat 1n2out 3out | sort -u | grep '^ .. .. .. .. ' > 4in
./scripts/dis/expand.py 4in 4 \
| sed 's/  */ /g' \
| cut -d':' -f2- \
| grep -v '^ *.. .. .. .. db' \
| sort -u \
> 4out
./scripts/dis/clean.sh 4out 4
cat 1n2out 3out 4out_clean | sort -u | grep '^ *.. .. .. .. .. ' > 5in
./scripts/dis/expand.py 5in 5 \
| sed 's/  */ /g' \
| cut -d':' -f2- \
| grep -v '^ *.. .. .. .. .. db' \
| sort -u \
> 5out
./scripts/dis/clean.sh 5out 5
cat 1n2out 3out 4out_clean 5out_clean | sort -u | grep '^ *.. .. .. .. .. .. ' > 6in
./scripts/dis/expand.py 6in 6 \
| sed 's/  */ /g' \
| cut -d':' -f2- \
| grep -v '^ *.. .. .. .. .. .. db' \
| sort -u \
> 6out
./scripts/dis/clean.sh 6out 6
cat 1n2out 3out 4out_clean 5out_clean 6out_clean | sort -u | grep '^ *.. .. .. .. .. .. .. ' > 7in
./scripts/dis/expand.py 7in 7 \
| sed 's/  */ /g' \
| cut -d':' -f2- \
| grep -v '^ *.. .. .. .. .. .. .. db' \
| sort -u \
> 7out
./scripts/dis/clean.sh 7out 7
cat 1n2out 3out 4out_clean 5out_clean 6out_clean 7out_clean | sed 's/^[ \t]*//g' | sort -u > 1to7out
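The candidate generation behind this pipeline can be sketched in Python. This is only an illustration of the idea, not the contents of 1n2.py, 3.py or expand.py: the function names `bruteforce` and `expand` and the default fill values are assumptions made for this example.

```python
from itertools import product
from typing import Iterable, Iterator


def bruteforce(max_len: int) -> Iterator[bytes]:
    """Yield every byte sequence of length 1..max_len (exhaustive)."""
    for length in range(1, max_len + 1):
        for combo in product(range(256), repeat=length):
            yield bytes(combo)


def expand(
    prefixes: Iterable[bytes],
    target_len: int,
    fill: Iterable[int] = (0x00, 0x7F, 0x80, 0xFF),  # arbitrary sample values
) -> Iterator[bytes]:
    """Pad each shorter prefix up to target_len using only a few byte values.

    Trying a subset instead of all 256 values per position keeps the number
    of candidates manageable for longer instructions.
    """
    fill = tuple(fill)
    for prefix in prefixes:
        missing = target_len - len(prefix)
        if missing < 1:
            continue  # already long enough, nothing to expand
        for tail in product(fill, repeat=missing):
            yield prefix + bytes(tail)
```

Each candidate would then be written to a file and passed through the disassembler, keeping only the lines that decode to a real instruction.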
Now, we can reassemble this dataset, then attempt to disassemble it again with Ghidra:
# Combine both datasets into one, skipping invalid instructions
sort -u 1to7out 4to7out_ngp \
| grep -v -e '(),' -e ',()' -e '(unknown)' -e 'r[0-9A-F][0-9A-F][WABCDEHL]' \
> out
# Take instruction bytes and write binary
./scripts/dis/dis2bin.py out out.bin
# Disassemble with Ghidra.
# Terminates early if an instruction could not be decoded at a given offset,
# either because it's missing or incorrect in our SLEIGH specification.
# Results are saved in ~/export/1.
GHIDRA_INSTALL_DIR=#FIXME
GHIDRA_PROJECT_DIR=#FIXME
mkdir -p ~/export/
rm -f ~/export/1 && "$GHIDRA_INSTALL_DIR/support/analyzeHeadless" "$GHIDRA_PROJECT_DIR" tlcs900h_headless \
-import out.bin \
-noanalysis \
-overwrite \
-processor 'TLCS900H:LE:32:default' \
-scriptPath ./scripts/dis/ \
-postScript ana2dis.py \
~/export/1
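For reference, the "take instruction bytes and write binary" step can be sketched as follows. This is not the actual dis2bin.py; it assumes each line starts with space-separated two-digit hex bytes followed by the mnemonic, and the names `line_to_bytes` and `listing_to_binary` are hypothetical.

```python
import re
from typing import Iterable

HEX_BYTE = re.compile(r"^[0-9a-fA-F]{2}$")


def line_to_bytes(line: str) -> bytes:
    """Collect the leading hex byte tokens of one disassembly line."""
    out = bytearray()
    for token in line.split():
        if not HEX_BYTE.match(token):
            break  # first non-byte token is assumed to be the mnemonic
        out.append(int(token, 16))
    return bytes(out)


def listing_to_binary(lines: Iterable[str]) -> bytes:
    """Concatenate the instruction bytes of every line into one blob."""
    return b"".join(line_to_bytes(line) for line in lines)
```

Note that this heuristic would misread a mnemonic that happens to look like a hex byte (such as db), which is one reason to filter those lines out beforehand.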
Finally, we can manually compare results between Ghidra vs. MAME:
diff -uw \
<(sed 's/  */ /g; s/_\([0-9]\)/\1/g; s/_P/-1/g; s/0x0*\([0-9a-f]\)/\1/g' ~/export/1) \
<(sed 's/  */ /g; s/0x0*\([0-9a-f]\)/\1/g' out) \
| less
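The normalization applied by the two sed commands can also be written in Python, which may be easier to extend when new presentation differences show up. This is a direct translation under one assumption: the first sed expression is read as "collapse runs of spaces into one". The function names are hypothetical, and the sample strings in the usage note are placeholders rather than real disassembler output.

```python
import re


def normalize_mame(line: str) -> str:
    """Collapse runs of spaces and drop the 0x prefix plus leading zeros."""
    line = re.sub(r" +", " ", line)
    return re.sub(r"0x0*([0-9a-f])", r"\1", line)


def normalize_ghidra(line: str) -> str:
    """Same as normalize_mame, plus the two Ghidra-only rewrites."""
    line = re.sub(r" +", " ", line)
    line = re.sub(r"_([0-9])", r"\1", line)  # s/_\([0-9]\)/\1/g
    line = line.replace("_P", "-1")          # s/_P/-1/g
    return re.sub(r"0x0*([0-9a-f])", r"\1", line)
```

For instance, `normalize_mame("op  X,0x00001234")` gives `"op X,1234"`, and `normalize_ghidra("op  r_1,0x0010")` gives `"op r1,10"`.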
These tests compare trace logs generated by Ghidra vs. a modified version of ares (patch for commit c4a5fcea9), which includes additional logging and dumping features, toggled with environment variables:
- ARES_START: When the given hexadecimal address is reached, dumps memory and registers to files, to be loaded in Ghidra before stepping through instructions;
- ARES_END: Similar to ARES_START, but exits after dumping memory;
- ARES_WATCH: Dereferences and prints the value of the given address whenever it is modified;
First, start ares:
env ARES_START=0x2000e4 ARES_END=0x2009ca ~/opt/ares/desktop-ui/out/ares --system ngpc /media/fn/TOSHIBA-EXT/FN-NUX/cputest.ngc
Make sure you include these hardware events in trace logs:
After target addresses have been reached, process and move all state files to the working directory defined in the Ghidra emulation script:
mv cputest-20230817-113623.log 0x00200046.cputest-20230817-113623.log
mkdir -p ~/code/wip/tlcs900h/tmp/
for i in *.log; do
addr=$(echo "$i" | grep '^0x' | cut -d'.' -f1)
test -n "$addr" || continue
sed -i '/APU I\/O: read/d; /APU Interrupt/d;' "$i"
grep -n 'CPU Interrupt' "$i" | sed 's/^\([0-9]*\):.*:.*(\(.*\))$/\1,\2/g' > ~/code/wip/tlcs900h/tmp/"$addr".int
grep -n 'CPU I/O' "$i" | sed 's/^\([0-9]*\):.*:[ \t]*\(.*\)$/\1,\2/g' > ~/code/wip/tlcs900h/tmp/"$addr".io
grep -n 'APU I/O' "$i" | sed 's/^\([0-9]*\):.*:[ \t]*\(.*\)$/\1,\2/g' > ~/code/wip/tlcs900h/tmp/"$addr".apu.io
grep -n 'APU ' "$i" | sed 's/^\([0-9]*\):[ \t]*\(.*\)$/\1,\2/g' > ~/code/wip/tlcs900h/tmp/"$addr".apu
done
mv /tmp/0x* ~/code/wip/tlcs900h/tmp/
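Each grep/sed pair in the loop above turns matching log lines into "line number,payload" records, where the payload is whatever follows the last colon. A Python equivalent of the `CPU I/O` and `APU I/O` extraction is sketched below; the function name `extract` is hypothetical, and the log line used in the usage note is a made-up placeholder, not real ares output.

```python
import re
from typing import Iterable, List

# Mirrors the sed pattern: ^\([0-9]*\):.*:[ \t]*\(.*\)$
RECORD = re.compile(r"^([0-9]*):.*:[ \t]*(.*)$")


def extract(lines: Iterable[str], needle: str) -> List[str]:
    """Emulate `grep -n needle | sed 's/.../\\1,\\2/'` on a list of lines."""
    records = []
    for number, line in enumerate(lines, start=1):
        if needle not in line:
            continue
        numbered = f"{number}:{line.rstrip()}"  # what `grep -n` would print
        match = RECORD.match(numbered)
        if match:
            records.append(f"{match.group(1)},{match.group(2)}")
        else:
            records.append(numbered)  # sed leaves non-matching lines as-is
    return records
```

With the placeholder input `["boot", "CPU I/O: read 0x6f"]` and needle `"CPU I/O"`, this yields `["2,read 0x6f"]`. Keeping the line number lets the emulation script replay each event at the matching step of the trace.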
Emulate with Ghidra:
GHIDRA_INSTALL_DIR=#FIXME
GHIDRA_PROJECT_DIR=#FIXME
mkdir -p ~/export/
rm -f ~/export/2 && "$GHIDRA_INSTALL_DIR/support/analyzeHeadless" "$GHIDRA_PROJECT_DIR" tlcs900h_headless \
-import roms/CPU_Test_199x_Judge_PD.bin \
-overwrite \
-processor 'TLCS900H:LE:32:default' \
-scriptPath ./scripts/emu/ \
-postScript TLCS900Emu.java \
~/export/2
Finally, compare output trace log against the one from ares:
./scripts/emu/tracediff.py ./0x00200046.cputest.emu.log ./0x00200046.cputest-20230817-113623.log
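The core of such a comparison is finding the first step where the two traces disagree. The sketch below shows that idea only; it is not the actual tracediff.py, it assumes both logs have already been reduced to one comparable line per stepped instruction, and the name `first_divergence` is hypothetical.

```python
from itertools import zip_longest
from typing import Iterable, Optional, Tuple


def first_divergence(
    expected: Iterable[str], actual: Iterable[str]
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (step, expected_line, actual_line) at the first mismatch.

    Steps are 1-based. A trace that ends early shows up as None on its side.
    Returns None when both traces are identical.
    """
    pairs = zip_longest(expected, actual)
    for step, (want, got) in enumerate(pairs, start=1):
        if want != got:
            return step, want, got
    return None
```

Stopping at the first mismatch is deliberate: once one register differs, every later step is likely to differ too, so only the first divergence points at the faulty instruction semantics.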
Example output from CPU Test by Judge, where we start emulating at the program's entrypoint, and all registers matched every stepped instruction, passing all tests:
Mapping some memory addressing modes was a bit challenging, not because they are unusual, but because of the particular way Toshiba decided to encode them:
So far so good, now let's look at how they get included in an instruction:
Usually, variable length tokens like r32b_eam sit at the tail end of an instruction, but here, they happen to be in the middle, followed by 0x38 + an 8/16-bit immediate value imm8/imm16. SLEIGH expects variable tokens at the end of an instruction, which means that constructors had to be duplicated for each addressing mode with a distinct length. TODO: Could context vars avoid this duplication?