tenable/ghidra_tools

Catch edge cases in the suggested variable names

Closed this issue · 2 comments

When I queried ChatGPT with an example insertion sort program, it replied with:

[...

Suggested variable names:
- `uVar1` -> `currentValue`: represents the current value being sorted in the array
- `iVar2` -> `currentIndex`: represents the current index being sorted in the array
- `local_3c` -> `startIndex`: represents the starting index of the array being sorted
- `local_38` -> `prevIndex`: represents the previous index being compared to the current index in the array
- `local_28` -> `array`: represents the array being sorted.
...]

While the extra information is of course helpful, the way the script currently parses it means my variables were renamed to:

arrayrepresentsthearraybeingsorted[0] = 9;

etc.

Maybe, as a failsafe, it would be a good idea to strip special characters, split on spaces, and then take the first word, to prevent this kind of thing from happening.
It seems to be happening at https://github.com/tenable/ghidra_tools/blob/main/g3po/g3po.py#L452
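
For illustration, here is a rough sketch of the failsafe I have in mind. It is not taken from g3po.py: the names `sanitize_suggested_name`, `parse_suggestions`, and `SUGGESTION_LINE` are made up for this example, and the regex assumes the reply uses the `- old -> new: comment` layout quoted above.

```python
import re

# Hypothetical pattern for one suggestion line, e.g.
#   - `local_28` -> `array`: represents the array being sorted.
SUGGESTION_LINE = re.compile(r"^\s*-\s*`?(\w+)`?\s*->\s*(.+)$")


def sanitize_suggested_name(raw):
    """Reduce a suggested name plus trailing commentary to one identifier."""
    # Turn every character that cannot appear in an identifier into a space,
    # so backticks, colons and full stops act as separators.
    cleaned = re.sub(r"[^A-Za-z0-9_]", " ", raw)
    words = cleaned.split()
    # Keep only the first word; the rest is treated as commentary.
    return words[0] if words else None


def parse_suggestions(reply):
    """Map old variable names to sanitized new names (assumed reply layout)."""
    renames = {}
    for line in reply.splitlines():
        match = SUGGESTION_LINE.match(line)
        if not match:
            continue
        new_name = sanitize_suggested_name(match.group(2))
        if new_name:
            renames[match.group(1)] = new_name
    return renames
```

With the reply quoted above, this maps `local_28` to `array` rather than `arrayrepresentsthearraybeingsorted`. It does not check for names that start with a digit or collide with existing symbols, so a real fix would probably want a bit more validation.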

Ah, good catch. I'm not sure why, but these comments seem to appear more frequently with the ChatGPT models (gpt-3.5-turbo) than with the now old-fashioned InstructGPT models (e.g. text-davinci-003). You're right, though: I've been seeing this mess myself lately too. Should be an easy fix.

And, indeed, you've fixed it!