Python Bot Question
Opened this issue · 28 comments
First I'd like to apologize for creating an issue to get in contact with you, but I couldn't find another way to do so.
I've been learning python and I've also been playing Super Hexagon recently, so I figured I would try my hand at creating a bot for it.
I started out trying to use SimpleCV to parse the shapes and control the player position based on that, but SimpleCV just can't do it fast enough. It's only able to screenshot the game and parse the positions of everything at 6 FPS which is not fast enough to keep up with the speed of Super Hexagon.
I've decided to try a similar technique to what you've done by reading the memory but I've never done something like this with python.
I've found this about reading memory in python:
http://stackoverflow.com/questions/1794579/how-can-i-read-the-memory-of-another-process-in-python-in-windows
I was wondering if you have any insight as to whether this would work or not. If so, I would assume that on the stackoverflow page that "pid" would be the superhexagon process and would the "address" location be the offsets that I've seen in your bot?
So that stack overflow code does work, but it seems that either I'm trying to put the wrong memory addresses in or the base pointer and offsets in your code don't work for me.
For instance, the base pointer in your code is: 0x694B00
and the offset for number of slots is: 0x1BC
which would give a location of: 0x694CBC
However, when I went through a memory scanning process to find the number of walls, it was located at the address: 0x3A69574
So now I'm a bit lost as to how to find the memory locations of the other pieces of information I would need such as PlayerAngle, WorldAngle, FirstWall, etc...
I did some research on memory and pointers since I had not worked with them before and was finally able to figure out what I was doing wrong.
Hey,
Sorry I didn't reply earlier; I totally missed it. The number of walls is in several memory locations, so it is very likely that you will find it at a different address. Unless the game received a patch, these addresses should still work though.
As I read it, it seems to be working for you now? Otherwise, feel free to reopen this issue (or create a new one) to ask away.
Yes, they are working for me now, I was using them in an incorrect way at first. Do you happen to know a way to get the base address of a process in python? I currently have to open Super Hexagon, open the process in cheat engine, and put the base pointer (694B00) in to get the base address which I manually put into my python program before I run it.
I found this stackoverlfow page, but it seems GetModuleHandle only works if the calling process also loaded the module in question.
http://stackoverflow.com/questions/13045864/python-how-to-get-the-start-base-address-of-a-process
What do you need the base address of the Super Hexagon process for? The superhexagon.exe binary I get from Steam has ASLR disabled, so it should always be loaded at the same address.
In any case, you can use the Toolhelp32 functions to get that information. Functions of interest are CreateToolhelp32Snapshot, Process32First, Process32Next, Module32First and Module32Next.
With these Windows API functions, you can iterate over all processes to find the target process, and then iterate over all modules in that process to find the superhexagon.exe module.
Well I suppose I'm still a bit confused about the subject then. When I close and reopen Super Hexagon sometimes the memory addresses change. For instance, when I was searching for NumSlots, the first address I got was like 03A69574
but then a day or so later I was working on my python bot and the address wasn't working, when I went to scan for it again I discovered it was no longer at that address.
Oh I see. Indeed, these things are usually dynamically allocated and may be at different addresses in successive executions.
However, ASLR (address space layout randomization) is disabled and the game stores a reference to the 'app' at a known location that will never change.
You can do something along these lines (pseudocode):
appBase := ReadInt32(0x694B00) # the value of 'appBase' might be different after every execution, but we don't care; we know that at appBase + offset we can find what we want
numSlots := ReadInt32(appBase + 0x1BC)
print(numSlots)
You will always need to perform the read from address 0x694B00 to find where the 'app' has been allocated, and then use the other offsets relative to that read out value (appBase).
Instead of attaching Cheat Engine and reading out the value that you get from 0x694B00, you can do that in your bot, and then use this as the app base to get the information you need.
If this doesn't clear things up I'll try to whip up some python code to illustrate this better.
So is reading the base pointer (0x694B00) supposed to be done with the ReadProcessMemory function? Because I was under the impression that the pointer was only accessible inside Super Hexagon's address space. (I've never worked with memory directly before, or lower level programming languages.)
Yes - ReadProcessMemory can be used to read memory from other processes.
If our code were to be executed inside Super Hexagon process, we could simply dereference the pointer to read out the value it points to. Since we are outside of the process though, there is no way for us to directly interact with a different process' virtual memory - that would be catastrophic. Windows provides us with an interface to do this though: Read/WriteProcessMemory (assuming we have the proper privilege levels).
Maybe this will help you:
from ctypes import *
from ctypes.wintypes import *
from time import sleep
OpenProcess = windll.kernel32.OpenProcess
ReadProcessMemory = windll.kernel32.ReadProcessMemory
FindWindowA = windll.user32.FindWindowA
GetWindowThreadProcessId = windll.user32.GetWindowThreadProcessId
#
# Warning: absolutely no error checking, leaking handles, ..
#
# Find the Super Hexagon window by title (any window with that title might be returned):
windowHandle = FindWindowA(0, 'Super Hexagon')
# Get the process id from the window handle:
processId = c_int()
GetWindowThreadProcessId(windowHandle, byref(processId))
# Open process for reading & writing:
processHandle = OpenProcess(0x0010 | 0x0020 | 0x0008, 0, processId.value)
# Read out the app base (this is not the program / module base, the 'app' is just a huge object):
appBase = c_int()
numRead = c_int()
ReadProcessMemory(processHandle, 0x694B00, byref(appBase), 4, byref(numRead))
# Read out player angle:
playerAngle = c_int()
while True:
ReadProcessMemory(processHandle, appBase.value + 0x2958, byref(playerAngle), 4, byref(numRead))
print 'Player angle: %3d' % playerAngle.value
sleep(1.0)
As a reference, in my original code I read out the appBase with ReadProcessMemory (well, a function that wraps around it) as well: https://github.com/zku/SuperHaxagon/blob/master/SuperHaxagon.cpp#L146
Ah I see now, it makes sense now. Thank you very much for your help, I think I should be good to go now.
Wonderful, good luck!
(and keep me posted on your bot progress if you don't mind; I'd love to see one that doesn't cheat / teleport around like mine does)
Will do. I really wish I could have gotten SimpleCV to work, it seemed like a harder challenge than reading memory values. It was recognizing the center polygon, player, and lanes, but was only doing it at 6 FPS.
I'm a bit confused as to what information numWalls and firstWall give, I was watching the memory values as I played the game and I couldn't understand what they mean. For instance, a shape would appear and numWalls would change to something like 20.
firstWall is the first loaded wall that can be read out from memory; it says nothing about the walls position or distance, and numWalls tells you how many walls are currently loaded in memory. Note that the actual game field is much larger than what is visible, and more walls are loaded than you can see. I hope this picture clears some things up: https://www.dropbox.com/s/b3arjxy41hbtnea/vis.png
I've added some comments in the source file about the width of a wall. Also note that the width of a wall starts out at some fixed size (per wall), and once the wall reaches the center, it slowly loses width until it has width 0 (or a negative value). Same goes for the distance: the wall spawns at some distance from the center and slowly goes towards 0 (or negative values). A wall with a distance <= 25 (or so, don't remember exactly) will not cause any collisions anymore, and neither will a wall with a width of <= 0.
I wrote a small script in MATLAB to illustrate this by taking a snapshot of the entire game state every couple of milliseconds that could further illustrate this, but I can't provide a recording of this right now / this weekend, so if you are still stuck I'll try to upload it sometime Sunday evening.
I think most of my confusion is just coming from what data type I should be using to read/store the walls.
In your code you use struct Wall
and then you make it into a vector on line 141 with std::vector<Wall> walls;
. Would the analogous python type be a list?
The only other thing I don't fully understand is line 241: std::vector<DWORD> minDistances(numSlots, -1);
I understand it's creating an array of DWORD types, but what is minDistances? A function of some sort?
I apologize for all my questions, this is my first time reading C++ code. I'm in no hurry for a response, so please take your time if you're busy.
In python, you could define your own data structure for the walls and just perform large enough reads and manually convert the bytes to the respective fields (mostly just int
s). You could also make use of the ctypes
library and define and define the structure layout accordingly: http://docs.python.org/2/library/ctypes.html#ctypes.Structure
Note that I don't make it into a vector on line 141; I declare a vector (think array for now) of multiple walls.
The minDistances
is the name of the vector, similar to the construct you saw above. I simply provide the object constructor with more information: I tell it to create a vector of size numSlots
and have them all initialized to -1. I use that vector to find out how close the nearest wall is in every slot.
Here is the MATLAB illustration I mentioned: https://www.dropbox.com/s/pcrij4unir1rscz/illustration.mp4
I've just gotten back around to working on my project and I was going to implement the portion that would read/store the walls. Like you suggested above, I was going to make a data structure for the walls and convert the bytes to their respective fields, but I'm not sure how I should determine the size of the read to use.
In your code on line 165 you use sizeof(Wall) * numWalls)
but I don't know if python has a function similar to sizeof
.
I assume since you say that the size of Wall should be 0x14 bytes total that I could just multiply the decimal number 20 by the value from numWalls()
?
The last question I have is what the purpose of some of the fields in your Wall struct are for.
Your Wall struct is defined as:
struct Wall
{
DWORD slot;
DWORD distance; // Should be signed integer as well!
BYTE enabled; // Actually, these 4 bytes make up a signed integer containing the width of a wall!
BYTE fill1[3]; // Actually, these 4 bytes make up a signed integer containing the width of a wall!
DWORD unk2;
DWORD unk3;
};
You showed me in an illustration that explained some of it, but why are there two fields that both seem to hold the width? Is it that you only use the enabled
field and fill1[3]
is just to make it into 4 bytes?
I assume unk2, unk3 stand for 'unknown' and that you don't use them except for the purpose of getting the correct size from the Wall struct?
Yes, you could read it all out in one big read of size numWalls * 0x14. The reason for the unk2 and unk3 fields is exactly that: I didn't bother finding out what they are as they were irrelevant for my purposes - I simply know the full size of the wall structure because that was evident from reversing the code.
About enabled/fill1 and width: I misinterpreted the data in the struct. There is no such thing as an enabled flag; it's actually the width (which is 4 bytes long, a signed integer). So instead of having a 1-byte enabled flag and 3 filler-bytes, we just have the 4 bytes long width.
I'm having some trouble with reading the walls data with the large enough read we were talking about.
My current code is here: https://github.com/Darkman802/super-hexagon-bot/blob/master/bot.py
I call the read_walls() function on line 82. It calculates the read length by doing 0x14 * self.get_num_walls()
and passes that length with the memory address to the read_bytes function on line 36.
The problem I'm having is that ReadProcessMemory is only returning the current value of first_wall.
I have a function called show_data on line 161 that I was using to try and show some information that might be helpful for me to debug the problem. Each output has three lines:
- Game information
- memory address, read length
- data returned by read_walls, the list that would have the large read split into separate walls
It was outputting the following while I played the game for a few seconds:
{'num_slots': 6, 'world_angle': 112, 'player_slot': 0.7, 'player_angle': 44, 'num_walls': 9, 'first_wall': 5}
60845432 180
b'\x05' [b'\x05']
{'num_slots': 6, 'world_angle': 80, 'player_slot': 0.7, 'player_angle': 44, 'num_walls': 13, 'first_wall': 5}
60845432 260
b'\x05' [b'\x05']
{'num_slots': 6, 'world_angle': 49, 'player_slot': 0.7, 'player_angle': 44, 'num_walls': 14, 'first_wall': 5}
60845432 280
b'\x05' [b'\x05']
{'num_slots': 6, 'world_angle': 18, 'player_slot': 0.7, 'player_angle': 44, 'num_walls': 14, 'first_wall': 5}
60845432 280
b'\x05' [b'\x05']
{'num_slots': 6, 'world_angle': 346, 'player_slot': 0.7, 'player_angle': 44, 'num_walls': 14, 'first_wall': 0}
60845432 280
b'' []
{'num_slots': 6, 'world_angle': 318, 'player_slot': 1.9, 'player_angle': 114, 'num_walls': 14, 'first_wall': 0}
60845432 280
b'' []
{'num_slots': 6, 'world_angle': 284, 'player_slot': 2.8, 'player_angle': 170, 'num_walls': 25, 'first_wall': 0}
60845432 500
b'' []
{'num_slots': 6, 'world_angle': 253, 'player_slot': 3.3, 'player_angle': 198, 'num_walls': 25, 'first_wall': 0}
60845432 500
b'' []
{'num_slots': 6, 'world_angle': 222, 'player_slot': 3.5, 'player_angle': 212, 'num_walls': 25, 'first_wall': 0}
60845432 500
b'' []
{'num_slots': 6, 'world_angle': 190, 'player_slot': 3.5, 'player_angle': 212, 'num_walls': 25, 'first_wall': 0}
60845432 500
b'' []
{'num_slots': 6, 'world_angle': 159, 'player_slot': 4.8, 'player_angle': 289, 'num_walls': 25, 'first_wall': 0}
60845432 500
b'' []
{'num_slots': 6, 'world_angle': 128, 'player_slot': 5.9, 'player_angle': 352, 'num_walls': 25, 'first_wall': 0}
60845432 500
b'' []
{'num_slots': 6, 'world_angle': 149, 'player_slot': 1.2, 'player_angle': 69, 'num_walls': 25, 'first_wall': 0}
60845432 500
b'' []
{'num_slots': 6, 'world_angle': 181, 'player_slot': 2.2, 'player_angle': 132, 'num_walls': 25, 'first_wall': 0}
60845432 500
b'' []
{'num_slots': 6, 'world_angle': 212, 'player_slot': 5.8, 'player_angle': 345, 'num_walls': 25, 'first_wall': 0}
60845432 500
b'' []
{'num_slots': 6, 'world_angle': 243, 'player_slot': 5.5, 'player_angle': 331, 'num_walls': 13, 'first_wall': 0}
60845432 260
b'' []
Hmhm, do you ever adjust the buffer length? self.buffer = c_char_p(b"data buffer")
I don't have too much experience with the ctypes module, but I'm assuming this buffer won't automagically resize, in which case it will only be able to hold a couple of bytes (len("data buffer")). Also, c_char_p is probably the wrong data type for this:
class ctypes.c_char_p
Represents the C char * datatype when it points to a zero-terminated string. For a general character pointer that may also point to binary data, POINTER(c_char) must be used. The constructor accepts an integer address, or a string.
Do you have any suggestions for the correct data type, or a different approach to use? I'm not sure where to go from here.
This seems to work. Disclaimer: I don't really know how to use the ctypes module. No idea if these ctypes data types leak if they aren't freed explicitly.
#!/usr/bin/env python
import ctypes
from ctypes import windll
from struct import pack, unpack
from time import sleep
OpenProcess = windll.kernel32.OpenProcess
ReadProcessMemory = windll.kernel32.ReadProcessMemory
def ReadBytes(processHandle, address, size):
localBuffer = (ctypes.c_ubyte * size)()
numRead = ctypes.c_uint(0)
ReadProcessMemory(processHandle, address, ctypes.byref(localBuffer), size, ctypes.byref(numRead))
return ''.join(pack('<B', localBuffer[i]) for i in xrange(size))
def main():
processId = 6868
processHandle = OpenProcess(0x1f0fff, False, processId)
appBase = unpack('<I', ReadBytes(processHandle, 0x694B00, 4))[0]
while True:
sleep(0.5)
numWalls = unpack('<I', ReadBytes(processHandle, appBase + 0x2930, 4))[0]
for i in xrange(numWalls):
# Read wall for wall. Could also read the entire thing in one big chunk
# and then get the wallData from chunk[i*0x14:i*0x14+0x14].
wallData = ReadBytes(processHandle, appBase + 0x220 + i * 0x14, 0x14)
slot = unpack('<i', wallData[0: 4])[0]
distance = unpack('<i', wallData[4: 8])[0]
width = unpack('<i', wallData[8:12])[0]
print('[wall %2d] slot: %d, distance: %d, width: %d' % (i, slot, distance, width))
if __name__ == '__main__':
main()
I have it working to some extent, but it seems to make the wrong choice sometimes (usually on the same type of wall pattern).
My logic that determines the target slot is here: https://github.com/Darkman802/super-hexagon-bot/blob/master/bot.py#L162
The last check of the if block on line 168: -1 < wall['slot'] < (num_slots - 1)
was because some of the walls were saying they were in slot number 22, which is a bit weird.
I noticed something extra about the walls that the bot hits. On the first level (hexagon) it usually hits the walls that have a brighter color to them. If it's about to hit a wall and the wall changes color to be a bit darker like the other walls, the bot will immediately change slots to avoid it.
I don't have a clue what could be causing the bot to hit walls that are brighter, or how that ties in to the wall data that is read.
It seems I was mistaken, the walls are only brighter because the player is in the lane.
Just wondering if you were still around. I left the project alone for a while, but I'm wanting to finish it now and I'm still having the same problem from the comment right after your last one.
You might have to take the slot modulo the number of slots; not sure why you are getting a number 22 there sometimes.
I can't really help you with the bot logic, I've never finished this in my bot either. As long as you keep jumping to empty slots, this should work with instantaneous movements (even though that's cheating).
When I started having trouble I tried to recreate your logic using modulo and such as you have now suggested, but I kept having the same problem. It's strange, it will choose the correct lane 80% of the time, but will almost always choose incorrectly on a certain wall pattern.
I don't know if this matters or not, but when I recreated your logic I still didn't understand the use of modulo. Perhaps I didn't recreate it properly if I didn't understand why you used it? hmm, not sure.