PyPlaysMC

Automating Minecraft with screen captures, image processing, and simulated keystrokes.


Documentation

Window Fetcher

This module captures the region of the screen occupied by the Minecraft (MC) instance. The Minecraft window locks the cursor inside itself, so by letting the user move the cursor around, the script can track the min & max x & y positions, building a box around the MC instance and nothing else.

Usage

windowFetcher.py has 2 main uses:

  • Get the dimensions of the MC window
  • Capture the MC window

As explained above, getting the dimensions of the MC window is a long (multi-second) process. It is recommended to start by calling display_countdown(duration), which provides a countdown so you have time to tab over to the MC window. Then call get_window_dimensions(duration, fps); it will capture mouse movement for the given duration (in seconds), sampling fps times per second.

get_window_dimensions(...), once done, returns a dictionary with the screen bounds. This dictionary is then supplied to capture_area(dict), which returns a capture of the MC window. If the window is moved, this will fail, and the program will need to be either restarted or get_window_dimensions(...) called again.
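The bounding-box idea behind get_window_dimensions can be pictured as follows. This is a minimal, self-contained sketch; the function name track_bounds and the dictionary keys are illustrative, not the module's actual API:

```python
def track_bounds(samples):
    """Reduce a series of (x, y) cursor samples to a bounding box.

    Because the MC window traps the cursor, the extremes of the
    samples approximate the window's edges.
    """
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    return {
        "min_x": min(xs), "max_x": max(xs),
        "min_y": min(ys), "max_y": max(ys),
    }

# Example: cursor wandered around a window spanning (100, 50)-(1379, 769)
samples = [(100, 50), (700, 400), (1379, 769), (100, 769)]
print(track_bounds(samples))
# {'min_x': 100, 'max_x': 1379, 'min_y': 50, 'max_y': 769}
```

The longer the capture runs (and the more the user sweeps the cursor into the corners), the closer the extremes get to the true window edges.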

The function get_mouse_locale(sq_size) can be called to capture a square around the mouse. Note that sq_size is a side length, not a radius. Also, even though the MC window renders a crosshair in the middle, the actual system cursor is free to move around the window, so this does not simply capture the center of the screen.
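Cropping a square of side sq_size around a point can be sketched like this (illustrative only; crop_square and its edge-clamping behavior are assumptions, not the module's actual code):

```python
import numpy as np

def crop_square(img, cx, cy, sq_size):
    """Crop a sq_size x sq_size square centered on (cx, cy).

    sq_size is a side length, so the crop extends sq_size // 2
    pixels in each direction; bounds are clamped to the image.
    """
    half = sq_size // 2
    y0, y1 = max(0, cy - half), min(img.shape[0], cy + half)
    x0, x1 = max(0, cx - half), min(img.shape[1], cx + half)
    return img[y0:y1, x0:x1]

# A blank 720p "screen" and a 100x100 patch around the cursor
screen = np.zeros((720, 1280, 3), dtype=np.uint8)
patch = crop_square(screen, cx=640, cy=360, sq_size=100)
print(patch.shape)  # (100, 100, 3)
```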

MC OCR

Usage

In order to run the OCR, import OCRReader.py and pass your debug menu screenshot to read_image(img). It will automatically isolate the text (polarize the image), scan the entire image, and read it for Minecraft-font text. Reading the entire image is somewhat slow; it is more efficient to read only where you need to. Use the following as an example of how to do that. Time metrics are also listed within the example.

from OCRReader import get_line_y, read_at_line_y
import cv2

img = cv2.imread("DebugMenu.png")
# Cut out only relevant text of top left
img = img[:500, :800] 

# It searches for a line which contains XYZ
y_location_of_important_information = get_line_y(img, "XYZ")

# And now I can call this
coordinate_text = read_at_line_y(img, y_location_of_important_information)

# The text still needs processing afterwards to get floats
# coordinate_text = "XYZ.120.213/2134.324/123.23"


'''
Time metrics
read_image(img) ~=~ 60 ms
read_at_line_y(img, y_cord) ~=~ 4 ms

approx. 17x time difference (Use caching!!!)
'''
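As noted in the example, the returned text still needs post-processing. One way to pull the floats out of a string like the one shown above (a sketch; extract_floats is not part of OCRReader):

```python
import re

def extract_floats(text):
    """Pull all decimal numbers out of an OCR'd line."""
    return [float(m) for m in re.findall(r"-?\d+\.\d+", text)]

coordinate_text = "XYZ.120.213/2134.324/123.23"
print(extract_floats(coordinate_text))  # [120.213, 2134.324, 123.23]
```

A regex like this also sidesteps the missing-space quirk described below, since it keys off digits rather than separators.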

There are a couple quirks.

  • Does not read all characters correctly. Colons and parentheses are the most problematic. They are read incorrectly because the algorithm was designed around the font in the Minecraft chat, and the debug menu's font is different.
  • Any characters which it fails to read at all are replaced with question marks.
  • It does not detect spaces, but newlines between lines are preserved.

The biggest quirks are the preconditions:

  • All text is inline - If there are multiple text boxes and the one on the left is shifted vertically, there's an issue. If a line below is shifted horizontally, there's no issue. This is why the example includes the line img = img[:500, :800]
  • Input must be at the correct scale - The text strokes must be exactly 2 pixels thick. Adjust your GUI scale for this
  • Color - I believe this is constant, but it assumes the text is colored at (221, 221, 221)
  • Minimal low-hanging letters - When detecting the bottoms of lines, having too many g, j, q, y, etc. will throw off the algorithm
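The color precondition above can be illustrated with a small polarization step (a sketch of the idea; OCRReader's actual thresholding may differ):

```python
import numpy as np

# Hypothetical 2x3 image: one pixel of debug-menu text color
img = np.zeros((2, 3, 3), dtype=np.uint8)
img[0, 1] = (221, 221, 221)

# Polarize: keep only pixels that exactly match the MC text color
TEXT_COLOR = np.array([221, 221, 221], dtype=np.uint8)
mask = np.all(img == TEXT_COLOR, axis=-1)
binary = mask.astype(np.uint8) * 255

print(binary)
# [[  0 255   0]
#  [  0   0   0]]
```

An exact-match mask like this is why the (221, 221, 221) assumption matters: any recoloring of the debug text would leave the binary image empty.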

No Touchy

Because the MC font is so pixelated, an algorithm tailored to this font is used. Some important data for this font was generated by

  • OCRCharCodeMaker.py
  • OCRFlagMaker.py

These scripts are not meant to be imported or used (unless you know what you're doing). They were run once to generate the relevant info and then left.