/ml-captcha-solver

Training a model to solve alphanumeric CAPTCHAs.

Primary language: Jupyter Notebook. License: MIT.

TensorFlow OCR model for solving Login Captchas

A 2023 side project that trains a model to solve the CAPTCHAs generated by BBDC's login system, in order to automate the booking of practical slots.

Tip

And when I'm old and I've had my fun, I'll share my inventions so that everyone can be superheroes.
Everyone can be a super!
And when everyone's super… no one will be.

🚨🚨🚨 Important Alert 🚨🚨🚨

Warning

This repository serves as an archive of the methods I used during my past project. Please be aware that this and the related repositories are NOT GUARANTEED to work in their current state. Some tinkering may be required, as I no longer maintain them.

Important

I deeply value any feedback and suggestions for improving the existing code. However, this is no longer an active project of mine, so feel free to create a fork and continue building on the work that has been done.
Thank you for your understanding!

Prerequisites:

Building Dataset - General Workflow:

  1. Retrieve the captcha JSON payload from https://booking.bbdc.sg/bbdc-back-service/api/auth/getLoginCaptchaImage by sending a POST request with Advanced REST Client.
  2. Decode the Base64 string in the JSON payload into an image to build your dataset.
  • Example:
    • Refer to example-response-body.json
    • Go to the ["data"]["image"] field.
    • Extract the Base64 string from data:image/png;base64, <Base64 String>.
    • Convert it from Base64 to an image using the above converter, or in Python 3 with the following code:
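    import base64
    # Paste only the Base64 text that follows "data:image/png;base64,".
    img_data = "<Base64 String>"
    with open("imageToSave.png", "wb") as fh:
        # b64decode accepts a str; decodebytes raises TypeError on a str.
        fh.write(base64.b64decode(img_data))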
  3. Repeat about 1,000 times to build a dataset of suitable size.
  4. Label each image with the ACTUAL captcha text for easier evaluation.
  5. Split the dataset into 90% training / 10% validation.
  6. Train the model using the attached captcha_solver.ipynb file on a Google Colab T4 GPU runtime.
  7. Download the trained model for use with the BBDC Booking Bot.
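
The decoding and splitting steps above could be scripted roughly as follows. This is a minimal sketch, not the code used in this project: the function names decode_captcha_image and split_dataset, the fixed shuffle seed, and the default validation fraction argument are assumptions made for illustration. Only the ["data"]["image"] field, the data:image/png;base64, prefix, and the 90/10 split come from the workflow above.

```python
import base64
import json
import random


def decode_captcha_image(response_body: str) -> bytes:
    """Return the raw image bytes stored in a captcha JSON response body."""
    payload = json.loads(response_body)
    data_uri = payload["data"]["image"]
    # Keep only the text after the "data:image/png;base64," prefix.
    _, separator, b64_text = data_uri.partition("base64,")
    if not separator:
        raise ValueError("image field is not a Base64 data URI")
    # strip() tolerates a space or newline between the prefix and the data.
    return base64.b64decode(b64_text.strip())


def split_dataset(paths, val_fraction=0.1, seed=42):
    """Shuffle paths reproducibly and return (training, validation) lists."""
    shuffled = sorted(paths)  # sort first so the result ignores input order
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]
```

To use it, read example-response-body.json as text, pass the text to decode_captcha_image, and write the returned bytes to a .png file opened in "wb" mode. Once the images are labelled, pass the list of file paths to split_dataset to get the training and validation lists.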

BBDC Booking Bot

References

Project Snapshots

Building Dataset

Confidence Testing

TensorBoard Report

Proof of Functionality