/BreakingCaptcha

The Breaking CAPTCHA project aims to build ML models for solving CAPTCHAs, enhance reCAPTCHA security, and develop a user-friendly web app. Required skills include programming, web development, database management, cryptography, AI/ML, and collaboration. Research and critical thinking abilities are also essential.

Primary language: Jupyter Notebook

Breaking CAPTCHA Project

Deakin University Capstone, Trimester 1, 2023

Project Description

The Breaking CAPTCHA project aims to develop advanced machine learning, AI, and computer vision models to solve CAPTCHA problems autonomously.

The project’s short-term goal is to research ways to improve and secure reCAPTCHA and to build a secure, user-friendly version of it. The project will also focus on developing advanced machine learning, AI, and computer vision models, testing and validating them for accuracy, reliability, and effectiveness, and using them to solve CAPTCHA problems autonomously.

In the long term, the Breaking CAPTCHA project aims to provide end-users with a seamless interface to solve CAPTCHA problems with greater accuracy, reliability, and speed. To achieve this, the project will build a web application that allows customers to upload ReCAPTCHA images, text, voice, and video to verify their security. By developing cutting-edge technology and providing users with a secure and user-friendly experience, the Breaking CAPTCHA project aims to make online interactions more seamless and secure.

Aims for Trimester

The aims for the trimester include the following:

  • Conducting extensive research and development to identify practical approaches to solving CAPTCHA problems.

  • Having the research team work alongside the AI developer team to create a version of reCAPTCHA with a more secure and user-friendly interface.

  • Rigorously testing and validating the models for accuracy, reliability, and effectiveness.

  • Developing a secure, user-friendly reCAPTCHA version by applying cryptography, programming skills, artificial intelligence, and machine learning.

Deliverables

The project’s deliverables include:

  • Advanced machine learning, AI, and computer vision models to solve CAPTCHA problems autonomously.

  • A web application that allows customers to upload reCAPTCHA images, text, voice, and video to verify their security.

  • A secure, user-friendly reCAPTCHA version that leverages cryptography, programming skills, artificial intelligence, and machine learning.

  • Extensive research and development reports and documentation on the testing and validation of the models.

Final Update (T1 2023)

Milestones Achieved: We have achieved several significant milestones since the last progress report. Our team built a prototype using Balsamiq, which was a considerable accomplishment. We created 11 pages, including the homepage, user registration page, login page, password reset page, and access control page. We also developed pages for object detection, text detection, and voice detection, where users can upload images, audio files, and text to get results determined by our models.

Tasks Completed: One of the significant tasks we completed was the user registration page, where website visitors can create user accounts with unique usernames and passwords. We also made a login and logout feature that allows users to log in and out of their accounts securely. In addition, we developed a password reset link that can be sent to registered email addresses for users who forget their passwords. We implemented access control to limit access to different parts of the website based on user roles and permissions. We also created a navbar to allow users to navigate easily through the website, and the homepage displays the most recent comments.
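The account features described above can be illustrated with a minimal sketch using only the Python standard library. It is not the project's actual implementation: the role names, the permission table, the iteration count, and every function name below are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

# Hypothetical role-to-permission table; the project's real roles are not documented here.
ROLE_PERMISSIONS = {
    "admin": {"view_results", "upload", "manage_users"},
    "member": {"view_results", "upload"},
    "guest": {"view_results"},
}

# Assumed work factor for PBKDF2; tune it for the deployment hardware.
PBKDF2_ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; the plain password is never stored."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt per account
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key from the login attempt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)


def new_reset_token() -> str:
    """Create an unguessable, URL-safe token to embed in a password reset link."""
    return secrets.token_urlsafe(32)


def can_access(role: str, permission: str) -> bool:
    """Check a permission against the role table; unknown roles get no access."""
    return permission in ROLE_PERMISSIONS.get(role, set())
```

A registration handler would store the salt and derived key returned by `hash_password`, the login handler would call `verify_password`, and each protected page would call `can_access` before rendering.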

Challenges Overcome: We encountered some challenges during the development process, including setting up a Google Colab environment and a TensorFlow environment through Anaconda. However, our team overcame these challenges and made significant progress on the project.

AI Progress: We made significant progress in AI during this period. We translated the previous team’s audio model from PyTorch to TensorFlow, which was a considerable achievement. We also created a 70% accurate image recognition model and an 85% accurate image recognition model using VGG16. In addition, we created a Faster R-CNN image recognition model, a YOLO image recognition model, and a DETR image recognition model.
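As a rough illustration of what a VGG16-based classifier can look like in TensorFlow, here is a minimal transfer-learning sketch. It is not the team's actual model: the head layers, the input size, the optimizer, and the function names are assumptions. The `top1_accuracy` helper shows one common way an accuracy figure such as "85%" is computed, which may differ from how the team measured it.

```python
from typing import Sequence


def top1_accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of samples whose predicted class equals the true class."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def build_vgg16_classifier(num_classes: int, input_shape=(224, 224, 3)):
    """Build a classifier on top of a frozen, ImageNet-pretrained VGG16 base."""
    # Imported lazily so the accuracy helper above works without TensorFlow installed.
    import tensorflow as tf

    base = tf.keras.applications.VGG16(
        weights="imagenet",    # downloads pretrained weights on first use
        include_top=False,     # drop the original 1000-class head
        input_shape=input_shape,
    )
    base.trainable = False     # train only the new head

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(256, activation="relu"),  # assumed head size
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model
```

Freezing the base keeps training cheap on small CAPTCHA datasets. Unfreezing the last convolutional block for fine-tuning is a common next step if accuracy plateaus.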

We are also working on packaging a text recognition model for the back end, which will be a critical component of our website’s functionality. These AI models will enable us to provide users with accurate and reliable object, voice, and text detection services.

Conclusion: In summary, we have made significant progress since the last progress report, achieving several important milestones and completing critical tasks. We overcame challenges in setting up our AI environment and developed several accurate image recognition models and a text recognition model for our website’s backend.

Looking ahead, we will continue to refine and improve our AI models and integrate them with our website’s front end. We are confident that we will complete the project on time and to a high standard, and we look forward to sharing our progress with you in the next report.

Operating GitHub (Pull requests)

Why GitHub?

GitHub is a hosting platform built on the Git version control system that lets team members collaborate on a project. It is also a publishing platform that allows all members to view the changes made to the structure of the project or the code. It enables us to keep track of the codebase, save the project as it develops, and revisit earlier points in the project should that be required.

Collaborating Through GitHub

You can use GitHub through either its web version or its desktop application. Branches are the central mechanism GitHub uses for collaboration. They allow us to keep different versions of a repository side by side without making any change to the main source code. Work done on a branch does not show up on the main branch until you merge it, which leaves room to experiment with the code.

Pull requests

On the web version of GitHub, select the branch you want to work on, be it 'main' or another branch, and then select the edit icon on the file you want to change. After making your changes to the code or documentation, scroll to the bottom and select 'Create a new branch for this commit and start a pull request'. Rename the branch any way you like and click 'Propose changes'.

The following page gives you the option to 'leave a comment'. Describe the exact nature of your changes there, as this helps other members understand them more efficiently. Once done, click 'Create pull request'.

GitHub then brings you to the pull request page, from which your changes are merged into the selected branch. Again, you have the option to 'leave a comment'. Once you have explained the changes made, select 'Merge pull request' and then 'Confirm merge'. This merges your changes into the branch you selected at the start. Finally, select 'Delete branch', which deletes the temporary branch that was created for the pull request. Go back to the main page to view your changes.
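For members who prefer the command line, the same flow can be sketched in Python as a list of Git commands. This is a minimal sketch under stated assumptions: the function names are hypothetical, the remote is assumed to be called `origin`, the base branch is assumed to be `main`, and `git switch` requires Git 2.23 or later. The pull request itself is still opened and merged on the GitHub website after the push.

```python
import subprocess
from typing import List


def pull_request_commands(branch: str, message: str, remote: str = "origin") -> List[List[str]]:
    """Commands that put local changes on a new branch and publish it."""
    return [
        ["git", "switch", "-c", branch],                    # create and move to the new branch
        ["git", "add", "--all"],                            # stage every change
        ["git", "commit", "-m", message],                   # record the change with a description
        ["git", "push", "--set-upstream", remote, branch],  # publish the branch to GitHub
    ]


def cleanup_commands(branch: str, base: str = "main") -> List[List[str]]:
    """Commands to run locally after the pull request has been merged on GitHub."""
    return [
        ["git", "switch", base],         # return to the base branch
        ["git", "pull"],                 # fetch the merged changes
        ["git", "branch", "-d", branch], # delete the local copy of the merged branch
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command, and execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing command


if __name__ == "__main__":
    # Dry run: shows the commands without touching any repository.
    run_all(pull_request_commands("fix-readme-typo", "Fix typo in README"))
```

Calling `run_all(..., dry_run=False)` from inside a clone of the repository executes the commands. Keeping the default dry run makes it easy to review them first.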

Video guide on operating GitHub

For a step-by-step guide on how to operate GitHub, you can click on this link: https://www.youtube.com/watch?v=RGOj5yH7evk. The video provides a good foundation for understanding not only GitHub but also Git.