
Website generation from a drawn wireframe. This is an attempt to show the usage of AI (OCR) in web development and how it can help reduce time spent in the workflow from drawing the idea to the actual code.


Portfolio-AI-DesignToCode


About


Introduction

DesignToCode is a project that gives a brief example of how AI can be used in web development to automate tedious and repetitive tasks, such as converting an idea (the design) into a real website (the code). Here, the code is the HTML and CSS.

In summary, computer vision is used to analyse a wireframe and parse its useful components, from text to shapes, into computable data from which the website is generated.

Below, the steps are explained separately and in order, to show the life cycle from the drawing to the generated HTML/CSS website.

No training is necessary for the given examples, only the required programs (unless Tesseract is set to work on a language other than English, or it is not accurate enough).

All steps in the workflow could potentially be replaced with alternatives or refactored.

NB: This project is not meant to create semantically correct and/or production-ready websites.

Workflow

The lifecycle of this project is divided into steps, each with its own responsibility, to make it easier to understand and follow.

Here is a brief list, with a picture below:

  1. Wireframe Drawing
  2. Optical Character Recognition (OCR)
  3. Shape Recognition
  4. Data Parsing
  5. Data Mapping
  6. Website Generation

[Figure: overview of the workflow steps]

Wireframe Drawing

This is the part with the most human interaction. The user draws a website using wireframe conventions, following a few rules such as:

  • A picture is represented by a box containing an "X" that covers the full area.


  • An element's type is written at its top-left or bottom-right corner, either inside or outside the shape.


  • It is not necessary to draw everything pixel-perfect, but the margin of error should not exceed 10 pixels on each side of a shape.
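The 10-pixel margin can be pictured as a check on a hand-drawn box. The sketch below is only an illustration and not the project's code: it assumes a shape has already been reduced to four corner points, and the names `within_tolerance` and `TOLERANCE` are hypothetical.

```python
TOLERANCE = 10  # pixels allowed on each side of a shape (hypothetical constant)


def within_tolerance(corners, tol=TOLERANCE):
    """Return True if four drawn corners fit an axis-aligned box within tol pixels.

    corners: four (x, y) points in any order.
    """
    xs = sorted(x for x, _ in corners)
    ys = sorted(y for _, y in corners)
    # After sorting, the two smallest x values belong to the left side and the
    # two largest to the right side; the same holds for y (top and bottom).
    sides = (xs[:2], xs[2:], ys[:2], ys[2:])
    return all(max(side) - min(side) <= tol for side in sides)


# A slightly wobbly box, then one whose right side drifts by 30 pixels.
print(within_tolerance([(100, 100), (304, 97), (298, 251), (103, 248)]))  # True
print(within_tolerance([(100, 100), (300, 100), (330, 250), (100, 250)]))  # False
```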

Optical Character Recognition (OCR)

Once the wireframe is drawn properly, OCR with Tesseract-OCR and Python finds the words written on it.

The result is returned as a dictionary that exposes every parameter, such as the coordinates, the text and the percentage of accuracy.

The accepted percentage must be well balanced. Otherwise, words could be missing (threshold too high) or might not make much sense (threshold too low).
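The effect of the accepted percentage can be sketched with a small filter. This is an assumption about how the result might be consumed, not the project's actual code: the dictionary below is hand-written in the column layout that `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)` returns (`text`, `conf`, `left`, `top`, `width`, `height`), and `filter_words` and `min_conf` are hypothetical names.

```python
def filter_words(ocr, min_conf=60.0):
    """Keep non-empty words whose confidence is at least min_conf."""
    words = []
    for i, text in enumerate(ocr["text"]):
        # Confidence may arrive as a string or a number; -1 marks non-word rows.
        conf = float(ocr["conf"][i])
        if text.strip() and conf >= min_conf:
            words.append({
                "text": text.strip(),
                "conf": conf,
                "x": ocr["left"][i],
                "y": ocr["top"][i],
                "w": ocr["width"][i],
                "h": ocr["height"][i],
            })
    return words


# Hand-written sample; a real run would get this dictionary from Tesseract.
sample = {
    "text": ["", "Title", "Imaqe", "Button"],
    "conf": ["-1", "96", "41", "88"],
    "left": [0, 20, 20, 300],
    "top": [0, 10, 120, 400],
    "width": [640, 90, 80, 110],
    "height": [480, 30, 28, 32],
}

print([w["text"] for w in filter_words(sample, min_conf=60)])  # ['Title', 'Button']
print([w["text"] for w in filter_words(sample, min_conf=99)])  # []
```

With a threshold of 60 the misread word "Imaqe" is dropped, while a threshold of 99 discards every word, which is the balance described above.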

Shape Recognition

This follows the same principle as step 2, but uses different code logic to extract the shapes instead of the words.

With more shape detections, fewer words are needed on the wireframe. For example, the picture shape does not need an annotation, since it can be recognized without ambiguity.
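As an illustration of why the picture shape needs no annotation, the sketch below tests whether a box contains both diagonals of an "X". It is a simplified stand-in for the project's detection logic, which is not shown here: it assumes an earlier vision step has already produced a rectangle `(x, y, w, h)` and a list of line segments, and every name in it is hypothetical.

```python
TOLERANCE = 10  # pixels, following the drawing rule for shape sides


def near(p, q, tol=TOLERANCE):
    """True if points p and q are within tol pixels on both axes."""
    return abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol


def is_diagonal(segment, a, b, tol=TOLERANCE):
    """True if the segment joins corners a and b, in either direction."""
    start, end = segment
    return (near(start, a, tol) and near(end, b, tol)) or (
        near(start, b, tol) and near(end, a, tol)
    )


def is_picture(rect, segments, tol=TOLERANCE):
    """True if both diagonals of rect appear among the segments."""
    x, y, w, h = rect
    top_left, top_right = (x, y), (x + w, y)
    bottom_left, bottom_right = (x, y + h), (x + w, y + h)
    has_main = any(is_diagonal(s, top_left, bottom_right, tol) for s in segments)
    has_anti = any(is_diagonal(s, top_right, bottom_left, tol) for s in segments)
    return has_main and has_anti


box = (100, 100, 200, 150)
cross = [((103, 98), (296, 252)), ((298, 104), (102, 247))]
print(is_picture(box, cross))      # True
print(is_picture(box, cross[:1]))  # False
```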

Data Parsing

All dictionaries from steps 2 and 3 are parsed into formats that are easier to work with in the following steps.
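One possible reading of this step, given that the notebooks produce .csv files, is to turn a dictionary of columns into rows and serialise them as CSV. The sketch below assumes that layout; `to_rows` and `to_csv` are hypothetical helpers, not functions from the project.

```python
import csv
import io


def to_rows(columns):
    """Turn a dictionary of equal-length lists into a list of row dictionaries."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


def to_csv(rows):
    """Serialise rows to CSV text; a notebook could write this text to a file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


words = {"text": ["Title", "Button"], "left": [20, 300], "top": [10, 400]}
rows = to_rows(words)
print(rows[0])
print(to_csv(rows))
```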

Data Mapping

The data is mapped together to have all the information in one place.

The data about characters and shapes is combined into distinct elements that can be computed into the website.
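A minimal sketch of such a mapping, assuming the drawing rule that an element's type is written near its top-left or bottom-right corner: each shape takes the closest word as its type when that word is near enough. The field names, the `max_dist` threshold and the helpers are all hypothetical, and the sketch does not stop one word from being assigned to two shapes.

```python
import math


def corner_distance(word, shape):
    """Distance from the word's centre to the nearer of the two labelled corners."""
    centre = (word["x"] + word["w"] / 2, word["y"] + word["h"] / 2)
    top_left = (shape["x"], shape["y"])
    bottom_right = (shape["x"] + shape["w"], shape["y"] + shape["h"])
    return min(math.dist(centre, corner) for corner in (top_left, bottom_right))


def map_elements(words, shapes, max_dist=60):
    """Attach the nearest word to each shape as its type, or None if too far."""
    elements = []
    for shape in shapes:
        best = min(words, key=lambda w: corner_distance(w, shape), default=None)
        label = None
        if best is not None and corner_distance(best, shape) <= max_dist:
            label = best["text"].lower()
        elements.append({**shape, "type": label})
    return elements


shapes = [
    {"x": 100, "y": 100, "w": 200, "h": 50},
    {"x": 100, "y": 300, "w": 200, "h": 150},
    {"x": 500, "y": 500, "w": 50, "h": 50},
]
words = [
    {"text": "Button", "x": 105, "y": 104, "w": 60, "h": 16},
    {"text": "Image", "x": 250, "y": 455, "w": 50, "h": 16},
]
print([e["type"] for e in map_elements(words, shapes)])  # ['button', 'image', None]
```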

Website Generation

With the data parsed and mapped, the only step left is to generate the elements and create the website.

This step analyses the data to determine each element's HTML and CSS specifications.

When it completes, the elements are arranged in their respective HTML and CSS sheets, and the index page can be opened in your favorite browser.
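The generation step could look roughly like the sketch below, which turns mapped elements into an HTML page and a matching style sheet using absolute positioning. It is a hedged illustration under stated assumptions: the `TAGS` table, the element fields, the `style.css` file name and the `generate` function are invented for the example and may differ from what the notebooks do.

```python
import html

# Hypothetical mapping from annotation to HTML tag; the real mapping may differ.
TAGS = {"title": "h1", "text": "p", "button": "button", "picture": "img"}


def generate(elements):
    """Return (html_text, css_text) for a list of mapped elements."""
    body, rules = [], []
    for i, el in enumerate(elements):
        tag = TAGS.get(el["type"], "div")
        el_id = f"el{i}"
        if tag == "img":
            body.append(f'<img id="{el_id}" alt="">')
        else:
            label = html.escape(el.get("label", ""))
            body.append(f'<{tag} id="{el_id}">{label}</{tag}>')
        # Position each element where its shape was drawn on the wireframe.
        rules.append(
            f"#{el_id} {{ position: absolute; left: {el['x']}px; top: {el['y']}px; "
            f"width: {el['w']}px; height: {el['h']}px; }}"
        )
    page = (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        '<link rel="stylesheet" href="style.css">\n'
        "</head>\n<body>\n" + "\n".join(body) + "\n</body>\n</html>\n"
    )
    return page, "\n".join(rules) + "\n"


page, css = generate([
    {"type": "title", "label": "Hello <World>", "x": 20, "y": 10, "w": 200, "h": 40},
    {"type": "picture", "x": 20, "y": 80, "w": 300, "h": 200},
])
print(page)
print(css)
```

Writing `page` and `css` to an index file and a style sheet would then give a page that can be opened in a browser.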

Directories and Files

Project's Tree
  |  |- wireframes
  |- .gitignore             #
  |- LICENSE                #
  |- README.md              # This file
  |- requirements.txt       #

Installation

For this project to work, some programs need to be installed, along with the required Python libraries:

  • Python 3.x
  • Jupyter Notebook (optional, but necessary for the notebooks)
  • Tesseract-OCR

pip install -r requirements.txt

All imports are in the notebooks. The notebooks must be executed in order (4-..., 5-..., 6-...; notebooks 1, 2 and 3 are optional), with read/write permission on the sub-directories and files of the project.

NB: The default path of Tesseract may not work in your environment. If so, find the executable and change the path to yours.

import pytesseract
# Default path
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
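If the default path does not exist on your machine, one option is to look for the executable on the system PATH first and fall back to the default otherwise. This is a small sketch using the standard library's `shutil.which`; `find_tesseract` and `DEFAULT_PATH` are hypothetical names, and the `which` parameter exists only to make the function easy to test.

```python
import shutil

# Default Windows location quoted above.
DEFAULT_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(default=DEFAULT_PATH, which=shutil.which):
    """Return the Tesseract executable found on PATH, or the default path."""
    return which("tesseract") or default


tesseract_path = find_tesseract()
print(tesseract_path)
# In a notebook, the result could then be assigned:
# pytesseract.pytesseract.tesseract_cmd = tesseract_path
```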

How to Execute

The simplest way to run the project is to open Jupyter Notebook and run all cells of each notebook in order. The .csv files should be generated almost instantly with the given examples.

It is important to run the notebooks in order and to wait for each .csv file to be created, because the files are reused as the workflow continues.
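For those who prefer a script to clicking through each notebook, the ordering requirement can be sketched as below. This is an alternative under assumptions, not the documented way to run the project: it sorts notebooks by their leading number and executes them one at a time with `jupyter nbconvert --execute`, so each .csv file exists before the next notebook starts. The file names in the example are placeholders, since the real notebook names are not listed here.

```python
import re
import subprocess


def ordered_notebooks(names):
    """Sort notebook file names by their leading number (4-... before 5-...)."""
    def leading_number(name):
        match = re.match(r"(\d+)", name)
        return int(match.group(1)) if match else float("inf")

    return sorted((n for n in names if n.endswith(".ipynb")), key=leading_number)


def run_all(names):
    """Execute each notebook in order, stopping at the first failure."""
    for name in ordered_notebooks(names):
        subprocess.run(
            ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", name],
            check=True,  # raise if a notebook fails, so later steps do not run
        )


# Placeholder file names for illustration only.
print(ordered_notebooks(["6-c.ipynb", "4-a.ipynb", "10-z.ipynb", "5-b.ipynb", "notes.txt"]))
```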

Contribution

Contributions are always welcome, and thank you for your time. Here are the steps to contribute:

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/MyContribution)
  3. Commit your Changes (git commit -m 'Add MyContribution')
  4. Push to the Branch (git push origin feature/MyContribution)
  5. Open a Pull Request

License

See the LICENSE file at the root of the project directory for more information.

Acknowledgements and Sources

Readings of articles and projects

Programs