emilwallner/Screenshot-to-code

creating dataset

Kotresh17 opened this issue · 4 comments

Hi, many thanks for sharing the data and code. How can we take this forward and generate more data beyond the synthesized data? Can we create the same kind of dataset for real HTML pages? If so, how do we generate the .gui files for them? If you have any resources or thoughts, please share them with us.

Hi Kotresh,

For the bootstrap version, you could write a script that takes screenshots of existing Bootstrap website templates and build a DSL vocabulary based on them. It should be pretty straightforward, with the structure looking like the pix2code datasets and DSL.

So for example a website that looks like this: https://imgur.com/a/IF3NxTV

Would have a .gui file that looks something like this:

header{
    navigation-top{
        logo,
        menu-right{
            menu-link-active,
            menu-link,
            menu-link,
            menu-link
        }
    }
}
main-heading,
row{
    col-3{
        link{
            image
        }
    }
    col-3{
        link{
            image
        }
    }
    col-3{
        link{
            image
        }
    }
}
footer{
    row-centered{
        text
    }
}
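To make the token-to-markup relationship concrete, here is a minimal sketch of a DSL-to-HTML compiler in the spirit of pix2code. The token table and class names below are hypothetical illustrations, not the project's actual DSL mapping:

```python
import re

# Hypothetical token-to-markup table. The real pix2code compiler uses its own
# mapping file; these tags and Bootstrap class names are illustrative only.
TOKEN_TO_HTML = {
    "header": ("<header>", "</header>"),
    "row":    ('<div class="row">', "</div>"),
    "col-3":  ('<div class="col-sm-3">', "</div>"),
    "link":   ('<a href="#">', "</a>"),
    "image":  ('<img src="placeholder.png">', ""),
    "text":   ("<p>Lorem ipsum</p>", ""),
}

def tokenize(src):
    # Split the .gui source into identifiers and braces; commas and
    # whitespace are treated as separators.
    return re.findall(r"[\w-]+|[{}]", src)

def compile_gui(src):
    html, closers = [], []
    for tok in tokenize(src):
        if tok == "{":
            continue  # the opening tag was already emitted with the token
        elif tok == "}":
            html.append(closers.pop())  # close the most recent container
        else:
            open_tag, close_tag = TOKEN_TO_HTML[tok]
            html.append(open_tag)
            if close_tag:
                closers.append(close_tag)
    return "".join(html)
```

For example, `compile_gui("row{col-3{link{image}}}")` renders a single grid cell wrapping a linked image.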

For the HTML version, quoting Emil's answer from this issue:

#20

“As mentioned in the article, the HTML version does not generalize on new images. The Bootstrap version generalizes on new images but with a capped vocabulary. The evaluation images for the bootstrap version are under /data/eval/ . You can test it here: floydhub/Bootstrap/test_model_accuracy.ipynb

If you want to train it to generalize on a more advanced vocabulary, I'd recommend customizing it to work on the HTML set provided here: https://github.com/harvardnlp/im2markup (on floydhub: --data emilwallner/datasets/100k-html:data)

After that, I'd recommend creating a new dataset. Create a script that generates random websites, say starting with newsletters or blog layouts. Then you can add optical character recognition, fonts, colors and div sizes as you go.

If you build a version for the harvardnlp dataset or a script that generates websites, please make a pull request.”
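The random-website generator Emil describes could start as small as the sketch below. The leaf tokens and the row/column grammar are made up for illustration; a real generator would mirror the DSL vocabulary the model is trained on and render matching HTML for each page:

```python
import random

# Hypothetical leaf tokens; a real generator would use the DSL vocabulary
# that the model's tokenizer actually knows.
LEAVES = ["text", "image", "title"]

def random_row(rng):
    # Pick a column count that divides Bootstrap's 12-column grid evenly.
    n = rng.choice([1, 2, 3, 4])
    cols = ",".join("col-%d{%s}" % (12 // n, rng.choice(LEAVES))
                    for _ in range(n))
    return "row{%s}" % cols

def random_page(rng, n_rows=3):
    # One .gui source per page; pair it with rendered HTML and a screenshot
    # to build a training example.
    return ",\n".join(random_row(rng) for _ in range(n_rows))

rng = random.Random(0)  # fixed seed makes the generated dataset reproducible
page = random_page(rng)
```

Starting with constrained layouts like newsletters keeps the vocabulary small, and OCR, fonts, colors, and div sizes can be layered on later, as the quote suggests.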

Hi, thanks for sharing the data and code.
Can you please tell us how to create .npz and corresponding .gui files for our own custom images? If you have any thoughts, please share them; it would really help us proceed.
For example, I have attached a basic form image. Could you share your thoughts on how to convert this image to .npz and .gui form for training with the model, so that I can get the HTML code for similar images?
[attached image: screentocode]
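The .npz step is just NumPy's compressed array format. A minimal sketch is below; the array shape, dtype, and the `features` key are assumptions for illustration, and in practice you would load and resize the actual PNG (e.g. with Pillow) instead of using a random array:

```python
import os
import tempfile
import numpy as np

# Stand-in for the real pixels: in practice, load image1.png (e.g. with
# Pillow) and resize it to the model's expected input shape. The shape,
# dtype, and "features" key here are assumptions, not the project's spec.
pixels = np.random.rand(256, 256, 3).astype(np.float32)

# Compress the array to image1.npz, stored next to its image1.gui file.
out_path = os.path.join(tempfile.gettempdir(), "image1.npz")
np.savez_compressed(out_path, features=pixels)

# A data loader can then read it back under the same key:
restored = np.load(out_path)["features"]
assert restored.shape == (256, 256, 3)
```

The matching image1.gui file is plain text containing the DSL tokens that describe the layout in the image.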

Hi, I'm pretty late to this, but I was just wondering: what is a .gui file, and how do you open it?
Thank you!

@yuvarajvc:
You can convert an image to a compressed .npz file using my script here: https://gist.github.com/PaulGwamanda/f91ce9fc9d392c4bcc99c085fd726a34

@salmanahmad10:
Any code editor can view and edit a .gui file.

The .gui extension convention comes from the original paper (pix2code) and has no special relevance. The project uses the .gui file to map the token sequence to its image pair, which has the same name.

i.e. image1.png (or image1.npz when compressed) should have a corresponding .gui file called image1.gui containing the textual tokens that describe the image.
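Given that naming convention, a quick sanity check over a dataset folder might look like the following. This is a hypothetical helper, not part of the repo:

```python
from pathlib import Path

def unmatched_pairs(data_dir):
    """Return (image stems without a .gui, .gui stems without an image).

    An empty pair of lists means every image has its token file and
    vice versa, which is what the training code expects.
    """
    data_dir = Path(data_dir)
    images = ({p.stem for p in data_dir.glob("*.png")}
              | {p.stem for p in data_dir.glob("*.npz")})
    guis = {p.stem for p in data_dir.glob("*.gui")}
    return sorted(images - guis), sorted(guis - images)
```

Running it before training catches renamed or forgotten files early, instead of failing partway through an epoch.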

PS: I'm pushing my dev toolkit here, which includes 100 samples, and I will be happy to sell my whole dataset. Email me at paul.gwamanda@gmail.com