A script to programmatically download large sets of images from Bing.
This script is useful to amass datasets for various image processing tasks.
The script runs on Torch7, so make sure it is installed.
The following additional Lua packages are required:
$ luarocks install async
$ luarocks install luasocket
$ luarocks install moses
$ luarocks install graphicsmagick
First you will need to get credentials to use the Bing Image Search API. Follow the steps below to obtain a key from the Azure platform.
- Set up a free Microsoft Azure account
- Go to the Azure portal and choose Create A Resource
- Search for and select Bing Search v7, then click Create
- Give the resource a name and choose a pricing tier
  - The free tier is fine but rate-limited to 3 calls per second
- Create a new resource group if necessary
- Next, in the sidebar, go to All Resources and click the name of the resource you just created
- Choose Keys in the middle column and copy one of the API keys listed
- Paste your API key into credentials.lua
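To check an API key outside the script, it can help to see what a Bing Image Search v7 request looks like. The following is a minimal Python sketch, not the script's own code. The endpoint URL is an assumption (use the one shown next to the keys for your resource), and `build_search_request` is a hypothetical helper name. The request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed v7 endpoint; confirm it against the endpoint listed for your resource.
BING_IMAGES_ENDPOINT = "https://api.cognitive.microsoft.com/bing/v7.0/images/search"


def build_search_request(api_key, query, count=12, offset=0):
    """Return an unsent Request for one page of image search results."""
    if not api_key:
        raise ValueError("API key is empty")
    # q, count and offset are query parameters of the v7 image search API.
    params = urlencode({"q": query, "count": count, "offset": offset})
    return Request(
        f"{BING_IMAGES_ENDPOINT}?{params}",
        # The key travels in this header rather than in the URL.
        headers={"Ocp-Apim-Subscription-Key": api_key},
    )


if __name__ == "__main__":
    req = build_search_request("YOUR_API_KEY", "ratajkowski")
    print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. A 401 response at that point usually means the key or the endpoint is wrong.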
Once you have your credentials set up, use th to run the script from the command line:
th init.lua -q ratajkowski -n 12
With these parameters, the script searches Bing for images matching the query "ratajkowski" and limits the number of results to 12.
The results will be saved in a folder with the same name as the query.
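For larger values of -n, a single API call is not enough, because the v7 image search caps the `count` parameter at 150 results per call. How init.lua pages through results is not described here, so the Python sketch below only illustrates one way to split a requested total into `(offset, count)` pairs. `page_plan` is a hypothetical name.

```python
MAX_PER_CALL = 150  # v7 image search maximum for the `count` parameter


def page_plan(total, per_call=MAX_PER_CALL):
    """Return (offset, count) pairs that together cover `total` results."""
    if total < 0 or per_call <= 0:
        raise ValueError("total must be >= 0 and per_call must be > 0")
    per_call = min(per_call, MAX_PER_CALL)
    plan = []
    offset = 0
    while offset < total:
        # The last page may be smaller than the others.
        count = min(per_call, total - offset)
        plan.append((offset, count))
        offset += count
    return plan


if __name__ == "__main__":
    print(page_plan(12))   # [(0, 12)]
    print(page_plan(400))  # [(0, 150), (150, 150), (300, 100)]
```

In practice the API response also carries a `nextOffset` field, and using it for the next call is likely more reliable than computing offsets up front, since it accounts for results the service skips.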
Call the script with no arguments to view more usage information:
th init.lua
Error Code 403: you have exceeded your monthly quota. You will need to upgrade your pricing tier in the Azure portal.
Error Code 429: you are exceeding your calls-per-second limit. Wait a while and try again.
A full list of error codes can be found here.
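The two error codes above call for different reactions: a 429 is worth retrying after a pause, while a 403 will not go away until the quota changes. The Python sketch below shows one way to express that with exponential backoff. It is an illustration, not the script's behavior. `fetch` is a hypothetical zero-argument callable returning `(status, body)`, and the delays are arbitrary starting values.

```python
import time


class QuotaExceeded(Exception):
    """HTTP 403: the monthly quota is used up, so retrying will not help."""


def call_with_backoff(fetch, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a status other than 429.

    fetch is a hypothetical callable returning (status, body).
    """
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status == 403:
            raise QuotaExceeded("monthly quota exceeded; upgrade the pricing tier")
        if status != 429:
            return status, body
        if attempt < max_retries:
            # Double the wait after each rate-limited attempt: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("still rate-limited after %d retries" % max_retries)


if __name__ == "__main__":
    responses = iter([(429, ""), (200, "ok")])
    print(call_with_backoff(lambda: next(responses), base_delay=0.1))
```

The `sleep` argument exists so the waiting can be replaced in tests. On the free tier, spacing calls at least a third of a second apart should avoid most 429 responses in the first place.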
If you still have problems, please open an issue.
We currently use Bing image search API v7 (documentation here).