VisionQL is a project to explore the use of declarative queries on top of ML-based computer vision. Think SQL for computer vision.
It is a Node.js application written in TypeScript. There is currently no web server; the Node.js code is a thin wrapper around the Google Vision API calls.
TypeScript makes it easier and safer to write queries against the response.
The goal is for VisionQL to have several backends; the next backend is going to be TensorFlow.js. Each backend will return its result as JSON, and VisionQL should have TypeScript definitions for all of the backends. This will make it easier to query an ensemble of computer vision models, as sketched below.
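Here is a minimal sketch of what such a backend abstraction could look like. The names (VisionBackend, LabelResult, ensembleLabels) are hypothetical and not part of the current code:

// Hypothetical common interface that each backend (Google Vision, TensorFlow.js) could implement.
export interface LabelResult {
  description: string; // e.g. "Cat"
  score: number;       // confidence between 0 and 1
}

export interface VisionBackend {
  name: string;
  labelDetection(imagePath: string): Promise<LabelResult[]>;
}

// Ensemble query: run the same image through every backend and collect the JSON results.
export async function ensembleLabels(backends: VisionBackend[], imagePath: string): Promise<LabelResult[][]> {
  return Promise.all(backends.map((backend) => backend.labelDetection(imagePath)));
}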
Currently VisionQL can make two Google Vision API calls. For both you need a file with service account credentials.
git clone git@github.com:sami-badawi/visionql.git
cd visionql
export GOOGLE_APPLICATION_CREDENTIALS=/home/yourname/yourpath/key.json
npm i
npm run build
node dist/call_face_detect.js --gs_path gs://sami-vision-project/AI-panel-2018-02-15.jpg --query yes
node dist/call_image_label.js --file_path resources/wakeupcat.jpg --query yes
The result of running call_face_detect will be stored in the file:
output/AI-panel-2018-02-15_jpg/face_detect_result.json
If the face detect program was run with --query yes,
it will count the number of faces and the number of happy faces:
for image: gs://sami-vision-project/AI-panel-2018-02-15.jpg: faceCount: 4; happyFaceCount: 1
The project also has an example output file:
output/example_face_detect_result.json
The result of running call_image_label will be stored in the file:
output/wakeupcat_jpg/label_detect_result.json
If the label program was run with --query yes,
it will report whether the picture contains cats, dogs, or internet memes:
for image: ./resources/wakeupcat.jpg: isMeme: true, hasCat: true, hasDog: false
Currently the application has a few canned queries.
TypeScript's types make it easy and safe to write these queries.
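For context, here is a rough sketch of the response fields these queries rely on. The field names (faceAnnotations, joyLikelihood, labelAnnotations, description) follow the Google Vision Node.js client, but this is only an approximation of the full response:

// Approximate shape of the Google Vision response fields used by the canned queries.
interface FaceAnnotation {
  joyLikelihood: string; // e.g. "VERY_LIKELY", "POSSIBLE", "VERY_UNLIKELY"
}

interface LabelAnnotation {
  description: string;   // e.g. "Cat"
  score: number;         // confidence between 0 and 1
}

interface ApiVisionResponse {
  faceAnnotations: FaceAnnotation[];
  labelAnnotations: LabelAnnotation[];
}

The canned queries themselves look like this: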
public happyFaceCount(): number {
    // Count the faces that Google Vision rates as very likely to show joy.
    const firstResult = this.apiVisionResponseArray[0];
    return firstResult.faceAnnotations.filter((face) => face.joyLikelihood === "VERY_LIKELY").length;
}
public hasCat(): boolean {
    // True if any of the returned labels is "Cat".
    const firstResult = this.apiVisionResponseArray[0];
    return 0 <= firstResult.labelAnnotations.findIndex((label) => label.description === "Cat");
}
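The hasDog and isMeme queries mentioned above follow the same pattern. A hedged sketch, assuming the relevant Google Vision label strings are literally "Dog" and "Internet meme" (the exact strings in the real queries may differ):

public hasDog(): boolean {
    // Same pattern as hasCat, but looking for an assumed "Dog" label.
    const firstResult = this.apiVisionResponseArray[0];
    return 0 <= firstResult.labelAnnotations.findIndex((label) => label.description === "Dog");
}
public isMeme(): boolean {
    // Assumed label string; the real query may check several meme-related labels.
    const firstResult = this.apiVisionResponseArray[0];
    return 0 <= firstResult.labelAnnotations.findIndex((label) => label.description === "Internet meme");
}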
There are many good computer vision systems available.
The first backend for VisionQL is the Google Vision API.
It is high quality. You have to be a Google Cloud Platform user, but it is relatively easy and cheap to get set up to experiment.
TensorFlow.js will be the next backend. Its models can run in the browser or on Node.js, and there is no need for GCP API keys; a rough sketch of such a backend follows below.
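Here is a minimal sketch of what a TensorFlow.js label-detection backend could look like on Node.js. It assumes the @tensorflow/tfjs-node and @tensorflow-models/mobilenet packages, and MobileNet is used only as an example model; none of this is in the current code:

import * as tf from "@tensorflow/tfjs-node";
import * as mobilenet from "@tensorflow-models/mobilenet";
import * as fs from "fs";

// Hypothetical TensorFlow.js backend: classify a local image and return label-like JSON.
export async function labelDetectionTfjs(filePath: string): Promise<Array<{ description: string; score: number }>> {
  const model = await mobilenet.load();
  const imageTensor = tf.node.decodeImage(fs.readFileSync(filePath), 3) as tf.Tensor3D;
  const predictions = await model.classify(imageTensor);
  imageTensor.dispose();
  // Map MobileNet's {className, probability} onto the same shape as the Google Vision labels.
  return predictions.map((p) => ({ description: p.className, score: p.probability }));
}

Because this returns the same description/score shape as the Google Vision labels, queries like hasCat could in principle run unchanged against this backend.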
TensorFlow.js has the following 2 models:
The point of VisionQL is that it should support more declarative queries. Here is a short discussion of a few candidates for this:
SQL does not lend itself well to this, since it deals with flat relational data.
PostGIS is a SQL frontend to a lot of computational geometry code written in C++. It is well suited for geometric operations on points, lines, and polygons. However, it does not capture the hierarchical nature of computer vision data.
miniKanren is a logic programming language with several implementations in JavaScript. That is an option worth exploring.
VisionQL is in pre-alpha. It is currently a playground for experimenting with Google Vision API results in TypeScript, but it is pretty easy to set up and work with.