ClipImageEncoder

ClipImageEncoder is a class that wraps the image embedding functionality from the CLIP model.

The CLIP model was originally proposed in the paper Learning Transferable Visual Models From Natural Language Supervision.

ClipImageEncoder encodes images stored in the blob attribute of a Document and saves the resulting encoding in the embedding attribute.

Prerequisites

None

Usages

Via JinaHub (🚧W.I.P.)

Use the prebuilt images from JinaHub in your Python code:

from jina import Flow

f = Flow().add(
        uses='jinahub+docker://ClipImageEncoder',
        volumes='/your_home_folder/.cache/clip:/root/.cache/clip')

or in the .yml config:

jtype: Flow
pods:
  - name: encoder
    uses: 'jinahub+docker://ClipImageEncoder'
    volumes: '/your_home_folder/.cache/clip:/root/.cache/clip'

Via PyPI

Install the jinahub-clip-image package:

pip install git+https://github.com/jina-ai/executor-clip-image.git

Use jinahub-clip-image in your code

from jinahub.encoder.clip_image import ClipImageEncoder
from jina import Flow

f = Flow().add(uses=ClipImageEncoder)

Via Docker

Clone the repo and build the Docker image:

git clone https://github.com/jina-ai/executor-clip-image.git
cd executor-clip-image
docker build -t jinahub-clip-image .

Use jinahub-clip-image in your code:

from jina import Flow

f = Flow().add(
        uses='docker://jinahub-clip-image:latest',
        volumes='/your_home_folder/.cache/clip:/root/.cache/clip')

Example

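import numpy as np
from jina import Document, Flow

# Replace the host path below with your own CLIP cache directory.
f = Flow().add(uses='jinahub+docker://ClipImageEncoder',
               volumes='/Users/nanwang/.cache/clip:/root/.cache/clip')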


def check_resp(resp):
    for _doc in resp.data.docs:
        doc = Document(_doc)
        print(f'embedding shape: {doc.embedding.shape}')


with f:
    f.post(on='foo',
           inputs=Document(blob=np.ones((800, 224, 3), dtype=np.uint8)),
           on_done=check_resp)

Inputs

Documents whose blob has the shape Height x Width x 3. By default, the input blob must be an ndarray with dtype=uint8, and Height and Width can take arbitrary values. When use_default_preprocessing=False is set, the input blob must have the shape 224 x 224 x 3 with dtype=float32.
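
The input requirements above can be checked before Documents are sent to the encoder. The following is a minimal sketch that uses only NumPy. The helper name validate_blob is hypothetical and not part of the executor, and the rules it enforces are taken solely from the description above.

```python
import numpy as np


def validate_blob(blob, use_default_preprocessing=True):
    """Return True if `blob` matches the input format described above.

    Hypothetical helper for illustration; it is not part of ClipImageEncoder.
    """
    # In both modes the blob must be a 3-D ndarray with 3 channels last.
    if not isinstance(blob, np.ndarray) or blob.ndim != 3 or blob.shape[2] != 3:
        return False
    if use_default_preprocessing:
        # Default mode: arbitrary Height and Width, dtype must be uint8.
        return bool(blob.dtype == np.uint8)
    # Preprocessing disabled: fixed 224 x 224 x 3 shape and float32 dtype.
    return bool(blob.shape == (224, 224, 3) and blob.dtype == np.float32)


# The blob from the example above is valid in default mode.
print(validate_blob(np.ones((800, 224, 3), dtype=np.uint8)))
# The same shape as float32 is rejected once preprocessing is disabled.
print(validate_blob(np.ones((800, 224, 3), dtype=np.float32),
                    use_default_preprocessing=False))
```

Running a check like this on the client side surfaces shape and dtype mistakes before the Flow is started.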

Returns

Documents with the embedding field filled with an ndarray of the shape 512 with dtype=float32.
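
Because each embedding is a flat float32 vector of length 512, two encoded images can be compared directly. The following is a minimal sketch, assuming cosine similarity as the comparison. This is a common choice for CLIP embeddings, but the executor itself does not prescribe one. The helper cosine_similarity is hypothetical, and the vectors are random stand-ins rather than real encoder output.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D embedding vectors (hypothetical helper)."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        raise ValueError('cosine similarity is undefined for a zero vector')
    return float(np.dot(a, b) / denom)


# Random stand-ins with the same shape and dtype as the returned embeddings.
rng = np.random.default_rng(0)
emb_a = rng.standard_normal(512).astype(np.float32)
emb_b = rng.standard_normal(512).astype(np.float32)

print(cosine_similarity(emb_a, emb_a))  # a vector compared with itself
print(cosine_similarity(emb_a, emb_b))  # two unrelated random vectors
```

In practice, emb_a and emb_b would be taken from doc.embedding of two Documents returned by the Flow.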
