
🔍 Search local images with natural language on Android, powered by OpenAI's CLIP model.

Primary language: Kotlin. License: MIT.

PicQuery



🔍 Search for your local images with natural language, running completely offline. For example, "a laptop on the desk", "sunset by the sea", "kitty in the grass", and so on.

  • Completely free, with NO in-app purchases
  • Supports queries in both English and Chinese
  • Indexing and searching run completely offline, so there are no privacy concerns
  • Returns results in under 1 second when searching across 8,000+ photos
  • Images are indexed on first launch; after that, searches are immediate

Installation

  • Google Play - Search for “PicQuery”
  • Download the APK from the Releases page
  • If you have trouble accessing the above resources, please see here

🍎 For iOS users, please refer to Queryable (Code), the inspiration behind this application, developed by @mazzzystar.

Implementation

Thanks to @mazzzystar and @Young-Flash for their assistance during development. The discussion can be viewed here.

PicQuery is powered by OpenAI's CLIP model.

First, the images to be searched are encoded into vectors using an image encoder and stored in a database. The text provided by the user during the search is also encoded into a vector. The encoded text vector is then compared with the indexed image vectors to calculate the similarity. The top K images with the highest similarity scores are selected as the query results.
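The ranking step described above can be sketched in a few lines of Kotlin. This is a minimal illustration and not PicQuery's actual code. The README does not name the similarity measure, so cosine similarity (a common choice for CLIP embeddings) is an assumption here. `IndexedImage`, `cosineSimilarity`, and `topK` are hypothetical names, and the toy 3-dimensional vectors stand in for real encoder output.

```kotlin
import kotlin.math.sqrt

// Hypothetical container for one indexed image; not PicQuery's actual schema.
class IndexedImage(val path: String, val embedding: FloatArray)

// Cosine similarity between two vectors of equal length (assumed metric).
fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "Embedding dimensions must match" }
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    val denom = sqrt(normA) * sqrt(normB)
    // Treat a zero-length vector as having no similarity to anything.
    return if (denom == 0f) 0f else dot / denom
}

// Score every indexed image against the query vector and keep the best k.
fun topK(query: FloatArray, index: List<IndexedImage>, k: Int): List<Pair<String, Float>> =
    index
        .map { it.path to cosineSimilarity(query, it.embedding) }
        .sortedByDescending { it.second }
        .take(k)

fun main() {
    // Toy vectors standing in for image embeddings loaded from the database.
    val index = listOf(
        IndexedImage("sunset.jpg", floatArrayOf(0.9f, 0.1f, 0.0f)),
        IndexedImage("laptop.jpg", floatArrayOf(0.0f, 0.2f, 0.9f)),
        IndexedImage("kitty.jpg", floatArrayOf(0.1f, 0.9f, 0.1f))
    )
    // Pretend this vector is the text encoder's output for "sunset by the sea".
    val query = floatArrayOf(1.0f, 0.0f, 0.0f)
    topK(query, index, k = 2).forEach { (path, score) -> println("$path  $score") }
}
```

A linear scan like this touches every stored vector on each query, which is simple and may well be adequate at the scale of a phone's photo library; whether PicQuery does exactly this is not stated in the README.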

Build & Run

To build this project, you need to obtain a quantized CLIP model.

Run the scripts in this Jupyter notebook step by step. When you reach the "You are done" section, you should have the following model files in the ./result directory:

  • clip-image-int8.ort
  • clip-text-int8.ort

If you don't want to run the scripts, you can download the model files directly from Google Drive.

Put them into app\src\main\assets and you're ready to go.
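Before building, it can help to confirm that both model files actually landed in the assets directory. The snippet below is an optional sanity check and not part of the project. `requiredModels` and `missingModels` are hypothetical names, and the relative path assumes the check is run from the repository root.

```kotlin
import java.io.File

// File names taken from the model list above.
val requiredModels = listOf("clip-image-int8.ort", "clip-text-int8.ort")

// Returns the names of any required model files that are absent from assetsDir.
fun missingModels(assetsDir: File): List<String> =
    requiredModels.filterNot { File(assetsDir, it).isFile }

fun main() {
    // Assumes the working directory is the repository root.
    val assetsDir = File("app/src/main/assets")
    val missing = missingModels(assetsDir)
    if (missing.isEmpty()) {
        println("All model files are in place.")
    } else {
        println("Missing from ${assetsDir.path}: ${missing.joinToString()}")
    }
}
```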

Acknowledgment

License

This project is open source under the MIT License. All rights reserved.