imagetokenizer is a Python package that encodes visual inputs and generates token ids from a codebook. It supports both images and videos.

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

ImageTokenizer: Unified Image and Video Tokenization

Welcome to the ImageTokenizer repository! 🎉 This Python package is designed to simplify the process of image and video tokenization, a crucial step for various applications such as image/video generation and understanding. We provide a variety of popular tokenizers with a simple and unified interface, making your coding experience seamless and efficient. 🛠️

⚠️💡 Note that this project is still in its early stages of development. We welcome contributions from the community to help us improve and expand the package. Please star and fork the repository if you find it useful. We are working on applications built with imagetokenizer, such as image/video generation and understanding. Stay tuned!

Features

  • Unified Interface: A consistent API for all supported tokenizers.
  • Extensive Support: Covers a range of popular image and video tokenizers.
  • Easy Integration: Quick setup and integration with your projects.
  • Multiple Tokenizers: Supports Magvit2, OmniTokenizer, Titok, and others.
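
To illustrate what a unified interface buys you, here is a minimal sketch, not taken from the imagetokenizer source. The `VisualTokenizer` protocol, the `ToyBinTokenizer` class, and the `round_trip` helper are all hypothetical names invented for this example; the real package's classes and signatures may differ. The point is only that code written against a shared `encode`/`decode` shape works with any tokenizer that follows it.

```python
from typing import List, Protocol, Sequence, Tuple


class VisualTokenizer(Protocol):
    """Hypothetical shared interface (an assumption, not the package's API)."""

    def encode(self, pixels: Sequence[float]) -> List[int]: ...

    def decode(self, token_ids: Sequence[int]) -> List[float]: ...


class ToyBinTokenizer:
    """Toy stand-in: maps each value in [0, 1] to one of `levels` bins."""

    def __init__(self, levels: int = 4) -> None:
        self.levels = levels

    def encode(self, pixels: Sequence[float]) -> List[int]:
        # Clamp so that a value of exactly 1.0 falls into the last bin.
        return [min(int(p * self.levels), self.levels - 1) for p in pixels]

    def decode(self, token_ids: Sequence[int]) -> List[float]:
        # Reconstruct each value as the centre of its bin.
        return [(t + 0.5) / self.levels for t in token_ids]


def round_trip(
    tokenizer: VisualTokenizer, pixels: Sequence[float]
) -> Tuple[List[int], List[float]]:
    """Works with any object that provides encode() and decode()."""
    ids = tokenizer.encode(pixels)
    return ids, tokenizer.decode(ids)


ids, recon = round_trip(ToyBinTokenizer(levels=4), [0.0, 0.3, 0.99, 1.0])
print(ids)
print(recon)
```

Swapping in a different tokenizer would only change the object passed to `round_trip`, which is the practical benefit of a consistent API.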

Updates

  • 🔥2024.06.22: Titok is now supported! It currently needs the fewest tokens of the supported tokenizers.
  • 🔥2024.06.22: OmniTokenizer is now supported!

Supported Tokenizers

Here's a list of the currently supported image tokenizers:

  • OmniTokenizer: Versatile tokenizer capable of handling both images and videos.
  • OpenMagvit2: An open-source version of Magvit2, renowned for its excellent results.

Getting Started

To get started with ImageTokenizer, follow these simple steps:

Installation

You can install ImageTokenizer using pip:

pip install imagetokenizer

Usage

Here's a quick example of how to use Magvit2Tokenizer:

from imagetokenizer import Magvit2Tokenizer

# Initialize the tokenizer
image_tokenizer = Magvit2Tokenizer()

# Tokenize an image
quants, embedding, codebook_indices = image_tokenizer.encode("path_to_your_image.jpg")

# Print the token ids (indices into the codebook)
print(codebook_indices)

# Reconstruct the image from the quantized latents
image = image_tokenizer.decode(quants)
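
For readers new to the idea of "token ids from a codebook", the following is a minimal, generic sketch of vector quantization by nearest-neighbour lookup. It is not the implementation used by Magvit2Tokenizer or any other tokenizer in this package: real tokenizers first run a learned encoder, and some derive their indices in a different way. The function names (`squared_distance`, `quantize`, `lookup`) and the tiny 2-D codebook are assumptions made up for illustration.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Sequence[float]], codebook: Sequence[Sequence[float]]
) -> List[int]:
    """Return, for each latent vector, the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(z, codebook[k]))
        for z in latents
    ]


def lookup(
    indices: Sequence[int], codebook: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Map token ids back to their codebook vectors (the quantized latents)."""
    return [list(codebook[i]) for i in indices]


# Illustrative codebook with four 2-D entries (hypothetical values).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Stand-in for encoder outputs; a real tokenizer would compute these from pixels.
latents = [[0.1, -0.2], [0.9, 0.8], [0.2, 0.7]]

indices = quantize(latents, codebook)   # token ids
quantized = lookup(indices, codebook)   # vectors a decoder would consume

print(indices)
print(quantized)
```

In this picture, the token ids play the role of `codebook_indices` in the usage example and the looked-up vectors play the role of `quants`, though the actual shapes and types returned by the package are not specified here.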

Documentation

For more detailed information and examples, please refer to our official documentation.

Contributing

We welcome contributions! If you have an idea for a new tokenizer or want to improve existing ones, feel free to submit a pull request or create an issue. 🔧

License

ImageTokenizer is open-source and available under the MIT License.

Acknowledgements

We would like to thank all the contributors and the community for their support and feedback. 🙏