Milvus

Ruby wrapper for the Milvus vector search database API.

Part of the Langchain.rb stack.

Available for paid consulting engagements! Email me.

API Docs

https://docs.zilliz.com/reference/restful/data-plane-v2

Installation

Install the gem and add to the application's Gemfile by executing:

$ bundle add milvus

If bundler is not being used to manage dependencies, install the gem by executing:

$ gem install milvus

Usage

Instantiating API client

require 'milvus'

client = Milvus::Client.new(
    url: 'http://localhost:19530'
)

Using the Collections endpoints

# Check if the collection exists.
client.collections.has(collection_name: "example_collection")
# Rename a collection.
client.collections.rename(collection_name: "example_collection", new_collection_name: "renamed_collection")
# Get collection stats
client.collections.get_stats(collection_name: "example_collection")
# Data types: https://github.com/patterns-ai-core/milvus/blob/main/lib/milvus/constants.rb

# Creating a new collection schema
client.collections.create(
  collection_name: "example_collection",
  auto_id: true,
  fields: [
    {
      fieldName: "book_id",
      isPrimary: true,
      autoID: false,
      dataType: "Int64"
    },
    {
      fieldName: "content",
      dataType: "VarChar",
      elementTypeParams: {
        max_length: "512"
      }
    },
    {
      fieldName: "vector",
      dataType: "FloatVector",
      elementTypeParams: {
        dim: 1536
      }
    }
  ]
)
# Describe the collection
client.collections.describe(collection_name: "example_collection")
# Drop the collection
client.collections.drop(collection_name: "example_collection")
# Load the collection to memory before a search or a query
client.collections.load(collection_name: "example_collection")
# Load status of a specific collection.
client.collections.get_load_state(collection_name: "example_collection")
# List all collections in the specified database.
client.collections.list
# Release a collection from memory after a search or a query to reduce memory usage
client.collections.release(collection_name: "example_collection")
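
The field hashes passed to client.collections.create repeat the same keys for every field. Below is a minimal sketch of a plain-Ruby helper that builds them. The name build_fields and its parameters are hypothetical and not part of the milvus gem; the keys and values simply mirror the schema example above.

```ruby
# Hypothetical helper (not part of the milvus gem): builds the `fields`
# array for client.collections.create, using the same keys as the
# schema example in this README.
def build_fields(primary_key:, vector_field:, dim:, text_field: nil, max_length: 512)
  fields = [
    { fieldName: primary_key, isPrimary: true, autoID: false, dataType: "Int64" }
  ]

  # Optional VarChar field; max_length is sent as a string, as in the example above.
  if text_field
    fields << {
      fieldName: text_field,
      dataType: "VarChar",
      elementTypeParams: { max_length: max_length.to_s }
    }
  end

  # The vector field carries the embedding dimension.
  fields << {
    fieldName: vector_field,
    dataType: "FloatVector",
    elementTypeParams: { dim: dim }
  }

  fields
end

fields = build_fields(
  primary_key: "book_id",
  text_field: "content",
  vector_field: "vector",
  dim: 1536
)
```

The resulting array can then be passed as the fields: argument of client.collections.create. Keeping the dimension in one place makes it easier to keep the schema and the inserted vectors in agreement.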

Inserting Data

client.entities.insert(
  collection_name: "example_collection",
  data: [
    { book_id: 1, content: "The quick brown fox jumps over the lazy dog", vector: ([0.1]*1536) },
    { book_id: 2, content: "Lorem ipsum dolor sit amet, consectetur adipiscing elit", vector: ([0.2]*1536) },
    { book_id: 3, content: "Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua", vector: ([0.3]*1536) }
  ]
)
# Delete the entities with the boolean expression you created
client.entities.delete(
  collection_name: "example_collection",
  expression: "book_id in [0,1]"
)
# Inserts new records into the database or updates existing ones.
client.entities.upsert()
# Get specific entities by their IDs
client.entities.get()
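
Every inserted vector has to match the dimension declared in the collection schema (1536 in the example above). Here is a minimal sketch of a pre-insert check that also splits rows into batches. validated_batches is a hypothetical helper, not part of the milvus gem, and the batch size of 100 is an arbitrary assumption rather than a Milvus limit.

```ruby
# Hypothetical helper (not part of the milvus gem): verifies that every row's
# :vector has the expected dimension, then returns the rows split into batches.
def validated_batches(rows, dim:, batch_size: 100)
  rows.each_with_index do |row, index|
    size = row.fetch(:vector).length
    next if size == dim

    raise ArgumentError, "row #{index}: expected #{dim} dimensions, got #{size}"
  end

  rows.each_slice(batch_size).to_a
end

# Small demo with 4-dimensional vectors to keep the example short.
rows = (1..5).map { |i| { book_id: i, content: "doc #{i}", vector: [0.1] * 4 } }
batches = validated_batches(rows, dim: 4, batch_size: 2)

# Each batch could then be sent with the insert call shown above:
# batches.each do |batch|
#   client.entities.insert(collection_name: "example_collection", data: batch)
# end
```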

Indexes

# Create an index
index_params = [
  {
    metricType: "L2",
    fieldName: "vector",
    indexName: "vector_idx",
    indexConfig: {
      index_type: "AUTOINDEX"
    }
  }
]

client.indexes.create(
  collection_name: "example_collection",
  index_params: index_params
)
# Describe an index
client.indexes.describe(
  collection_name: "example_collection",
  index_name: "vector_idx"
)
# List indexes
client.indexes.list(
  collection_name: "example_collection"
)
# Drop an index
client.indexes.drop(
  collection_name: "example_collection",
  index_name: "vector_idx"
)

Search, Querying & Hybrid Search

# Vector similarity search; `embedding` is the query vector (an Array of 1536 Floats in this example)
client.entities.search(
  collection_name: "example_collection",
  anns_field: "vector",
  data: [embedding],
  filter: "id in [450847466900987454]"
)
# Query entities with a filter expression
client.entities.query(
  collection_name: "example_collection",
  filter: "id in [450847466900987455, 450847466900987454]"
)
# Hybrid search with reranking
client.entities.hybrid_search(
  collection_name: "example_collection",
  search: [{
    filter: "id in [450847466900987455]",
    data: [embedding],
    annsField: "vector",
    limit: 10,
    outputFields: ["content", "id"]
  }],
  rerank: {
    "strategy": "rrf",
    "params": {
      "k": 10
    }
  },
  limit: 10,
  output_fields: ["content", "id"]
)
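
The filter strings above are plain text, so building them by hand from a list of IDs is easy to get wrong. A minimal sketch of a helper that does it is shown below. id_filter is a hypothetical name, not part of the milvus gem, and the default field name "id" is only an assumption taken from the examples above.

```ruby
# Hypothetical helper (not part of the milvus gem): builds an "in" filter
# expression such as "id in [1, 2, 3]" from a list of integer IDs.
def id_filter(ids, field: "id")
  raise ArgumentError, "ids must not be empty" if ids.empty?

  # Integer() rejects values that are not valid integers (e.g. "abc"),
  # which keeps arbitrary text out of the expression.
  ints = ids.map { |id| Integer(id) }
  "#{field} in [#{ints.join(', ')}]"
end

filter = id_filter([450847466900987455, 450847466900987454])

# The string can be passed as the filter: argument shown above:
# client.entities.query(collection_name: "example_collection", filter: filter)
```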

Partitions

# List partitions
client.partitions.list(
  collection_name: "example_collection"
)
# Create a partition
client.partitions.create(
  collection_name: "example_collection",
  partition_name: "example_partition"
)
# Check if a partition exists
client.partitions.has(
  collection_name: "example_collection",
  partition_name: "example_partition"
)
# Load partition data into memory
client.partitions.load(
  collection_name: "example_collection",
  partition_names: ["example_partition"]
)
# Release partition data from memory
client.partitions.release(
  collection_name: "example_collection",
  partition_names: ["example_partition"]
)
# Get statistics of a partition
client.partitions.get_stats(
  collection_name: "example_collection",
  partition_name: "example_partition"
)
# Drop a partition
client.partitions.drop(
  collection_name: "example_collection",
  partition_name: "example_partition"
)

Roles

# List roles available on the server
client.roles.list
# Describe the role
client.roles.describe(role_name: 'public')

Users

# Create new user
client.users.create(user_name: 'user_name', password: 'password')
# List of roles assigned to the user
client.users.describe(user_name: 'user_name')
# List all users in the specified database.
client.users.list
# Drop existing user
client.users.drop(user_name: 'user_name')
# Update password for the user
client.users.update_password(user_name: 'user_name', password: 'old_password', new_password: 'new_password')
# Grant role to the user
client.users.grant_role(user_name: 'user_name', role_name: 'admin')
# Revoke role from the user
client.users.revoke_role(user_name: 'user_name', role_name: 'admin')

Aliases

# Lists all existing collection aliases in the specified database
client.aliases.list
# Describes the details of a specific alias
client.aliases.describe
# Reassigns the alias of one collection to another.
client.aliases.alter
# Drops a specified alias
client.aliases.drop
# Creates an alias for an existing collection
client.aliases.create

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and the created tag, and push the .gem file to rubygems.org.

Development with Docker

Run docker compose run --rm ruby_app bash, then install the required gems with bundle install. This gives you a fully working development environment with the Milvus services and the gem's code.

For example, run bin/console inside the Docker container, then in the Ruby console:

client = Milvus::Client.new(url: ENV["MILVUS_URL"])
client.collections.list
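
Outside the Docker environment, MILVUS_URL may not be set. Here is a minimal sketch of a fallback to the local URL used in the Usage section. resolve_milvus_url is a hypothetical helper, not part of the milvus gem.

```ruby
# Hypothetical helper (not part of the milvus gem): returns MILVUS_URL when it
# is set and non-blank, otherwise the local default used earlier in this README.
DEFAULT_MILVUS_URL = "http://localhost:19530"

def resolve_milvus_url(env = ENV)
  url = env["MILVUS_URL"]
  url.nil? || url.strip.empty? ? DEFAULT_MILVUS_URL : url
end

# The result can be passed to the client constructor shown above:
# client = Milvus::Client.new(url: resolve_milvus_url)
```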

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/patterns-ai-core/milvus.

License

milvus is licensed under the Apache License, Version 2.0. View a copy of the License file.