VertaAI/modeldb

Creating local dataset versions not working as expected

Aid91 opened this issue · 3 comments

Aid91 commented

Hi,

I am currently using the open-source version of ModelDB, with the latest Docker images for all components:

  • modeldb-backend:2.0.8.1
  • modeldb-proxy:2.0.8.1
  • modeldb-frontend:2.0.8.1
  • modeldb-graphql:2.0.8.1
  • Verta Python client: verta>=0.16.0 (I tried all of these versions)

When I try basic local dataset versioning, no metadata about the files or directories is shown in the frontend. Probably for the same reason, the dataset version never increments (version 1 is always returned).

Code example:

from verta import Client
from verta.dataset import Path
import os

client = Client("http://localhost:3000")
proj = client.set_project("Test project", desc="Test project")
expt = client.set_experiment("Test experiment", desc="Test experiment")


run = client.set_experiment_run(desc="Test experiment run", attrs={})
dataset = client.set_dataset(name="Test dataset")
dataset_version = dataset.create_version(Path("data.csv"))

Result:

connection successfully established
got existing Project: Test project
got existing Experiment: Test experiment
created new ExperimentRun: Run 551637130906217477
created new Dataset: Test dataset in workspace: personal
created new Dataset Version: 1 for Test dataset

When I change the data.csv file and run the same code again, I get dataset version 1 again (no version increment):

created new Dataset Version: 1 for Test dataset
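To rule out the possibility that the file on disk did not actually change between runs, the content can be fingerprinted before each run. The following is a minimal sketch using only the Python standard library; the helper name file_sha256 is hypothetical and is not part of the verta client, and it says nothing about how verta itself fingerprints files.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    Hypothetical helper for this issue only; it is not a verta API.
    The file is read in chunks so large datasets do not need to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling file_sha256("data.csv") before and after editing the file should give two different digests. If it does and the client still reports version 1, the file content is not the cause.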

If I downgrade the Python client to verta==0.15.*, dataset versioning works again, but some methods, such as dataset.get_latest_version(), throw an exception: HTTPError: 501 Server Error: Method ai.verta.modeldb.DatasetVersionService/getDatasetVersionById is unimplemented for url: ...

This leads to my final question: does the latest open-source version of ModelDB support local dataset versioning? If so, which component versions (modeldb-backend, modeldb-proxy, etc.) and which Python client version are compatible?

Thanks in advance!

Hi @Aid91, thank you for your continued interest in ModelDB!

verta==0.16.0 did involve an overhaul of how dataset versions are captured, and our OSS platform may not fully support its operations. verta<0.16.0 would be the best bet for core functionality, though a few methods (such as get_latest_version()) may also be absent from OSS.
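As a rough guard while this is open, a script can check which client version is installed before it touches datasets. This is only a sketch: the function names below are hypothetical, the (0, 16, 0) threshold is taken from the version mentioned above, and the parsing is deliberately simple (leading numeric components only), so treat it as an assumption-laden illustration and not official guidance.

```python
from importlib import metadata


def parse_version(text, width=3):
    """Parse the leading numeric components of 'X.Y.Z' into a tuple of ints.

    Stops at the first non-numeric piece (for example 'rc1') and pads with
    zeros so that '0.16' compares equal to '0.16.0'.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    parts.extend([0] * (width - len(parts)))
    return tuple(parts)


def is_pre_overhaul(version_text, threshold=(0, 16, 0)):
    """True if the version is below the dataset-versioning overhaul (assumed 0.16.0)."""
    return parse_version(version_text) < threshold


def installed_client_version(package="verta"):
    """Return the installed package version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

For example, is_pre_overhaul(installed_client_version() or "0.0.0") can gate dataset code in a script, and the client can be pinned with pip install "verta<0.16.0".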

In the meantime, I shall file a ticket for us at Verta to follow up on.

Hi @convoliution

I am seeing a similar error and would like to know whether there will be any new OSS releases past 2.0.8.1.

I've tried building the server-side components from the master branch several times, but the builds never succeeded.

I'm also asking because the 2.0.8.1 release contains a vulnerable log4j version and really needs an update:
https://github.com/VertaAI/modeldb/blob/v2.0.8.1/backend/pom.xml#L22