surrealdb/surrealml

Model name and version missing when importing `.surml` file into SurrealDB.

Ce11an opened this issue · 0 comments

Example

Example code taken from the README.md:

from sklearn.linear_model import LinearRegression
from surrealml import SurMlFile, Engine
from surrealml.model_templates.datasets.house_linear import HOUSE_LINEAR # click on this HOUSE_LINEAR to see the data


if __name__ == '__main__':

    # train the model
    model = LinearRegression()
    model.fit(HOUSE_LINEAR["inputs"], HOUSE_LINEAR["outputs"])

    # package and save the model
    file = SurMlFile(model=model, name="linear", inputs=HOUSE_LINEAR["inputs"], engine=Engine.SKLEARN)

    # add columns in the order of the inputs to map dictionaries passed in to the model
    file.add_column("squarefoot")
    file.add_column("num_floors")

    # add normalisers for the columns
    file.add_normaliser("squarefoot", "z_score", HOUSE_LINEAR["squarefoot"].mean(), HOUSE_LINEAR["squarefoot"].std())
    file.add_normaliser("num_floors", "z_score", HOUSE_LINEAR["num_floors"].mean(), HOUSE_LINEAR["num_floors"].std())
    file.add_output("house_price", "z_score", HOUSE_LINEAR["outputs"].mean(), HOUSE_LINEAR["outputs"].std())

    # save the file
    file.save(path="./linear.surml")

    # load the file
    new_file = SurMlFile.load(path="./linear.surml", engine=Engine.SKLEARN)

    # Make a prediction (both should be the same due to the perfectly correlated example data)
    print(new_file.buffered_compute(value_map={"squarefoot": 5, "num_floors": 6}))
    print(new_file.raw_compute(input_vector=[5, 6]))

I get the outputs successfully:

[5.013289451599121]
[5.013289451599121]

I can also import the model into the database:

surreal start --log trace --user root --pass root --bind 0.0.0.0:8000 file:mydatabase.db

and

surreal ml import linear.surml --namespace test --database test --conn http://localhost:8000

gives

2024-02-03T09:42:40.689501Z  INFO surreal::cli::ml::import: The SurrealML file was imported successfully

However, the trace log shows that the model name and version are empty in the generated DEFINE MODEL statement (note the ml::``<>):

2024-02-03T09:42:40.689088Z DEBUG request:process:executor: surrealdb::dbs::executor: Executing: DEFINE MODEL ml::``<> COMMENT '' PERMISSIONS FULL otel.kind="server" http.request.method="POST" url.path="/ml/import" network.protocol.name="http" network.protocol.version="1.1" otel.name="POST /ml/import" http.route="/ml/import" http.request.id="59568a8f-f6f0-41bf-be09-c29446f6ddbd" client.address="127.0.0.1"
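To make the problem easier to spot in a long trace, here is a minimal sketch that pulls the name and version out of a `DEFINE MODEL` statement in a log line. The helper `parse_define_model` and its regex are my own illustration, not part of surrealml or SurrealDB, and the "healthy" form `ml::linear<0.0.1>` is assumed from the query syntax used later in this issue.

```python
import re

# Hypothetical helper, not part of surrealml or SurrealDB.
# Matches "DEFINE MODEL ml::<name><<version>>", with optional backticks around the name.
DEFINE_MODEL = re.compile(
    r"DEFINE MODEL ml::`?(?P<name>[^`<\s]*)`?<(?P<version>[^>]*)>"
)


def parse_define_model(log_line: str):
    """Return (name, version) from a trace line, or None if no statement is found."""
    match = DEFINE_MODEL.search(log_line)
    if match is None:
        return None
    return match.group("name"), match.group("version")


if __name__ == "__main__":
    # Shortened copy of the statement from the trace above.
    broken = "Executing: DEFINE MODEL ml::``<> COMMENT '' PERMISSIONS FULL"
    print(parse_define_model(broken))  # prints ('', '')
```

Both fields coming back as empty strings matches what the trace shows: the statement is executed, but it defines a model with no name and no version.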

This becomes more apparent when performing a query with the model:

surreal sql --username root --password root --namespace test --database test

running

test/test>  ml::linear<0.0.1>({squarefoot: 1000, num_floors: 1})

gives

["The model 'ml::linear<0.0.1>' does not exist"]

Versions

  • surreal: 1.1.1 for macos on aarch64
  • python: 3.10.11
  • surrealml: latest

Potential Fix

This behaviour appears to arise from `add_name` and `add_version` being missing from `surml_file.py`.

Acceptance

  • Fix import error.
  • Update examples and tests.
  • Create an integration test with SurrealDB to check that a .surml file can be imported and predictions can be queried.
  • Add an error message/warning to the user when they import a .surml file with a missing name or version into SurrealDB.
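
As a rough idea of what the warning in the last acceptance item could look like, here is a minimal sketch of a check on the model header fields. The function `missing_header_fields` is hypothetical and works on plain strings; how the name and version would actually be read from a `.surml` file is not shown, because that part of the API is exactly what is in question here.

```python
import warnings


def missing_header_fields(name: str, version: str) -> list[str]:
    """Return the labels of header fields that are empty or whitespace-only.

    Hypothetical helper for illustration; not part of surrealml.
    """
    fields = (("name", name), ("version", version))
    return [label for label, value in fields if not value.strip()]


if __name__ == "__main__":
    # Values as they appear in the trace from this issue: both empty.
    missing = missing_header_fields(name="", version="")
    if missing:
        warnings.warn(
            f"model header is missing: {', '.join(missing)}; "
            "the model will not be addressable as ml::name<version>"
        )
```

Returning the list of missing fields rather than raising keeps the choice between a warning and a hard error with the caller, which fits the "error message/warning" wording above.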