microsoft/mssql-django

UUID queries are made with NCHAR

nils-van-zuijlen opened this issue · 3 comments

There are some features which are not supported yet. Please check the Limitations first to see if your bug is listed.

Software versions

  • Django: 4.2
  • mssql-django: 1.4.2
  • python: 3.11.8
  • SQL Server: 14.0.0
  • OS: Linux 6.6.36 NixOS SMP PREEMPT_DYNAMIC Thu Jun 27 11:49:15 UTC 2024 x86_64 GNU/Linux

Table schema and Model

class Foobar(models.Model):
    reference = models.UUIDField(_("reference"), default=uuid.uuid4, primary_key=True)

Database Connection Settings

DATABASES = {
    "default": {
        "ENGINE": "mssql",
        "NAME": os.getenv("DB_NAME"),
        "USER": os.getenv("DB_USER"),
        "PASSWORD": os.getenv("DB_PASSWORD"),
        "HOST": os.getenv("DB_HOST", "127.0.0.1"),
        "PORT": int(os.getenv("DB_PORT", "1433")),
        "OPTIONS": {
            # This database doesn't have any triggers so can use return
            # rows from bulk insert feature
            "return_rows_bulk_insert": True
        },
    }
}

Problem description and steps to reproduce

The UUID column gets created using a CHAR(24) column, and gets a corresponding primary key index, also in CHAR(24).

When executing a query filtering on that column, using Foobar.objects.get(reference=uuid.UUID(...)), the uuid gets passed as an NCHAR to the database. This makes MSSQL do a full index scan, converting all index values to NCHAR to compare them.

Expected behavior and actual behavior

I would expect the UUID to be passed as CHAR values, in order to have parameters match the index and column types without conversion.

Any other details that can be helpful

The workaround we found for our most used queries is to do them manually and insert the uuid directly into the query string, taking care to ensure only valid UUIDs can be inserted at that point to avoid SQL injection. But that approach is not scalable to all queries.

This is caused by https://github.com/django/django/blob/c5d196a65264136ee6795356871a29f3d22ec52f/django/db/models/fields/__init__.py#L2674

As can be tested with the following query:

cursor.execute(
    "SELECT CONVERT(nvarchar(max), SQL_VARIANT_PROPERTY(%s, 'BaseType'))",
    (uuid.uuid4().hex,)
).fetchall()

[('nvarchar',)]

This can be fixed using native fields, as in #134, because the types will then match between the column and the query:

cursor.execute(
    "SELECT CONVERT(nvarchar(max), SQL_VARIANT_PROPERTY(%s, 'BaseType'))",
    (uuid.uuid4(),)
).fetchall()

[('uniqueidentifier',)]

Thanks for bringing this up. We will look into adding this.