dotnet/efcore

Uniquify and validate Check Constraint name

Neme12 opened this issue ยท 15 comments

HasCheckConstraint has a name parameter that it uses as the name of the constraint in the database.

Constraints are database objects and their names must be unique in the whole database (not just the table they belong to... as I found out the hard way by getting into a conflict ๐Ÿ˜„), so to mitigate this we should a use naming convention. Right now we have to do this manually and prepend our constraint names with the name of the table like this:

builder.HasCheckConstraint("AspNetUsers_UserName_SameAsEmail", ...);
builder.HasCheckConstraint("AspNetUsers_Created_NotBySelf", ...);
builder.HasCheckConstraint("AspNetUsers_LastLogonAtUtc_LastLogonInfo_BothOrNonePresent", ...);

That's why I created an extension method that does this for me:

/// <summary>
/// Calls <see cref="RelationalEntityTypeBuilderExtensions.HasCheckConstraint{TEntity}(EntityTypeBuilder{TEntity}, string, string)"/>
/// with the name of the table prepended to <paramref name="name"/> to prevent constraint name conflicts.
/// </summary>
public static EntityTypeBuilder<TEntity> HasLocalCheckConstraint<TEntity>(this EntityTypeBuilder<TEntity> entityTypeBuilder, string name, string sql) where TEntity : class
{
    var tableName = entityTypeBuilder.Metadata.GetTableName();
    return entityTypeBuilder.HasCheckConstraint(FormattableString.Invariant($"CK_{tableName}_{name}"), sql);
}

and also uses the standard CK prefix as a convention. Then I only use HasLocalCheckConstraint to create check constraints and have to be careful to never call HasCheckConstraint directly.

It would be nice if EF Core had this feature builtin or could do this automatically. It would also ensure that all database object created by EF Core follow the naming convention that it uses now, with a two letter prefix for the type of the object (such as PK, AK, FK, IX).

Just changing the existing behavior of HasCheckConstraint would of course be a huge breaking change and adding a parameter to opt-in to this behavior would be clumsy and it should really be the default in my opinion.

I also noticed that while methods like HasDefaultValue, HasDefaultValueSql and HasComputedColumnSql follow the convention where if they take SQL, their names end with Sql, and if there's an expression-based version, it doesn't. What if the existing method was deprecated and the new name would be HasCheckConstraintSql with this new naming behavior? If there's an expression-based version added later (#15409), that would take the name HasCheckConstraint and still follow the new behavior because it would be a separate overload from the deprecated one (or the deprecated one could be gone by that time).

To allow for compatibility with existing constraint names, the new method could take an optional parameter (bool or enum) to disable the name transformation, but it should be on by default in my opinion.

Alternatively, rather than differentiating the behavior between HasDefaultValue and HasDefaultValueSql and/or adding a parameter to specify the behavior, maybe this could be a new global convention that is applied by default, and for users that want to keep existing behavior, they could disable the convention globally?

Note for triage: we should do a review of the CheckConstraint code. Looks like the design could be lacking in a few areas. Looking at the PR it seems like a couple of people did review it, but we still missed things; we should discuss in the retrospective.

My current solution to this is a custom convention, if anyone is interested:

public sealed class ScopedCheckConstraintNameConvention : IModelFinalizedConvention
{
    public void ProcessModelFinalized(IConventionModelBuilder modelBuilder, IConventionContext<IConventionModelBuilder> context)
    {
        foreach (var entityType in modelBuilder.Metadata.GetEntityTypes())
        {
            foreach (var checkConstraint in entityType.GetCheckConstraints().ToList())
            {
                var newName = FormattableString.Invariant($"CK_{entityType.GetTableName()}_{checkConstraint.Name}");

                entityType.RemoveCheckConstraint(checkConstraint.Name);
                entityType.AddCheckConstraint(newName, checkConstraint.Sql);
            }
        }
    }
}

This is better than the extension method because if you set the table name after calling the extension method, it would still use the old table name.

I tried hard but couldn't find a single piece of documentation about constraint naming on SqlServer. It always talks about naming rules but not the scope.

Based on empirical testing data,
There are 2 types of constraints

  • Type 1: Which can be specified in Create table script. PK/AK/Check/FK
  • Type 2: Which needs to be separate from create table. Index/Statistics

Type1 needs to have unique name across database. PK on blog table with same name as AK on post table also fails.
Type2 needs to have same name in given table.
(side effect of above, you can have PK & index with same name as long as in different table.)

  • Check Constraint review
    • In initial implementation check constraints were stored in serialized form like Sequences.
    • We later moved storage to Dictionary<string, CheckConstraint> giving us unique names. Defining a check constraint with same name replaced previous one in both the storage.
    • Then we moved them to entityType since they are in scope of table. But we never added any validation or unification across database. So replacing behavior would persist in the scope of given entityType only.
    • Check Constraint names are optional in SQL but required in model building.

We have validation to see if PK/AK/FK/Index names are unique in given table also accounting for table sharing.
Missing parts:

  • Truncate and uniquify check constraint names
  • Validate type1 unique-ness across whole database.
  • Validate unique name across different types like PK & Index with same name as appropriate scope.

Type1 needs to have unique name across database. PK on blog table with same name as AK on post table also fails.

In SqlServer, Type1 names need to be unique across a schema not the entire database, the code below runs fine.

create schema a
create schema b

create table a.tableName (columnName int, constraint ConstraintName check (columnName=1))
create table b.tableName (columnName int, constraint ConstraintName check (columnName=1))

Supporting documentation - https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-transact-sql

CONSTRAINT Is an optional keyword that indicates the start of the definition of a PRIMARY KEY, NOT NULL, UNIQUE, FOREIGN KEY, or CHECK constraint.

constraint_name Is the name of a constraint. Constraint names must be unique within the schema to which the table belongs.

Removing from backlog to discuss what to do here for the plugin verses manually written calls to HasCheckConstraint.

Notes from team discussion:

  • This should be a convention in the product, not in the plugin
  • As with other conventions like this, it should only truncate and uniquify names set by convention, not names set explicitly through the fluent API.
  • We should add an overload of HasCheckContraint where the name will be created automatically

If you plan to add an overload to HasCheckConstraint that doesn't take a name, it would be great if you could also add an overload to HasDefaultValue that does take a name to align the two.

roji commented

For reference, SQL Server and MySQL require check constraints to be unique across tables, while PostgreSQL and Sqlite do not.

@AndriySvyryd if it's not too hard, we could allow providers to specify whether uniquification is needed, what do you think? Though I agree the extra table name in the CK name isn't a big problem.

@roji I've sent out #25268, naming conventions should become easy to replace for providers, plugins and users with #9329

roji commented

Great, thanks!

Great - all further migrations broken from that point ๐Ÿ‘Ž - trying to drop existing constraints by a name which they never have got in the past.

Great - all further migrations broken from that point ๐Ÿ‘Ž - trying to drop existing constraints by a name which they never have got in the past.

https://docs.microsoft.com/en-us/ef/core/what-is-new/ef-core-6.0/breaking-changes#unique-check-constraints outlines how to mitigate the breaking change

I'm sorry @ryan-carbon but the mitigation shown there doesn't help / is not what I want.

The new unique names are a good thing and I for sure want them for all new stuff or any constraints I modify.

So I finally changed the name parameter of migrationBuilder.DropCheckConstraint() in Up() method - and the according name of migrationBuilder.AddCheckConstraint() in Down() method - in the 20220124_Migration.cs which would not run.

@springy76 We reverted a part of the breaking change in #27142, it will be shipped in 6.0.2

Let us know if you are still experiencing the issue after the patch