moleculerjs/moleculer-db

Sequelize: What is the value of counting DISTINCT on a primary key?

jodysimpson opened this issue · 5 comments

const strictCountRule = { distinct: true, col: `${this.model.name ||this.model.tableName}.${this.model.primaryKeyAttribute}` };

This change was introduced to add DISTINCT(<primaryKey>) to the counts. With Postgres in particular, this dramatically slows down queries on large tables because the planner will try to sort the values. Link to Stackoverflow discussing this issue. I'm not sure if I'm missing something, but wouldn't adding DISTINCT in this case be pointless because you're always counting the primary key which is unique to begin with?

This change is added by the #304
@DenisFerrero could you explain it?

Ok take for example a user-auths binding through an M-N, I applied the below defaultScope to the users

User.addScope('defaultScope', {
  include: [{
    model: models.auths, as: 'auths',
    through: { attributes: [] }
    attributes: ['id', 'name'],
    order: [['name', 'ASC']]
  }]
});

Then in your database, you have the following data

AUTHS

id name
1 application
2 administration
3 debugging

USERS

id name
1 John Doe

USERS_AUTHS (M-N table)

user_id auth_id
1 1
1 2
1 3

When you try to call the list action for the users the data query will look like this

SELECT 
  user.id AS "user.id", user.name AS "user.name",
  auth.id AS "auth.id", auth.name AS "auth.name"
FROM users AS user
JOIN users_auths AS ua ON ua.user_id = user.id
JOIN auths AS auth ON au.auth_id = auth.id

and the results like this

user.id user.name auths.id auths.id
1 John Doe 1 application
1 John Doe 2 administration
1 John Doe 3 debugging

The count will have a similar pattern but will count each row so the results will be 3 and not 1.
The DISTINCT(user.id) just helps to count just once the user and get the right result
This defaultScope pattern's pretty much unconventional as services should be independent of each other (errors as a newbie moleculer dev).
The same result can be performed with a basic default scope and calling a populate method on request.

If you decide to remove it just let me know so I can dedicate some time to fix this behavior in my code

@DenisFerrero I feel like your example breaks the intent of Moleculer DB having one model per service.

Additionally, a count query designed to paginate a larger query should return the same overall count as the original query. Adding behavior to accommodate an unintended pattern does not seem like the best choice to me, especially when the change dramatically slows down the intended usage pattern for larger tables.

Yeah, you're right it was my mistake when I started designing the application. Currently, I don't have much time but I'll revert it to the previous version and create a PR

Sorry if I disappeared for a while but I have been on other works. I created a Pull Request (#377) with the revert of the changes as we discussed