Same record on different pages with PSQL and Order By Non-Unique column
Opened this issue · 3 comments
This is a bug.
Prerequisites
- Are you running the latest version?
- Are you reporting to the correct repository?
- Did you check the documentation?
- Did you perform a cursory search?
Description
This is a bug coming from this: https://stackoverflow.com/questions/13580826/postgresql-repeating-rows-from-limit-offset
Basically, whenever a table column is sorted by, using the table header sort action, if there are more rows with the same value on this column, and this set of data exceeds the page limit(ex Status
column has Active
value set for 100 users, but we only see 10 users per page - we will have 10 pages with Active
users), it can be possible to see the same record from page one, on page two as well, because all rows that are returned have the same value for the non-unique column (ex: Status
). In that case the database is free to return the rows in any order it wants.
On PostgreSQL I've tested and it is indeed generating a random order for the records with the same ordered-by-non-unique-column, and I can see record ID 197 on page 1,3,4...
On MySQL, I see that this is not happening, at least not on laravel-enso.com, on Company table, with Status column. In the link attached someone explained that maybe MySQL uses a cluster index ordering after the order set by query on Status, so that's why it might not behave the same.
With this context set, a possible solution for us, for PostgreSQL, is to to override the following class/method (and bind it):
vendor/laravel-enso/tables/src/Services/Data/Sorts/Sort.php
public function handle(): void
{
$sort = new CustomSort($this->config, $this->query);
if ($sort->applies()) {
$sort->handle();
//INJECT HERE AN ADDITIONAL ALWAYS-ON SORTING ON "ID" HAVING THE SAME TYPE (ASC/DESC) AS EXISTING SORT
} elseif (! $this->query->getQuery()->orders) {
$this->query->orderBy($this->config->template()->get('defaultSort'));
}
}
So that in the end, no matter what the user is sorting by (ex: Status), we will add an additional sorting step, in order to avoid the repeating record issue:
Before:
... ORDER BY Table.Status ...
After:
... ORDER BY Table.Status, Table.Id ...
@aocneanu - would you accept a PR for a change like:
vendor/laravel-enso/tables/src/Services/Data/Sorts/Sort.php
BEFORE:
$sort = new CustomSort($this->config, $this->query);
AFTER:
$sort = App:make('CustomSort', [$this->config, $this->query]);
Opening flexibility to extended and bind this CustomSort locally?
The same problem happens in MySQL too.
I guess a PR where we will consistently enforce the default sort in addition to the custom sort would be a good approach.
I want to test one app with millions of entries and see if this strategy impacts performance. So even though I'm not expecting problems, I still want to try first.
I will report back asap.
I guess a PR where we will consistently enforce the default sort in addition to the custom sort would be a good approach.
I don't think that this is a complete solution. Or may lead to a confusing behavior.
Think about a table where you have set the default sort on date DESC
(for example you see most recent logs on top of the table records). Now, If I'll manually sort by a status
column and enforce the default sort in addition it will result in a order by status asc, date desc
, which does not guarantee sorting on an unique column.
Personally we went for something like:
vendor/laravel-enso/tables/src/Services/Data/Sorts/CustomSort.php
public function handle(): void
{
$this->columns()
->each(fn ($column) => $this->query->when(
$column->get('meta')->get('nullLast'),
fn ($query) => $query->orderByRaw($this->rawSort($column)),
fn ($query) => $query
->orderBy($column->get('data'), $column->get('meta')->get('sort'))
));
// INJECTED THIS CODE:
if (!is_null($this->columns()->first())) {
if ($this->columns()->first()->get('data') != "{$this->query->from}.id") { //avoid setting sorting on ID twice.
$this->query->orderByRaw(
"{$this->query->from}.id {$this->columns()->first()->get('meta')->get('sort')}" //applying the same sort direction as the main sort for on "id".
);
}
}
}
I'm not saying it's the best solution, but it did not interfere with any of the scenarios we've had, so for us is working fine.
There are some assumptions that we've made:
- table has an
id
column from
table is the one with unique ids, and there's not a complicated join situation where main table records are duplicated.