modlinltd/django-advanced-filters

why distinct is used in AdvancedListFilters.queryset() function?

jangedoo opened this issue · 1 comments

I'm curious why queryset.filter(query).distinct() is used instead of queryset.filter(query) in AdvancedListFilters class. It is affecting the query performance quite a lot. For my setup, sql query to fetch items for a page of 40 items in admin view takes 537ms whereas if I remove the distinct call, then it takes 4 ms.
This is a huge difference and I'm not sure if doing a distinct helps with anything at all because pk is always selected.

Excellent question @jangedoo, honestly I have no recollection of why it's there... but, it feels like it came as a result of some de-duplication request from a user... This is further supported by this statement in the distinct() docs

By default, a QuerySet will not eliminate duplicate rows. In practice, this is rarely a problem, because simple queries such as Blog.objects.all() don’t introduce the possibility of duplicate result rows. However, if your query spans multiple tables, it’s possible to get duplicate results when a QuerySet is evaluated. That’s when you’d use distinct().

That said, I'd say that this performance hit warrants a toggle option option in the form/model to turn De-duplication on/off when necessary.

If you're interested in implementing this toggle, PRs are more than welcome 🙇