vapor/vapor

Query filtering and sorting by distance

Fa1zer opened this issue · 5 comments

I need to filter and sort users by distance(it should not exceed 100 kilometers and the first user should be the closest). I have the coordinates latitude and longitude. I need to do this specifically with query methods filter and sort for good optimization. How can I do this? I'm using PostgreSQL db. If it is impossible to do it with latitude and longitude, but there are ways to store user location without latitude and longitude and calculate distance, you can suggest me such a solution, I think it will work for me. I'm fine with whatever is the most optimized solution.

To perform geometric calculations you can use the PostGIS extension, which unless you want to write a ton of complex maths is the best way. See https://github.com/brokenhandsio/fluent-postgis for a Fluent compatible library

@Fa1zer, Just want to point out that there is an alternative way because there is a subset of CoreLocation available on Linux, including func distanceFromLocation(location: CLLocation) -> CLLocationDistance, that can be used to sort query results:
https://github.com/j22chan/CoreLinuxLocationKit

I'd caution against doing it on the server as opposed to the DB because you'll need to load all the data from the table you want across the network (potentially the entire table which could get quite big) and then filter on that. You're much better off handing that off to a database if you can

@0xTim Well, we're not required to choose one of these methods, we can combine them both. I haven't used GIS functions in Postgres, so I can't say anything about their speed, but similar functions in SQLIte were quite slow (and it's pretty obvious why), so it was convenient to use this technique:

  1. Calculate the coordinates of the corners of a square whose width and height are equal to the required radius.

  2. Select from the database all points from this square. This is an elementary and very fast query. As a result, we get more points than we need, because we have not yet "cut off" the corners of the square and not yet turned it into a circle.

  3. Calculate the distance to the points in memory, discard the extra ones, sort them.

This technique allows us to solve the problem with minimal memory overhead and speed penalty on any database, regardless of whether this database supports any GIS functions.

I want to add that the area of a square is larger than the area of a circle inscribed in it by about 27%. With a uniform distribution of points, as a result of a database query, we will receive, on average, 27% more points than we need. It seems to me that this is a minor overhead, which is quite affordable.