MichaelSolati/geofirestore-js

Query speed

Closed this issue · 6 comments

Hi Michael,

I'm building a dating app and using Firebase with Firestore for the backend.

I'm running into an issue where my users-by-proximity query is taking ages and I'm wondering if you could give some pointers as to maybe where I'm going wrong.

My seed data amounts to about 10k user documents, which is where the g, lon, lat data is held alongside other user-related data, e.g. gender, bio, display name, age...

At the moment, a simple proximity query that yields about 2000 results is taking ~20s.

I'm wondering whether keeping my geolocation data alongside the other user data is hindering the query speed. I am tempted to try putting my geolocation data in a subcollection of user, doing a collectionGroup query instead, and then getting the parent from those results... I haven't tried it yet, but I thought I'd check before I waste my time.

query

const userDB: GeoCollectionReference = geofirestore.collection('users');
const myGeolocation = await userDB.doc(ME).get();
const MyUserData: GeoFirestoreTypes.GeoDocumentData|undefined = myGeolocation.data();
const coordinates: admin.firestore.GeoPoint|undefined = MyUserData?.coordinates;

perf.start('userQuery');
const usersQuery = await (
    userDB
        .near({
            center: coordinates,
            radius: radius,
        })
        .where('completedOnboarding', '==', true)
).limit(2000).get();
const userQueryTime = perf.stop('userQuery'); // ~15-20s

User data shape

{
    age: number,
    city: string,
    completedOnboarding: boolean,
    coordinates: GeoPoint,
    displayName: string,
    dob: Date,
    email: string,
    g: {
        geohash: 'gcp46rhxs7',
        geopoint: GeoPoint,
    },
    gallery: string[],
    gender: string,
    lat: number,
    lon: number,
    orientation: string,
    photoURL: string,
    sports: string[],
    termsAndConditions: boolean,
    uid: string,
}

benchmarks

No. of results | Time taken (s)
2000 | 22.101284382
2000 | 22.544957726
2000 | 21.331829744
2000 | 21.573683267
2000 | 21.901356968
2000 | 20.484823754
2000 | 22.350238785
2000 | 23.45303113
2000 | 26.553158478
2000 | 25.363396048

Cheers!

So using get means that the library needs to resolve around 8 queries on average (we need multiple queries to cover the desired area of your geoquery with geohashes). With your limit, the library applies the limit to each of those queries, then sorts all of the results once the queries resolve and limits them locally.

So in order to make 1 geoquery with a limit you're in fact resolving 8ish queries and then a sort. I'd recommend if you need this to run faster it would be best to use onSnapshot, as it should update the data as it comes in rather than waiting for everything to resolve first.
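To make the cost of a limited geoquery concrete, here is a minimal sketch of the merge step described above. It is not the library's actual code: the `Hit` shape and the `mergeAndLimit` name are hypothetical, and the de-duplication is a defensive assumption in case two sub-queries return the same document.

```typescript
// Hypothetical shape of one matched document; `distance` is measured
// from the query center in whatever unit the query reports.
interface Hit {
  id: string;
  distance: number;
}

// Each inner array stands in for the results of one per-geohash sub-query.
function mergeAndLimit(perQueryHits: Hit[][], limit: number): Hit[] {
  const seen: Record<string, boolean> = {};
  const merged: Hit[] = [];
  for (const hits of perQueryHits) {
    // The limit is applied to every sub-query individually first...
    for (const hit of hits.slice(0, limit)) {
      // Assumption: skip a document that an earlier sub-query already returned.
      if (!seen[hit.id]) {
        seen[hit.id] = true;
        merged.push(hit);
      }
    }
  }
  // ...then everything is sorted by distance and limited once more, locally.
  return merged.sort((a, b) => a.distance - b.distance).slice(0, limit);
}
```

With 8 sub-queries and a limit of 2000, up to 16,000 documents may be fetched and sorted before 2000 are returned, which is one plausible reason the call is memory-hungry.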

Hey @francisleigh unless you have anything else to add here I'll close this issue in about 24 hours.

@MichaelSolati hey Michael, sorry for the late reply, I appreciate your help.

In the end the only thing that helped was upping my memory allocation for the Firebase function to 8GB which brought the time down to ~3.8s

Still not completely ideal but it's ok for now.

I also removed the limit as a limit is not actually ideal for my algorithm.

Really appreciate your help!

Out of interest, would you expect the radius search to execute faster than what I'm experiencing? Again, the only thing that made an impact at all was the memory allocation. Using onSnapshot made very little difference, especially as I removed the limit.

Cheers!

I'm genuinely unsure. I don't often run this lib on Firebase functions, but I know some of the base tier products aren't that fast... (I've run into issues with Cloud Build being kinda slow for building Angular sites). Separately, you could always do the limit on the client. The object that a query returns also includes the distance from the origin. If you add that distance to each element you return to the client from the function, you can sort and limit them on the client instead of in the Cloud Function (this is basically what limit does, but if the Cloud Function is running slow then hopefully the user's device would be faster).
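A rough sketch of that suggestion, under stated assumptions: `ResultDoc` is a hypothetical minimal view of a query result document (the thread only says it carries the distance from the origin), and `NearbyUser`, `toPayload` and `nearest` are illustrative names, not part of the library.

```typescript
// Hypothetical minimal view of one document in a geoquery result.
interface ResultDoc {
  id: string;
  distance: number;
  data(): { displayName: string };
}

// Hypothetical payload item sent from the Cloud Function to the client.
interface NearbyUser {
  uid: string;
  displayName: string;
  distance: number;
}

// Cloud Function side: copy the distance onto each element that is returned.
function toPayload(docs: ResultDoc[]): NearbyUser[] {
  return docs.map((doc) => ({
    uid: doc.id,
    displayName: doc.data().displayName,
    distance: doc.distance,
  }));
}

// Client side: sort nearest-first and keep the first `limit` entries.
// slice() copies the array so the caller's data is not reordered.
function nearest(users: NearbyUser[], limit: number): NearbyUser[] {
  return users
    .slice()
    .sort((a, b) => a.distance - b.distance)
    .slice(0, limit);
}
```

On the client this would be called as `nearest(payload, 50)`, where `payload` is whatever the function returned and 50 is an arbitrary example limit.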

Are you doing some native app development?

@MichaelSolati Yes, React Native.

I did run the query on the client but it took ages and actually crashed my app 🤣

On the browser it's fine but yes on Native Mobile and/or Firebase function, the filtering seems a bit slow.

Genuinely appreciate this library and your feedback!

Looking forward to Firebase maybe addressing this issue. Until then I'm going to keep using this lib until I can develop my own backend.

Sure, sorry I couldn't be of more help though. Best of luck on your app!