p2panda/aquadoggo

Incorrect number of documents returned from ordered and filtered collection query

sandreae opened this issue ยท 5 comments

Spotted this behaviour when working on the app and now was able to reproduce the issue in a test:

#[rstest]
fn paginated_query_with_ordering_and_filtering(key_pair: KeyPair) {
test_runner(|mut node: TestNode| async move {
async fn next_page(
client: &TestClient,
schema: &Schema,
first: usize,
end_cursor: Option<&String>,
filter_value: &String,
) -> (usize, usize, String, bool) {
let after = if let Some(end_cursor) = end_cursor {
format!("after: {end_cursor}, ")
} else {
String::new()
};
let data = query_lyrics(
client,
schema.id(),
&format!("(first: {first}, {after} orderBy: line, orderDirection: DESC, filter: {{ line: {{ contains: \"{filter_value}\" }} }})"),
)
.await;
let documents = data["query"]["documents"].as_array().unwrap().len();
let total_count = data["query"]["totalCount"].clone().as_i64().unwrap();
let end_cursor = data["query"]["endCursor"].clone().to_string();
let has_next_page = data["query"]["hasNextPage"].clone().as_bool().unwrap();
return (documents, total_count as usize, end_cursor, has_next_page);
}
// Publish some lyrics to the node.
let (lyric_schema, _) = here_be_some_lyrics(&mut node, &key_pair).await;
// Init a GraphQL client we'll use to query the node.
let client = http_test_client(&node).await;
// We're making paginated queries and filtering by a string value.
let page_size = 3;
let filter_value = String::from("m");
let mut acc = 0;
// Make the initial paginated request.
let (mut page_count, mut total_count, mut end_cursor, mut has_next_page) =
next_page(&client, &lyric_schema, page_size, None, &filter_value).await;
// Add the page count to the accumulator.
acc += page_count;
// Now while there is a next page we keep querying.
while has_next_page {
(page_count, total_count, end_cursor, has_next_page) = next_page(
&client,
&lyric_schema,
page_size,
Some(&end_cursor),
&filter_value,
)
.await;
// Add the page count to the accumulator.
acc += page_count;
}
// We should have received the same number of documents as the total count.
assert_eq!(acc, total_count);
})
}

The returned number of documents does not match the total_count. My anecdotal experience is that documents are missing which should be returned. If one adjusts the page size to include the entire collection then all documents are indeed returned, so it seems to be particular to the case where the collection is paginated over.

We have tests against this particular case (pagination + filter), must be something with exactly this query and shape of data ๐Ÿค”

Yeh, I believe when only querying one field of a document this behaviour doesn't occur. Also if there aren't more than 2 pages to be returned. I don't think it's a very niche error though, I noticed it in two different queries "in the wild" now.

I may have found the bug causing this issue. I have at least found a bug in the pagination logic, which causes duplicate documents to be returned from paginated collection requests. I think there may be cases where it also could cause fewer to be returned, haven't been able to observe that case in tests yet though.

If you look at these results from a pagination request for pages of 3 items (printed values are <DOCUMENT_ID> <CURSOR> <VALUE>):

/// PAGE 1 RESPONSE
"00209b88a128c327bd1178b70d33b6a69189a489c0ff28796e8f0afb43720d07b0dc" "2wivUjpaNHZxwFjh9oic6je1vexfrNzGK3VLmBFBduDX6WEEvmeEisVJoTEzQu8PiM1h3DdhBcnu74xnaMVL5n5d" "Thrash, me crush me, beat me till I fall"
"00204888e94743374740730693f39c778b95d03ce4f340f77dbd374587170ba12536" "22ahi6PbPDws8pRHDjYgyWRVcLV1G68vyjBFMc4wXtFxyVzL6Pj24QvQs1PJgm5BaKNLHS3WqGYPBks2QcNvrVqG" "This heaven gives me migraine"
"0020e9a0884fe15cf137d0bbfb14d590c81ec9e23566b0f0c53da53b937c856213a3" "21DaUA29YhfQZQsm8SR2RTDbquqKoXcJhcXJDdJb3di1KgPJaZipo5sSrQedsEH7LZcQzre7LJaoBsRjdEntswnx" "This heaven gives me migraine"

END CURSOR: "21DaUA29YhfQZQsm8SR2RTDbquqKoXcJhcXJDdJb3di1KgPJaZipo5sSrQedsEH7LZcQzre7LJaoBsRjdEntswnx"

/// PAGE 2 RESPONSE
"00204888e94743374740730693f39c778b95d03ce4f340f77dbd374587170ba12536" "22ahi6PbPDws8pRHDjYgyWRVcLV1G68vyjBFMc4wXtFxyVzL6Pj24QvQs1PJgm5BaKNLHS3WqGYPBks2QcNvrVqG" "This heaven gives me migraine"
"0020b7951d9e3364494ae2f112c9fd411151ae31d29524f5ffcedbcabeb1ecdf81eb" "21D3aVb21xLajurqh5wHuax7AD6T6D5HaTZ4kQkwV3Wtuoywr5BE7aqDfXMbvYQr4vFPAN3XtNauP3efq2jReQYh" "The problem of leisure"
"0020ffea1841d8a75ec453d82c9e3f7347699f6494016f55b0dbe9897c8f12672598" "zGQ5rzG4LDfCtL4ptEMpPLJt9wY9MRmxqipoGFCaMqQ5uErqqJJo7AVTgFeFkJveQR7kXCKteR4srzxAyrPdszf" "The problem of leisure"

END CURSOR: "zGQ5rzG4LDfCtL4ptEMpPLJt9wY9MRmxqipoGFCaMqQ5uErqqJJo7AVTgFeFkJveQR7kXCKteR4srzxAyrPdszf"

You can see that the document with cursor 22ah.. is returned twice, once in the first response, and once in the second. This happens when its value is the same as the document with cursor 21Da.. (this is the end cursor) and its cursor is > the end cursor, meaning it will be selected as the first document in the second query.

Looking at the code, we seem to be trying to account for this case here:

// Always order by document view id, except of if it was selected manually. We need this to
// assemble the rows to documents correctly at the end
let id_sql = match order.field {
Some(Field::Meta(MetaField::DocumentViewId)) => None,
_ => Some("documents.document_view_id ASC".to_string()),
};
// On top we sort always by the unique operation cursor in case the previous order value is
// equal between two rows
let cursor_sql = match list {
Some(_) => Some("operation_fields_v1_list.cursor ASC".to_string()),
None => Some("operation_fields_v1.cursor ASC".to_string()),
};

But actually, the cursor ordering won't make any difference as the document view id is different so this ordering will take precedent. If we don't order by document view id then this issue goes away, but that's not the behaviour we want (it causes errors elsewhere).

Haven't come up with a solution yet, just documenting findings ๐Ÿ‘

Haven't looked closely at this issue yet but my gut feeling tells me that there's something wrong with sorting. We kinda only want to sort by cursor, everything else (even document id and view id) should be optional and only done when selected by the user.

Yeh, that's the conclusion I came to as well. Think the only issue there is that we expect rows to be returned ordered by document id for when they are parsed into documents. So if we only order by cursor here, then we need to account for that when parsing.