meilisearch/meilisearch-rust

A quotation mark in the primary key silently fails the entire `index.add_documents` call.

gibbz00 opened this issue · 3 comments

Source

// lib.rs
#![allow(dead_code)]

use meilisearch_sdk::{indexes::Index, Client};
use serde::Serialize;

#[derive(Serialize)]
struct SimpleStruct<'a> {
    id: &'a str,
    value: usize,
}

const NO_QUOTE: SimpleStruct = SimpleStruct {
    id: "test",
    value: 1,
};
const WITH_QUOTE: SimpleStruct = SimpleStruct {
    id: "te'st",
    value: 2,
};

#[tokio::test]
async fn independently() {
    let index = index("test_1");
    add_documents(&index, &[NO_QUOTE]).await;
    assert_eq!(1, get_document_count(&index).await); // passed

    add_documents(&index, &[WITH_QUOTE]).await;
    assert_eq!(2, get_document_count(&index).await); // failed
}

#[tokio::test]
async fn together() {
    let index = index("test_2");
    add_documents(&index, &[NO_QUOTE, WITH_QUOTE]).await;
    assert!(get_document_count(&index).await > 0); // failed
}

fn index(namespace: &str) -> Index {
    Client::new(
        "http://127.0.0.1:7700",
        Some("xxx"),
    )
    .index(namespace)
}

async fn add_documents(index: &Index, simple_struct: &[SimpleStruct<'_>]) {
    index
        .add_documents(simple_struct, None)
        .await
        .unwrap()
        .wait_for_completion(&index.client, None, None)
        .await
        .unwrap();
}

async fn get_document_count(index: &Index) -> usize {
    index.get_stats().await.unwrap().number_of_documents
}

# Cargo.toml
[package]
name = "meilisearch_pk"
version = "0.1.0"
edition = "2021"

[dependencies]
tokio = { version = "1", features = ["full"] }
serde = { version = "1.0", features = ["derive"] }
meilisearch-sdk = "0.24"

Cargo test

[~/tests/meilisearch_pk] $ cargo test
   Compiling meilisearch_pk v0.1.0 (/home/gibbz/tests/meilisearch_pk)
    Finished test [unoptimized + debuginfo] target(s) in 2.03s
     Running unittests src/lib.rs (target/debug/deps/meilisearch_pk-5f2bfef2460e87ff)

running 2 tests
test together ... FAILED
test independently ... FAILED

failures:

---- together stdout ----
thread 'together' panicked at 'assertion failed: get_document_count(&index).await > 0', src/lib.rs:35:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- independently stdout ----
thread 'independently' panicked at 'assertion failed: `(left == right)`
  left: `2`,
 right: `1`', src/lib.rs:28:5


failures:
    independently
    together

test result: FAILED. 0 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.12s

Log

[2023-08-17T14:56:41Z INFO  actix_web::middleware::logger] 127.0.0.1 "POST /indexes/test_2/documents HTTP/1.1" 202 136 "-" "Meilisearch Rust (v0.24.1)" 0.000683
[2023-08-17T14:56:41Z INFO  actix_web::middleware::logger] 127.0.0.1 "POST /indexes/test_1/documents HTTP/1.1" 202 136 "-" "Meilisearch Rust (v0.24.1)" 0.000921
[2023-08-17T14:56:41Z INFO  milli::update::index_documents::enrich] Primary key was not specified in index. Inferred to 'id'
[2023-08-17T14:56:41Z INFO  index_scheduler] A batch of tasks was successfully completed.
[2023-08-17T14:56:41Z INFO  actix_web::middleware::logger] 127.0.0.1 "GET /tasks/1112 HTTP/1.1" 200 302 "-" "Meilisearch Rust (v0.24.1)" 0.000268
[2023-08-17T14:56:41Z INFO  actix_web::middleware::logger] 127.0.0.1 "GET /tasks/1113 HTTP/1.1" 200 302 "-" "Meilisearch Rust (v0.24.1)" 0.000114
[2023-08-17T14:56:41Z INFO  index_scheduler::batch] document addition done: DocumentAdditionResult { indexed_documents: 1, number_of_documents: 1 }
[2023-08-17T14:56:41Z INFO  index_scheduler] A batch of tasks was successfully completed.
[2023-08-17T14:56:42Z INFO  actix_web::middleware::logger] 127.0.0.1 "GET /tasks/1112 HTTP/1.1" 200 652 "-" "Meilisearch Rust (v0.24.1)" 0.000232
[2023-08-17T14:56:42Z INFO  actix_web::middleware::logger] 127.0.0.1 "GET /tasks/1113 HTTP/1.1" 200 338 "-" "Meilisearch Rust (v0.24.1)" 0.000134
[2023-08-17T14:56:42Z INFO  actix_web::middleware::logger] 127.0.0.1 "GET /tasks/1112 HTTP/1.1" 200 652 "-" "Meilisearch Rust (v0.24.1)" 0.000207
[2023-08-17T14:56:42Z INFO  actix_web::middleware::logger] 127.0.0.1 "GET /tasks/1113 HTTP/1.1" 200 338 "-" "Meilisearch Rust (v0.24.1)" 0.000434
[2023-08-17T14:56:42Z INFO  actix_web::middleware::logger] 127.0.0.1 "GET /indexes/test_2/stats HTTP/1.1" 200 65 "-" "Meilisearch Rust (v0.24.1)" 0.000091
[2023-08-17T14:56:42Z INFO  actix_web::middleware::logger] 127.0.0.1 "GET /indexes/test_1/stats HTTP/1.1" 200 81 "-" "Meilisearch Rust (v0.24.1)" 0.000280
[2023-08-17T14:56:42Z INFO  actix_web::middleware::logger] 127.0.0.1 "POST /indexes/test_1/documents HTTP/1.1" 202 136 "-" "Meilisearch Rust (v0.24.1)" 0.002155
[2023-08-17T14:56:42Z INFO  index_scheduler] A batch of tasks was successfully completed.
[2023-08-17T14:56:42Z INFO  actix_web::middleware::logger] 127.0.0.1 "GET /tasks/1114 HTTP/1.1" 200 650 "-" "Meilisearch Rust (v0.24.1)" 0.000207
[2023-08-17T14:56:42Z INFO  actix_web::middleware::logger] 127.0.0.1 "GET /tasks/1114 HTTP/1.1" 200 650 "-" "Meilisearch Rust (v0.24.1)" 0.000151
[2023-08-17T14:56:42Z INFO  actix_web::middleware::logger] 127.0.0.1 "GET /indexes/test_1/stats HTTP/1.1" 200 81 "-" "Meilisearch Rust (v0.24.1)" 0.000094

Notice how `A batch of tasks was successfully completed.` is not preceded by `document addition done` when the batch contained a document with a single quote in the primary key.

Expected behavior

The call to `index.add_documents` should return an `invalid_document_id` error, preferably one explaining that the single quote makes the document id invalid.
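
For context, Meilisearch only accepts document ids made of alphanumeric ASCII characters, hyphens (`-`) and underscores (`_`), which is why `te'st` is rejected. A minimal client-side guard could look like the sketch below; `is_valid_document_id` is a hypothetical helper, not part of the SDK:

// Hypothetical helper: checks a candidate primary key value against
// Meilisearch's documented id rules (ASCII alphanumerics, `-` and `_`).
fn is_valid_document_id(id: &str) -> bool {
    !id.is_empty()
        && id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

#[test]
fn quote_makes_id_invalid() {
    assert!(is_valid_document_id("test")); // id used by NO_QUOTE
    assert!(!is_valid_document_id("te'st")); // id used by WITH_QUOTE
}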

Hey @gibbz00 👋

I moved your issue to the appropriate GitHub repository as it is related to the meilisearch-rust SDK. Meilisearch does perform both synchronous and asynchronous operations, and the SDK should indeed surface primary-key errors when they are returned synchronously. However, I think the primary key issues are reported asynchronously, so you should probably verify that the task has been processed in the task queue first.
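
For what it's worth, a possible workaround until the SDK surfaces this is to inspect the `Task` returned by `wait_for_completion` rather than only unwrapping the `Result`, which covers transport errors only. A rough sketch, reusing `Index` and `SimpleStruct` from the reproduction above and assuming the `Task::is_failure` helper is available in this SDK version:

async fn add_documents_checked(index: &Index, simple_struct: &[SimpleStruct<'_>]) {
    let task = index
        .add_documents(simple_struct, None)
        .await
        .unwrap()
        .wait_for_completion(&index.client, None, None)
        .await
        .unwrap();
    // The unwraps above only cover transport-level errors; the document
    // addition task itself may still have failed asynchronously
    // (e.g. with an `invalid_document_id` error).
    assert!(!task.is_failure(), "document addition task failed: {task:?}");
}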

Hii :)

I moved your issue to the appropriate GitHub repository as it is related to the meilisearch-rust SDK.

Awesome, thanks for picking this up.

However, I think the primary key issues are reported asynchronously, so you should probably verify that the task has been processed in the task queue first.

I did; I'm using `wait_for_completion` in the code example above, and the logs confirm this with `A batch of tasks was successfully completed.`

Hello @gibbz00, thanks for the report.
I tagged it as a bug. PRs are welcome to fix it 😊