ClickHouse/clickhouse-rs

The fetch API is unsound if used with borrowed types

Opened this issue · 10 comments

loyd commented

The current API allows the following code:

let mut cursor = client.query("...").fetch::<MyRow<'_>>()?;
let a = cursor.next().await?;
let b = cursor.next().await?; // <- must be error

We should use something like sqlx::fetch or wait for GAT stabilization.

Sadly, it will be a breaking change.

Is the issue that you are reusing a temporary and returning a reference to that temporary?

Is there a reason you couldn't change from

pub async fn next<'a, 'b: 'a>(&'a mut self) -> Result<Option<T>> where
    T: Deserialize<'b>,

to

pub async fn next<'a>(&'a mut self) -> Result<Option<T>> where
    T: Deserialize<'a>,

Yes, this would be a breaking change, but doesn't seem like it would be productively complicated.

loyd commented

Sadly, it's not enough:

  --> examples/usage.rs:69:27
   |
69 |     while let Some(row) = cursor.next().await? {
   |                           ^^^^^^^^^^^^^
   |                           |
   |                           `cursor` was mutably borrowed here in the previous iteration of the loop
   |                           first borrow used here, in later iteration of loop

@serprex, can you please have a look?

looking over, see issue ends up rooted in workaround_51132, will see if I can find a breaking change that's acceptable

Continued thinking on this. Potential ideas: store bytes in some bag datastructure that gets dropped with cursor. Problem: cursor memory usage grows with number of rows. Alternative: cursor returns object which owns bytes & has method to get row object. Problem: fetch methods don't want to pass around long lived metaobject per row

All this bytes management, when in read_slice we're doing a copy anyways, seems best to use DeserializeOwned & not support borrowed struct fields for now

loyd commented

I'm sure we should support both patterns. I have a borrowed usage (with serde_json::RawValue for some message dumps) where this pattern is a game changer.

One possible solution is to use GATs (v1.65+ so it's ok for us): https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=a0f3efd735703389898ac9319c7d834a

However, this makes it non-trivial to write generic code:

pub async fn generic_borrowed_usage<T: Row>(client: Client, f: impl Fn(T)) {
    let mut cursor = client.fetch::<T>();
    
    while let Some(row) = cursor.next().await {
        f(row);
    }
}

It fails to compile with

error[E0308]: mismatched types
  --> src/main.rs:59:11
   |
55 | pub async fn generic_borrowed_usage<T: Row>(client: Client, f: impl Fn(T)) {
   |                                     - expected this type parameter
...
59 |         f(row);
   |         - ^^^ expected type parameter `T`, found associated type
   |         |
   |         arguments to this function are incorrect
   |
   = note: expected type parameter `T`
             found associated type `<T as Row>::Value<'_>`

That can be fixed with the following signature:

pub async fn generic_borrowed_usage<T: Row>(client: Client, f: impl Fn(T::Value<'_>)) { .. }

But it looks dirty =(

Looks like we're actually dealing with PhantomData issue, not with workaround_51132. The signature async fn next<'a, 'b: 'a>(&'a mut self) -> Result<Option<T>> where T: Deserialize<'b> { .. } is not correct in the first place, since 'b (the data in buffer) definitely must not outlive 'a (the buffer).

From the compiler's point of view, RowCursor stores object of type MyRow<'1>, and when we call async fn next<'a>(&'a mut self) -> Result<Option<T>> where T: Deserialize<'a> { .. } we're actually saying "hey, '1: 'a", thus, we borrow cursor for its entire lifetime.

GAT workaround here works by not linking lifetime inside phantom data (the type inside would probably get something like 'static lifetime) and the &mut self lifetime, we either should get rid of PhantomData. As an example, this works perfectly fine - https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c2d898a807c1d8f2210d31b416270234

There could be at least two hacky fixes:

  1. Redesign API so borrowing RowCursor as mutable is not required (we'll borrow it immutably on every next call, the problem is still here, but since we can have as many immutable borrows as we want - no problem here), so next will probably consume some external type on which mutations are made
  2. GATs introduced earlier

I think there must be something better and non-hacky, since both solutions kinda suck

loyd commented

With introducing the additional helper RowOwned (thx @nerodono) (akin to DeserializeOwned), the generic code for owned usage can be simplified too:

pub trait RowOwned: 'static + DeserializeOwned + for<'a> Row<Value<'a> = Self> {}
impl<R> RowOwned for R where R: 'static + DeserializeOwned + for<'a> Row<Value<'a> = R> {}
// rustc doesn't imply DeserializeOwned from for<'a> Row<value<'a> = R> somehow, so we add it directly

pub async fn generic_owned_usage<T: RowOwned>(client: Client, f: impl Fn(T)) {
    let mut cursor = client.fetch::<T>();
    
    while let Some(row) = cursor.next().await {
        f(row);
    }
    
    let one = cursor.next().await.unwrap();
    let two = cursor.next().await.unwrap();
    f(one);
    f(two);
}

The whole GAT-based solution:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c21c7533d08e14826fda1e9047f3102c

So, borrowed generic usage is still "dirty" but it's rare and can be easily documented.

nvm, the first one will not work, since we'll borrow as mutable the external type, so no difference here