scylladb/scylla-rust-driver

Is it possible to use the new `SerializeRow` without using a generic type?

Jasperav opened this issue · 9 comments

I am trying to convert Catalytic to the new API's but I am stuck while transforming these structs: https://github.com/Jasperav/Catalytic/blob/6b74c52a4926d5aee7034a2b545befcbb61b08b2/catalytic/src/query_transform.rs#L202:

pub struct Qv<R: AsRef<str> = &'static str, V: ValueList = SerializedValues> {
    pub query: R,
    pub values: V,
}

Example impl here https://github.com/Jasperav/Catalytic/blob/6b74c52a4926d5aee7034a2b545befcbb61b08b2/catalytic_table_to_struct/example/src/generated/person.rs#L170:

pub fn insert_qv(&self) -> Result<Insert, SerializeValuesError> {
    let mut serialized = SerializedValues::new();
    serialized.add_value(&self.name)?;
    serialized.add_value(&self.age)?;
    serialized.add_value(&self.email)?;
    serialized.add_value(&self.row_type)?;
    Ok(Insert::new(Qv {
        query: INSERT_QUERY,
        values: serialized,
    }))
}

So previously I could use a non-generic type as placeholder for the Qv struct in which users could safely execute predefined queries with values. Now, I don't see an impl for SerializeRow with a concrete-non-generic struct (expect for the Legacy one) which I can use.

Is there any 'unsafe' not-generic struct which impls SerializeRow which I can use as a typeholder for Qv? It was really easy passing around the Qv struct in my applications and just execute whenever I want without using any generic types. Looks like the only way now is working with the Legacy struct but maybe there is a better way. I don't feel like it's a good idea to make separate structs for every single query so you get 1 Qv for each method.

Is there any 'unsafe' not-generic struct which impls SerializeRow which I can use as a typeholder for Qv?

I suppose the closest thing would be Box<dyn SerializeRow>. This way you can achieve type erasure and also have (CQL) type safety. Would that work for your case?

@piodul can multiple values serialize itself to a single SerializeRow? Then maybe it could work. But using a Box requires heap allocations… I don’t need type safety since Catalytic auto generates structs from the databas so it looks like a waste to me

@piodul can multiple values serialize itself to a single SerializeRow? Then maybe it could work.

Yes. SerializeRow is like the old ValueList. We implemented it for the same set of types as the old trait (e.g. tuples, Vec, HashMap, etc. - not counting the impls that users wrote in their own projects, of course).

But using a Box requires heap allocations…

Creating a SerializedValues also requires an allocation - it keeps the serialized bytes in a Vec. To be fair, however, if you passed LegacySerializedValues to the old driver, it would skip serializing the values for the second time and an allocation would be avoided - so there should be an additional allocation with Box vs. LegacySerializedValues.

I don’t need type safety since Catalytic auto generates structs from the databas so it looks like a waste to me

Type checking happens in SerializeCql::serialize - so, during serialization. Currently, it cannot be skipped. I have some ideas on how we can allow skipping them in the future, but right now it is mandatory.

In any case, you need to pass the type of the value during serialization, and this requirement won't go away in the future.

I suppose if you wanted to rewrite your code in the most straightforward way, you could do it like this:

pub fn insert_qv(&self) -> Result<Insert, SerializeValuesError> {
    let mut serialized = SerializedValues::new();
    serialized.add_value(&self.name, &ColumnType::Text)?; // <- note that you need to pass the type
    serialized.add_value(&self.age, &ColumnType::Int)?;
    serialized.add_value(&self.email, &ColumnType::Text)?;
    serialized.add_value(&self.row_type, &ColumnType::Text)?;
    Ok(Insert::new(Qv {
        query: INSERT_QUERY,
        values: serialized,
    }))
}

Then, the last problem to be solved would be to pass values to execute. I'd rather not implement SerializeRow on SerializedValues in order not to make it too easy to pass the incorrect types, but perhaps we could introduce a wrapper, e.g. pub struct WithBypassedTypeChecks<T>(pub T) to make it very explicit from the name.

I'm not sure, however, how we would go about passing SerializedValues efficiently without having to rewrite them at some point. Perhaps we could introduce the serialized method to the SerializeRow trait:

fn serialized(&self, ctx: &RowSerializationContext<'_>) -> SerializedResult<'_>;

... but that's adding a trait method just to make a single optimization possible. Maybe it's not too bad, but if other solutions are possible then I'd rather avoid complicating the traits' interface.

Adding an escape like you mention for WithBypassedTypeChecks would be useful. So to wrap your comment up: there is no (easy) way of executing directly SerializedValues? Should I wait for converting Catalytic to the new version?

there is no (easy) way of executing directly SerializedValues?

No, there isn't at the moment.

Should I wait for converting Catalytic to the new version?

Probably, yes. It should be possible to add the changes I described in a backwards-compatible way, so perhaps we will release 0.11.1.

Ok thanks for the quick reply, I will keep an eye out for the 0.11.1 release

@piodul can multiple values serialize itself to a single SerializeRow? Then maybe it could work. But using a Box requires heap allocations… I don’t need type safety since Catalytic auto generates structs from the databas so it looks like a waste to me

This is only one additional heap allocation - driver already does several of them for each request + actually sending and receiving data trough network. Worrying about this may potentially be a premature optimization.
It is possible that in the future we will be able to serialize directly into request buffer - which would reduce total allocation count.

Adding an escape like you mention for WithBypassedTypeChecks would be useful. So to wrap your comment up: there is no (easy) way of executing directly SerializedValues? Should I wait for converting Catalytic to the new version?

I'd be a bit reluctant to add such a wrapper to our driver:

  1. Type safety is important, even in ORM (ALTERs exist), I don't want users to disable it too easily.
  2. If someone wants to disable it, it shouldn't be a problem for a struct allowing disabling type checks during query execution to be implemented outside the driver (e.g. by you in Catalytic), as a wrapper around SerializedValues. Take a look at how our helper structs / traits / macros that aid conversion from 0.10 are implemented, it should be something similar.

Imho implementing such a struct in user code is a good enough bar for disabling type checks - if someone really wants it, it's not a problem, but it will force developer to think twice before doing it.

I'd recommend you to go with Box<dyn SerializeRow> (or &dyn SerializeRow if you are OK with adding lifetime to this struct). If you're still worried about this one additional allocation, then implementing this wrapper around SerializedValues may be the way to go.

@Lorak-mmk I am trying to convert the code to scylla 1.11.0 but I am still struggling at dynamically serialize rows and then execute it some time later. I need this for multiple places, e.g. dynamic queries. The insert_qv in my original question is an example, although I can add a temp struct to it but that will add a lot of structs for each type (more on that at the bottom). I also need some help for this:

pub fn test_query(query: impl AsRef<str>) -> QueryMetadata {
    let query = query.as_ref();
    let qmd = extract_query_meta_data(query);
    let mut values = SerializedValues::new();

    for parameterized_column_type in &qmd.parameterized_columns_types {
        add_random_value(&mut values, parameterized_column_type);
    }

    // Execute the query with test values
    if let Err(e) = block_on(GLOBAL_CONNECTION.query(query, values.clone())) { // <-- Problem here since SerializedValues doesn't impl SerializeRow

I have a query and I want to execute it with a random value based on the column type. How can I do this? Using a Box is ok, but how would I go from SerializedValues to SerializeRow?

It seems like the derive macro SerializeRow will add the serialization to the current struct which makes sense, but it doesn't give me a guideline how to implement it on my own without using separate structs everywhere (see how many times qv is used here: https://github.com/Jasperav/Catalytic/blob/master/catalytic_table_to_struct/example/src/generated/child.rs.

It seems I have to introduce a separate struct for serializing values. If I can get some help converting SerializedValues to SerializeRow, or know how I can construct a dyn SerializeRow based on query values without adding structs everywhere, I would be happy :). I can also add an 'unsafe' way in Catalytic if it isn't desirable in this crate to go from SerializedValues to SerializeRow. Of course type checking is good but for Catalytic it isn't really needed since it just generates everything right out of the database, so I can use some escape hatches to make things easier.

It now seems like the LegacySerializedValues does exactly what I want: non-generic concrete type which implements SerializeRow. But I don't want to use legacy stuff for obvious reasons.