lambdaclass/cairo-vm

Why does Program do not implement `serde`?

tdelabro opened this issue · 18 comments

In vm/src/types/program.rs:

    pub fn serialize(&self) -> Result<Vec<u8>, ProgramError> {
        let program_serializer: ProgramSerializer = ProgramSerializer::from(self);
        let bytes: Vec<u8> = serde_json::to_vec(&program_serializer)?;
        Ok(bytes)
    }

    pub fn deserialize(
        program_serializer_bytes: &[u8],
        entrypoint: Option<&str>,
    ) -> Result<Program, ProgramError> {
        let program_serializer: ProgramSerializer =
            serde_json::from_slice(program_serializer_bytes)?;
        let program_json = ProgramJson::from(program_serializer);
        let program = parse_program_json(program_json, entrypoint)?;
        Ok(program)
    }

Program does not implement the traits Serialize/Deserialize, but instead those two methods who do the same thing, only in a way that is not compatible with the serde crate.
Why? I break upstream usage of the traits. If I have a struct that contains a Program (like blockifier::ContractClassV1Inner ), I cannot derive the correct trait onto it.

Why not just implement the trait?

Hi @tdelabro !
We've utilized the ProgramSerializer struct for serialization and deserialization of the Program struct, allowing for flexibility in future modifications. We're open to reconsidering this approach. Are you looking for the Program to specifically implement that trait, or is this just a preliminary inquiry?

Hi @pefontana,

I saw the ProgramSerializer and it's name let me understand that it may be related to my needs, but there is no doc on what it does and how to use it.

So, what we ended up doing is the following:
https://github.com/bidzyyys/blockifier/blob/feature/scale-codec/crates/blockifier/src/execution/contract_class.rs#L654-L681
and
https://github.com/bidzyyys/blockifier/blob/feature/scale-codec/crates/blockifier/src/execution/contract_class.rs#L145-L150

Adding our own custom de/serialization logic where we need it in the logic that uses your type, so that we can still implement Serialize and Deserialize on our type.

Are you looking for the Program to specifically implement that trait, or is this just a preliminary inquiry?

Yes, the Program type should implement Serialize and Deserialize so that lib users don't have to guess or hack their way through it.

@pefontana any news on this issue?

This is becoming a problem in other repositories: starkware-libs/blockifier#1937

I saw the ProgramSerializer and it's name let me understand that it may be related to my needs, but there is no doc on what it does and how to use it.

So the real issue is that this is not properly documented, am I correct?

So the real issue is that this is not properly documented, am I correct?

This is one thing.

The other is that it prevents anyone downstream from deriving Serde on their types that contain Program, which is a huge pain.
I see no reason why the trait would not be implemented, but maybe I'm missing something

So the real issue is that this is not properly documented, am I correct?

This is one thing.

The other is that it prevents anyone downstream from deriving Serde on their types that contain Program, which is a huge pain. I see no reason why the trait would not be implemented, but maybe I'm missing something

I believe it can be implemented, what we can not do is derive it and call it a day, as it implicitly converts private interfaces into public ones (specifically, derive(Serialize, Deserialize) forces JSON layout and in-memory layout to match. I'll try to look a bit into it.

We want serde implemented on Program so types that contains it can derive it on themselves without having to ask any question

The main reason for not implementing serde on Program is that we don't have a single deserialization implementation for Program itself.
When we deserialize a compiled cairo 0 program, using either Program::from_bytes or Program::from_file we convert the compiled program data to the format used by Program, losing the original information in the process.
When we then serialize Program, via serialize, we use a ProgramSerializer struct which is not equivalent to the original compiled cairo program.
We can also deserialize this ProgramSerializer via deserialize, which will net us the same Program struct we serialized via serialize.
The problem with this is that if we were to implement Deserialize relying in the compiled cairo program deserialization (aka Program::from_bytes), Serialize & Deserialize wouldn't be compatible, as deserializing a serialized program would fail.
If we were to choose the other route, and implement Deserialize relying on the ProgramSerializer deserialization, then Serialize & Deserialize would be compatible, but it wouldn't be useful in your case, as the deserialization would not work for cairo programs generated by the cairo 0 compiler

Thanks, @fmoletta for expressing clearly the issue here.
I think there is a straightforward solution here: creating different Struc.
You have your executable program, which impl Serde.
You have your DeprecatedProgram (or Cairo0Program, or whatever name), which impl Serde and into_executable_program.

Now if you have a Program, you can de/ser it freely and if you are dealing with a cairo0 file, you deserialize it into your intermediary struct that you then use to build an executable Program.

Note that you still won't be able to go from a Program to a Cairo0Program. Only the other way around.

I have understood the situation correctly and is this a viable answer to the problem?

Thanks, @fmoletta for expressing clearly the issue here. I think there is a straightforward solution here: creating different Struc. You have your executable program, which impl Serde. You have your DeprecatedProgram (or Cairo0Program, or whatever name), which impl Serde and into_executable_program.

Now if you have a Program, you can de/ser it freely and if you are dealing with a cairo0 file, you deserialize it into your intermediary struct that you then use to build an executable Program.

Note that you still won't be able to go from a Program to a Cairo0Program. Only the other way around.

I have understood the situation correctly and is this a viable answer to the problem?

If I understand correctly this would involve creating a DeprecatedProgram struct that has the exact same format as the compiled json file so that we can deserialize and then serialize it without losing information and then converting that to a Program. This would make our deserialization step much slower as we would need to first parse the file and then parse it again to deserialize fields such as the references in the reference manager which come in string format.

I think the idea allows for us (cairo-vm-cli) to just use a custom deserialization that goes directly to Program. The idea would be to additionally have this Cairo0Program that can be converted to Program but also embedded into other projects' structures and serialized and deserialized without information loss. Is this correct @tdelabro?

Another weird thing:
The ProgramJson contains the following fields:

#[derive(Serialize, Deserialize)]
pub struct Reference {
    ...
    #[serde(deserialize_with = "deserialize_value_address")]
    #[serde(rename(deserialize = "value"))]
    pub value_address: ValueAddress,
}

Where the struct impl both Serialize and Deserialize, but won't deserialized in the same way as it serialized. The struct
Making it unusable.

In the same idea both serde::deserialize::ProgramJson and serde::serialize::ProgramSerializer implement both ser and de.
It makes absolutely no sense.

My input would be to:

  • drop the whole custom deserializer thing you have
  • define a struct for each outside world struct you want to be able to load (DeprecatedProgram, Program) (or v0, v1) (or legacy, idk)
  • implement a two-way serde for those (you should be able to call serialize then deserialize without losing data)
  • impl From<ThoseClassesYouJustCreated> for ExecutableProgram (ExecutableProgram being what is currently named Program)

Then if someone still needs it for some reason, we can easily derive Serde onto the ExecutableProgram while making it clear that this is not the format outputted by the compiler.

I think structs containing a 1 to 1 version of the compiled cairo programs should be standarized (like contract classes in starknet-api) so users don't have to depend on this crate for it

@orizi ⬆️

@orizi ⬆️

we would also like to be able to stop using forks of blockifier, they're very hard to maintain over time

@pefontana @Oppen @fmoletta Did you decide on a design? What is the ETA?