CertainLach/jrsonnet

Memory Leak

expelledboy opened this issue · 5 comments

I am trying to use jrsonnet as a lib in an HTTP wrapper to improve performance in a high-TPS system.

However, we have just deployed the service and it is experiencing a memory leak.

(screenshot: memory usage of the deployed service climbing over time)

This is the util module I created, and a test which demonstrates how I use the API.

extern crate log;

use log::info;

use jrsonnet_evaluator::apply_tla;
use jrsonnet_evaluator::trace::PathResolver;
use jrsonnet_evaluator::{function::TlaArg, gc::GcHashMap, FileImportResolver, IStr, State, Val};
use jrsonnet_parser::{ParserSettings, Source};
use jrsonnet_stdlib::ContextInitializer;

type Context = (State, Val);

pub fn init() -> Context {
    let state: State = State::default();

    state.set_import_resolver(FileImportResolver::default());

    let ctx = ContextInitializer::new(state.clone(), PathResolver::new_cwd_fallback());

    state.set_context_initializer(ctx);

    let decoder = std::path::Path::new("src/decode.jsonnet");

    let Ok(import) = state.import(decoder) else {
        panic!("failed to import decoder");
    };

    (state, import)
}

fn build_tla(version: String, variant: String, message: String) -> GcHashMap<IStr, TlaArg> {
    let message: IStr = message.into();
    let source: IStr = "".into();
    let message = jrsonnet_parser::parse(
        &message,
        &ParserSettings {
            source: Source::new_virtual("<top-level-arg:message>".to_string().into(), source),
        },
    )
    .unwrap();

    let mut tla_args: GcHashMap<IStr, TlaArg> = GcHashMap::new();
    tla_args.insert("version".into(), TlaArg::String(version.into()));
    tla_args.insert("variant".into(), TlaArg::String(variant.into()));
    tla_args.insert("message".into(), TlaArg::Code(message));

    tla_args
}

pub fn decode(
    context: Context,
    version: String,
    variant: String,
    message: String,
) -> Result<serde_json::Value, String> {
    let start = std::time::Instant::now();
    let (state, decode) = context;
    let tla_args = build_tla(version, variant, message);

    let Ok(result) = apply_tla(state, &tla_args, decode) else {
        panic!("failed to apply tla");
    };

    let Val::Obj(_) = result else {
        panic!("expected object, invalid decoder");
    };

    info!("decoded in {:?}", start.elapsed());

    match serde_json::to_value(&result) {
        Ok(result) => Ok(result),
        Err(e) => Err(e.to_string()),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_decode() {
        let version = "0.0.1".to_string();
        let variant = "iso_var_visa_base1".to_string();

        let message = serde_json::json!({
            "version": "v1",
            "request_message": {
                "mti": "100",
                "elements": {}
            }
        });

        let context = init();

        let result = decode(context, version, variant, message.to_string()).unwrap();

        let cmo = serde_json::json!({
            "message":{
                "mti":"0100",
                "raw":{"elements":{},"mti":"100"},
                "variant":"iso_var_visa_base1",
                "version":"0.0.1"
            }
        });

        assert_eq!(result.to_string(), cmo.to_string());
    }
}

This gets called in the following tokio API handler:

    let post_decode = warp::path!("decode" / String)
        .and(warp::post())
        .and(warp::body::json())
        .map(|variant: String, raw: serde_json::Value| {
            let cmo_version: String = version().to_string();

            // TODO: Cache the context for faster decoding?
            let context = jsonnet::init();

            match jsonnet::decode(context, cmo_version, variant, raw.to_string()) {
                Ok(result) => warp::reply::json(&result),
                Err(e) => {
                    error!("ERROR: {}", e);
                    warp::reply::json(&serde_json::json!({
                        "error": e,
                    }))
                }
            }
        });

This is my first crack at writing a system in Rust, so perhaps I am doing something wrong. Either there is a memory leak in the code above, or there is one in jrsonnet itself.

If I could clone the context and supply it to the decode function, I could perhaps circumvent the memory leak. However, I am getting the following error when trying to do so:

error[E0277]: `NonNull<jrsonnet_gcmodule::cc::RawCcBox<EvaluationStateInternals, jrsonnet_gcmodule::collect::ObjectSpace>>` cannot be shared between threads safely
   --> src/main.rs:83:33
    |
83  |     let commands = warp::post().and(post_decode);
    |                                 ^^^ `NonNull<jrsonnet_gcmodule::cc::RawCcBox<EvaluationStateInternals, jrsonnet_gcmodule::collect::ObjectSpace>>` cannot be shared between threads safely
    |
    = help: within `(jrsonnet_evaluator::State, jrsonnet_evaluator::Val)`, the trait `Sync` is not implemented for `NonNull<jrsonnet_gcmodule::cc::RawCcBox<EvaluationStateInternals, jrsonnet_gcmodule::collect::ObjectSpace>>`
    = help: the trait `warp::filter::FilterBase` is implemented for `warp::filter::map::Map<T, F>`
note: required because it appears within the type `RawCc<EvaluationStateInternals, ObjectSpace>`
   --> /Users/anthony/.cache/cargo/registry/src/index.crates.io-6f17d22bba15001f/jrsonnet-gcmodule-0.3.6/src/cc.rs:80:12
    |
80  | pub struct RawCc<T: ?Sized, O: AbstractObjectSpace>(NonNull<RawCcBox<T, O>>);
    |            ^^^^^
note: required because it appears within the type `State`
   --> /Users/anthony/.cache/cargo/registry/src/index.crates.io-6f17d22bba15001f/jrsonnet-evaluator-0.5.0-pre95/src/lib.rs:262:12
    |
262 | pub struct State(Cc<EvaluationStateInternals>);
    |            ^^^^^
    = note: required because it appears within the type `(State, Val)`
    = note: required for `&(jrsonnet_evaluator::State, jrsonnet_evaluator::Val)` to implement `Send`

Is there any obvious issue with my code?

Or is there any way I can clone the context?

Alternatively, if you could create a test release with #122, I could continue with my other method of calling jrsonnet as a binary.

We have identified that the leak comes from ContextInitializer::new.

The actual line that leaks in the code above is this:

  let ctx = ContextInitializer::new(state.clone(), PathResolver::new_cwd_fallback());

I think there must be some sort of circular reference or something like that in this ContextInitializer code.
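
A rough way to see this in isolation (sketch only, reusing the init() helper above; the loop count is arbitrary):

    #[test]
    fn leak_repro() {
        // Build and drop the evaluator repeatedly without collecting cycles;
        // resident memory grows with the iteration count.
        for _ in 0..100_000 {
            let (_state, _import) = init();
        }
        // Inserting jrsonnet_gcmodule::collect_thread_cycles() inside the loop
        // keeps memory flat instead.
    }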

> I think there must be some sort of circular reference or something like that in this ContextInitializer code.

Yes, there is; jrsonnet uses garbage collection for such links.
You can force collection of the current thread's reference cycles by calling jrsonnet_gcmodule::collect_thread_cycles().
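
For example, in your handler that could look roughly like this (untested sketch, keeping your names; the only change is the collection call after the evaluator state has been dropped):

    let post_decode = warp::path!("decode" / String)
        .and(warp::post())
        .and(warp::body::json())
        .map(|variant: String, raw: serde_json::Value| {
            let cmo_version: String = version().to_string();
            let context = jsonnet::init();

            let reply = match jsonnet::decode(context, cmo_version, variant, raw.to_string()) {
                Ok(result) => warp::reply::json(&result),
                Err(e) => {
                    error!("ERROR: {}", e);
                    warp::reply::json(&serde_json::json!({ "error": e }))
                }
            };

            // By this point the State built in jsonnet::init has been dropped, but the
            // reference cycles it created are still alive; this call reclaims them.
            jrsonnet_gcmodule::collect_thread_cycles();

            reply
        });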

Regarding State not being Sync: jrsonnet's internal structures aren't thread-safe (this isn't unique to jrsonnet; no Jsonnet implementation is thread-safe, not even go-jsonnet: google/go-jsonnet#733). You can use e.g. thread_local for this, but I would recommend running jrsonnet in a separate process, especially if you plan to limit the evaluated code's RAM/CPU usage.
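
A per-thread cache could look roughly like this (untested sketch, reusing your init(); it assumes Val is Clone like State, both being reference-counted handles):

    use std::cell::RefCell;
    use jrsonnet_evaluator::{State, Val};

    thread_local! {
        // One evaluator per worker thread; since it never crosses a thread
        // boundary, the missing Sync impl on State does not matter.
        static JSONNET_CTX: RefCell<Option<(State, Val)>> = RefCell::new(None);
    }

    // Hand out this thread's context, building it on first use.
    fn thread_context() -> (State, Val) {
        JSONNET_CTX.with(|slot| {
            slot.borrow_mut()
                .get_or_insert_with(jsonnet::init)
                .clone()
        })
    }

The handler would then call thread_context() instead of jsonnet::init() on every request.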

Thanks! The handler is run on a thread by tokio, so I just needed to call jrsonnet_gcmodule::collect_thread_cycles(); and now we are not seeing any memory leaks.

We really appreciate your rapid response! We are pushing to production any day now, and this was a major issue. 😅