ranking-agent/strider

Slow message merging

Some queries end up with really slow message merging. Here is an example:

{
    "message": {
        "query_graph": {
            "edges": {
                "e00": {
                    "object": "n01",
                    "subject": "n00"
                }
            },
            "nodes": {
                "n00": {
                    "categories": [
                        "biolink:Disease"
                    ],
                    "ids": [
                        "MONDO:0008692"
                    ]
                },
                "n01": {
                    "categories": [
                        "biolink:BiologicalEntity"
                    ]
                }
            }
        }
    },
    "log_level": "INFO"
}

This issue is now being handled by smarter merging in reasoner-pydantic models. See this PR.

Current results indicate that this works very well. When merging many copies of identical messages, in-place merging is roughly 19x to 139x faster than dict-based merging in the benchmarks below, and the speedup holds when there are many results and many attributes.

| Benchmark Name | Dict-based Merging Time (s) | In-place Merging Time (s) | Total Input Size (MB) | Total Output Size (MB) |
| --- | --- | --- | --- | --- |
| 1k Results - Small | 5.96 | 0.31 | 34.30 | 0.03 |
| 100 Results - Large | 14.01 | 0.62 | 49.27 | 0.49 |
| 1k Results - Small - Many Attributes | 278.55 | 2.00 | 1696.78 | 1.70 |
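
To illustrate why in-place merging of identical messages can be cheap, here is a minimal sketch of hash-based deduplication. It is not the reasoner-pydantic implementation: the `MergedMessage` class, the `freeze` helper, the simplified message layout (`nodes` and `results` keys), and the `EXAMPLE:1` identifier are all hypothetical and chosen only for illustration.

```python
def freeze(value):
    """Recursively convert JSON-like data into a hashable form."""
    if isinstance(value, dict):
        return frozenset((k, freeze(v)) for k, v in value.items())
    if isinstance(value, list):
        return tuple(freeze(v) for v in value)
    return value


class MergedMessage:
    """Hypothetical accumulator that merges messages in place."""

    def __init__(self):
        self.node_attributes = {}   # node id -> list of unique attributes
        self._attribute_keys = {}   # node id -> set of frozen attributes
        self.results = []           # unique results, in first-seen order
        self._result_keys = set()   # frozen form of every stored result

    def update(self, message):
        # Merge node attributes; set membership keeps each check O(1) on average.
        for node_id, node in message.get("nodes", {}).items():
            attrs = self.node_attributes.setdefault(node_id, [])
            keys = self._attribute_keys.setdefault(node_id, set())
            for attribute in node.get("attributes", []):
                key = freeze(attribute)
                if key not in keys:
                    keys.add(key)
                    attrs.append(attribute)
        # Merge results the same way, skipping any result already seen.
        for result in message.get("results", []):
            key = freeze(result)
            if key not in self._result_keys:
                self._result_keys.add(key)
                self.results.append(result)


# Simplified, made-up message; real TRAPI messages carry more structure.
message = {
    "nodes": {"MONDO:0008692": {"attributes": [{"name": "a", "value": 1}]}},
    "results": [{"n00": "MONDO:0008692", "n01": "EXAMPLE:1"}],
}

merged = MergedMessage()
for _ in range(1000):
    merged.update(message)  # identical copies collapse into one entry

print(len(merged.results))
```

Because every duplicate is detected with a set lookup, merging many identical copies yields an output far smaller than the combined input, which is consistent with the input and output sizes reported in the table.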