Galbar/JsonPath-PHP

Filtering with multiple conditions takes a long time to execute.

Closed this issue · 6 comments

Hello,

I am facing performance issue when using your library in my symfony project.
I have a JSON file containing a multidimensional structure and manage to use your library to query this.

I use a query with multiple conditions.

However the execution time is quicker when using my MySQL db compared to using the JSON file.
I have joined the JSON file containing the data and its structure, and a PHP test command file so that you can have a better understanding of what is executed.

Is there something wrong in my way of using it ?

Many thanks,
Daniel

json.txt

TestPerfCommand.txt

Hi @DVelbeck !

IMHO, MySQL will always be faster for complex queries - especially with range queries like yours - simply because it is a database server that is optimised to work with indexes of tables, can compute quite complex execution plans for your queries and its compiled code rather than a scripting language like PHP. On top of that, being a "smart" server (stateful rather than stateless like a PHP script), it will cache query results to optimise subsequent queries.

I would suggest using a SQL server in general if you have to compute complex queries, and use indexes on your tables to speed up execution of your queries. Use JSON and JSONPath for more "static", precomputed data where you don't have to run complex queries (especially range queries with >,>=, <, <=).

Regards,

Claudiu

Hi Claudiu,

Many thanks for answering.

I do use MySQL for complex queries. However, I am using a data storage as JSON files in order to keep historical data and changes.

That was why I have been willing to use JsonPATH, especially the fact that the data is pre-formated, and ready to use, whereas MySQL you need to make joins.

Hi Daniel!

You know better your infrastructure and requirements, but historical data can be stored in SQL as well, maybe in different tables or a different database altogether.
Just keep in mind that the bigger your JSON, the less optimal the JSONPath queries will be due to lack of indexes and complex query plans. Your whole JSON will be stored in memory, as opposed to disk, there will be no smart b-tree indexes, and the execution time will increase at least linearly if not worse.

Regards,

Claudiu

Hi Daniel,

This library is not optimized for heavy and complex queries. It's main use case is for simple querying over small pre-computed data.

Cheers,
Alessio

Also, I suggest changing the query to:

$result = $jsonStore->get('$.data[?('
                    . '@.organisation.id >= 3 '
                    . 'and @.organisation.id <= 5 '
                    . 'and @.tier >= 1 and @.tier <= 3  '
                    . 'and  @.attributes[8].value.value == 34  or  @.attributes[8].value.value == 68  '
                    . 'and  @.tier == 2 '
                    . 'and  @.attributes[43].value.value == 1  '
                    . 'or  @.attributes[27].value.value == 1  '
                    . 'or  @.attributes[43].value.value == 2  '
                    . 'or  @.attributes[27].value.value == 2  '
                    . 'or  @.attributes[43].value.value == 3  '
                    . 'or  @.attributes[27].value.value == 3  '
                    . 'or  @.attributes[43].value.value == 4  '
                    . 'or  @.attributes[27].value.value == 4  '
                    . 'or  @.attributes[43].value.value == 5  '
                    . 'or  @.attributes[27].value.value == 5  '
                    . 'or  @.attributes[43].value.value == 6  '
                    . 'or  @.attributes[27].value.value == 6)]');

The change is at the beginning, I'm changing $..* to $.data. It takes about 1 second to run on my laptop with this change.

Think that when you query with $..* it will apply the query to all levels inside the json fiel

Hope that helps :)

Hi Claudiu, Alessio, Galbar,

This is awesome! By changing to the right node as Galbar advised, this made the trick !
I can close the issue.
Thanks a lot !