Performance issues related to complete_value on large datasets
JCatrielLopez opened this issue · 3 comments
JCatrielLopez commented
Hi! We've noticed that returning a list of 5k elements, each with a couple of nested objects, is pretty slow:
Person {
  id
  name
  lastname
  age
  address {street number}
  job {id org_name}
  partner {id name}
  pets {name type}
  school {id name}
}
ncalls | tottime | percall | cumtime | percall | filename:lineno(function) |
---|---|---|---|---|---|
1 | 1.8e-05 | 1.8e-05 | 2.145 | 2.145 | graphql.py:103(graphql_sync) |
1 | 1.5e-05 | 1.5e-05 | 2.145 | 2.145 | graphql.py:152(graphql_impl) |
1/30001 | 0.19 | 6.335e-06 | 2.137 | 7.122e-05 | execute.py:413(ExecutionContext.execute_fields) |
1 | 1.3e-05 | 1.3e-05 | 2.137 | 2.137 | execute.py:965(execute) |
1 | 7e-06 | 7e-06 | 2.137 | 2.137 | execute.py:328(ExecutionContext.execute_operation) |
1/135001 | 0.3229 | 2.392e-06 | 2.135 | 1.581e-05 | execute.py:485(ExecutionContext.execute_field) |
1/145001 | 0.2737 | 1.888e-06 | 2.071 | 1.428e-05 | execute.py:575(ExecutionContext.complete_value) |
1 | 0.009884 | 0.009884 | 2.071 | 2.071 | execute.py:660(ExecutionContext.complete_list_value) |
5000/30000 | 0.02747 | 9.156e-07 | 2.026 | 6.752e-05 | execute.py:893(ExecutionContext.complete_object_value) |
By itself it's not really a slow function, but it's executed 30k times. Is there any way to reduce the overhead by reducing the number of times this function is invoked?
Tested on Python 3.8 and graphql-core==3.2.3
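For what it's worth, the 30k figure lines up with one call per returned object. Here is a rough sketch of the arithmetic, assuming the executor completes each selected object exactly once (the helper name `expected_object_calls` is made up for illustration and is not part of graphql-core):

```python
def expected_object_calls(list_size: int, nested_objects_per_item: int) -> int:
    """Objects completed for a list whose items each select some nested objects.

    Assumption: one completion call per object, i.e. the item itself plus
    each nested object selected on it.
    """
    return list_size * (1 + nested_objects_per_item)


persons = 5_000
nested = 5  # address, job, partner, pets, school

calls = expected_object_calls(persons, nested)
print(calls)      # 30000, the total for complete_object_value in the profile
print(calls + 1)  # 30001, execute_fields also runs once for the root query
```

So under this assumption the call count grows with list size times the number of object-typed fields selected per item.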
JCatrielLopez commented
Possibly related to this graphql-js issue
Cito commented
Thanks for reporting. Will look into this when I have more time, probably only after releasing 3.3. It would be helpful if you could post example code with dummy data to reproduce this.
JCatrielLopez commented
schema.graphql:
type Query {
  persons: [Person]
}

type Person {
  id: String!
  name: String
  ssn: String
  alive: Boolean
  has_job: Boolean
  job: JobDetails
  address: Address
  pets: Address
  house: House
  partner: Person
}

type JobDetails {
  id: String
  name: String
}

type Address {
  id: String
  name: String
}

type Pets {
  id: String
  name: String
  race: String
  color: String
}

type House {
  color: String
  floors: Int
  is_duplex: Boolean
  is_apt: Boolean
}
server.py:
import random
import string
import sys

import yappi
from graphql import graphql_sync, build_ast_schema
from graphql.language.parser import parse

yappi.set_clock_type("wall")

with open("./schema.graphql", "r") as f:
    schema = build_ast_schema(parse(f.read()))


class Query:
    """The root resolvers"""

    def persons(self, info):
        output = []
        for _ in range(5_000):
            output.append(
                dict(
                    id="".join(random.choices(string.ascii_lowercase + string.digits, k=9)),
                    name="John Doe",
                    ssn="00000000000000000",
                    alive=True,
                    has_job=False,
                    job=dict(id="xxx", name="test"),
                    address=dict(id="yyy", name="Fake Street"),
                    pets=dict(id="zzz", name="test"),
                    house=dict(
                        color="RED",
                        floors=2,
                        is_duplex=False
                    ),
                    partner=dict(id="".join(random.choices(string.ascii_lowercase + string.digits, k=9)), name="test"),
                )
            )
        return output


def main():
    query = """{
        persons{
            id
            name
            alive
            has_job
            job{id name}
            partner{id name}
            address{id name}
            pets{id name}
            house{color floors is_duplex}
        }
    }"""
    yappi.start()
    result = graphql_sync(schema, query, Query())
    yappi.stop()
    if result.errors:
        print(result)
        sys.exit(1)
    yappi.get_func_stats().save("profile", type="pstat")
    # To visualize profile:
    # python -m snakeviz profile --server


if __name__ == '__main__':
    main()
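If snakeviz isn't available, the saved file should also be readable with the standard library's `pstats` module, since it is written in pstat format. A minimal sketch, assuming the output file is named `profile` as in server.py above (the helper `top_functions` is hypothetical, just for this example):

```python
import io
import os
import pstats


def top_functions(path: str, limit: int = 10) -> str:
    """Return the `limit` most expensive entries, sorted by cumulative time."""
    buffer = io.StringIO()
    stats = pstats.Stats(path, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


# Only print when the profile written by server.py is present.
if os.path.isfile("profile"):
    print(top_functions("profile"))
```

Sorting by cumulative time gives roughly the same ordering as the table in the first comment.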