fails to commit large batch inserts
gomesian opened this issue · 2 comments
Hi Team,
I was trying to understand the limits of batch inserts with this library.
It seems that somewhere between 6-8k nodes (3-4k edges), the commit fails with:
I don't mind breaking up the ~10k node/edge updates I need every hour into smaller batches, but I need help understanding what is breaking here, so I can configure a safe batch size that accounts for varying property sizes on insert.
I'm also wondering about the use of connection_pool, and whether I should try unix_socket_path since the script runs on the server itself. Maybe one of these would allow larger batch updates? I don't see documentation on either.
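For reference, this is the kind of thing I had in mind for the unix socket option; a minimal sketch (the socket path is an assumption and would have to match the unixsocket setting in redis.conf):

import redis
from redisgraph import Graph

# Connect over a local unix domain socket instead of TCP.
# The path below is an assumption; it must match the "unixsocket"
# directive in redis.conf.
r = redis.Redis(unix_socket_path='/var/run/redis/redis.sock', db=0, socket_timeout=3000)
redis_graph = Graph('large', r)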
Note: I tried the redisgraph-bulk-loader.py approach, and it doesn't suit me either, as I need to constantly update and prune an existing graph.
Any help / pointers appreciated!
quick and dirty example:
import redis
from redisgraph import Node, Edge, Graph

r = redis.Redis(host=HOST, port=PORT, db=0, socket_timeout=3000)
redis_graph = Graph('large', r)

for x in range(4000):
    # Two person nodes per iteration, connected by a 'visited' edge.
    src = Node(label='person', properties={'name': 'src-' + str(x), 'age': 33, 'gender': 'male', 'status': 'single'})
    dst = Node(label='person', properties={'name': 'dst-' + str(x), 'age': 33, 'gender': 'male', 'status': 'single'})
    redis_graph.add_node(src)
    redis_graph.add_node(dst)
    redis_graph.add_edge(Edge(src, 'visited', dst, properties={'purpose': 'pleasure'}))

# Single commit of all 8000 nodes / 4000 edges -- this is where it fails.
redis_graph.commit()
Hi @gomesian,
This error is caused by a buffer size limit in RedisGraph's parser utility. A workaround can be found here: RedisGraph/RedisGraph#1486 (comment). Alternatively, you can create entities in a series of smaller batched queries by periodically calling redis_graph.flush() in your create loop.
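For example, a minimal sketch of the batched approach (the batch size of 500 is an arbitrary assumption; tune it to your property sizes):

import redis
from redisgraph import Node, Edge, Graph

BATCH_SIZE = 500  # assumed value; tune to keep each query under the parser's buffer limit

r = redis.Redis(host=HOST, port=PORT, db=0, socket_timeout=3000)
redis_graph = Graph('large', r)

for x in range(4000):
    src = Node(label='person', properties={'name': 'src-' + str(x), 'age': 33})
    dst = Node(label='person', properties={'name': 'dst-' + str(x), 'age': 33})
    redis_graph.add_node(src)
    redis_graph.add_node(dst)
    redis_graph.add_edge(Edge(src, 'visited', dst, properties={'purpose': 'pleasure'}))

    if (x + 1) % BATCH_SIZE == 0:
        # flush() commits everything added so far and clears the
        # client-side node/edge buffers, so each CREATE query stays small.
        redis_graph.flush()

# Commit whatever is left over from the final partial batch.
if redis_graph.nodes or redis_graph.edges:
    redis_graph.commit()

Note that an edge's endpoint nodes must be added in the same batch as the edge, since the generated CREATE query refers to nodes by alias; nodes flushed in an earlier batch are no longer visible to later queries.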
Thanks, batching with flush() makes sense then.
