CSV max size OverflowError on Windows
soof-golan opened this issue · 4 comments
I tried to import redisgraph_bulk_loader.bulk_insert, but it failed with the following error:
File ~\dev\redis-graph-poc\venv\lib\site-packages\redisgraph_bulk_loader\entity_file.py:11, in <module>
8 from enum import Enum
9 from exceptions import CSVError, SchemaError
---> 11 csv.field_size_limit(sys.maxsize) # Don't limit the size of user input fields.
14 class Type(Enum):
15 UNKNOWN = 0
OverflowError: Python int too large to convert to C long
System:
- x86_64 Windows 10
- Python 3.9
@soof-golan How many rows is your spreadsheet? Also - how many columns? A ballpark is fine!
I haven't even loaded a CSV; Python fails at import time because of the sys.maxsize call into the csv module.
The plan is to ingest approximately 200M nodes and 2B edges.
Same issue here. It throws this exception even without specifying any arguments, and also with arguments, even for a modest 12 MB file.
Environment:
- 64-bit operating system, x64-based processor
- Windows 10 Home
- Python 3.9.7
Conclusion:
It worked if I commented out csv.field_size_limit(sys.maxsize) or changed it to csv.field_size_limit(2147483647).
Regarding sys.maxsize: on a 32-bit platform, sys.maxsize = 2**31 - 1, which fits in a C long, so the call works (it also works on 64-bit Linux, where a C long is 64 bits). On 64-bit Windows, however, sys.maxsize = 2**63 - 1 while a C long is still only 32 bits, hence the OverflowError: Python int too large to convert to C long. Under the current implementation, the largest value csv.field_size_limit() accepts on Windows is 2**31 - 1.
Thank you
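The workaround described in the conclusion above can be made portable without hard-coding 2147483647: try sys.maxsize and halve it until the platform's csv module accepts the value. This is a sketch, not the library's actual fix; the helper name is hypothetical.

```python
import csv
import sys

def set_max_csv_field_size() -> int:
    """Raise the csv field size limit as high as the platform allows.

    csv.field_size_limit() stores its argument in a C long, which is
    only 32 bits on Windows, so sys.maxsize (2**63 - 1 on 64-bit
    builds) raises OverflowError there. Halve the value until one is
    accepted and return the limit that took effect.
    """
    limit = sys.maxsize
    while True:
        try:
            csv.field_size_limit(limit)
            return limit
        except OverflowError:
            limit //= 2

# sys.maxsize on 64-bit Linux; 2**31 - 1 on Windows
print(set_max_csv_field_size())
```

On Linux the first attempt succeeds, so behavior is unchanged; on Windows the loop settles on 2**31 - 1, matching the manual workaround above.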