gbeced/basana

UTF-16 LE encoded CSV file under Windows 10 raises KeyError: 'volume'

femtotrader opened this issue · 4 comments

Hello,

Under Windows 10, running

python -m basana.external.bitstamp.tools.download_bars -c BTC/USD -p 1d -s 2014-01-01 -e 2021-01-31 > bitstamp_btcusd_day.csv

produces a UTF-16 LE encoded CSV file.

Running the example then raises

> python .\app.py
Traceback (most recent call last):
  File "C:\Users\femto\github\scls19fr\crypto-trading-bots\basana\app.py", line 111, in <module>
    asyncio.run(main())
  File "C:\Users\femto\anaconda3\Lib\asyncio\runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "C:\Users\femto\anaconda3\Lib\asyncio\runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\femto\anaconda3\Lib\asyncio\base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "C:\Users\femto\github\scls19fr\crypto-trading-bots\basana\app.py", line 101, in main
    await event_dispatcher.run()
  File "C:\Users\femto\anaconda3\Lib\site-packages\basana\core\dispatcher.py", line 288, in run
    await super().run(stop_signals=stop_signals)
  File "C:\Users\femto\anaconda3\Lib\site-packages\basana\core\dispatcher.py", line 231, in run
    async with helpers.TaskGroup() as tg:
  File "C:\Users\femto\anaconda3\Lib\site-packages\basana\core\helpers.py", line 43, in __aexit__
    await asyncio.gather(*self._tasks)
  File "C:\Users\femto\anaconda3\Lib\site-packages\basana\core\dispatcher.py", line 298, in _dispatch_loop
    next_dt = self._event_mux.peek_next_event_dt()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\femto\anaconda3\Lib\site-packages\basana\core\dispatcher.py", line 76, in peek_next_event_dt
    self._prefetch()
  File "C:\Users\femto\anaconda3\Lib\site-packages\basana\core\dispatcher.py", line 114, in _prefetch
    if event := source.pop():
                ^^^^^^^^^^^^
  File "C:\Users\femto\anaconda3\Lib\site-packages\basana\core\event_sources\csv.py", line 87, in pop
    ret = next(self._row_it)
          ^^^^^^^^^^^^^^^^^^
  File "C:\Users\femto\anaconda3\Lib\site-packages\basana\core\event_sources\csv.py", line 61, in load_and_yield
    for ev in row_parser.parse_row(row):
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\femto\anaconda3\Lib\site-packages\basana\external\common\csv\bars.py", line 38, in parse_row
    volume = Decimal(row_dict["volume"])
                     ~~~~~~~~^^^^^^^^^^
KeyError: 'volume'

because row_dict looks like this:

{'ÿþd\x00a\x00t\x00e\x00t\x00i\x00m\x00e\x00': '\x00', '\x00o\x00p\x00e\x00n\x00': None, '\x00h\x00i\x00g\x00h\x00': None, '\x00l\x00o\x00w\x00': None, '\x00c\x00l\x00o\x00s\x00e\x00': None, '\x00v\x00o\x00l\x00u\x00m\x00e\x00': None}
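The keys above are what UTF-16 LE bytes look like when they are decoded with a single-byte codec: the BOM shows up as 'ÿþ' and every ASCII character is paired with a NUL. Here is a minimal sketch that reproduces the effect. The sample row is made up, and latin-1 is only an assumption standing in for whatever 8-bit codec was actually used on the affected machine.

```python
import codecs
import csv
import io

# Made-up sample data with the same column names as in the traceback.
text = (
    "datetime,open,high,low,close,volume\n"
    "2014-01-01 00:00:00,1,2,0.5,1.5,10\n"
)

# Bytes as a UTF-16 LE file with a BOM would contain them.
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Wrong: decode with a single-byte codec (latin-1 is an assumption here).
# The BOM becomes two visible characters and each ASCII byte is followed
# by a NUL, so no header field equals "volume".
wrong_header = raw.decode("latin-1").split("\n")[0].split(",")
print(repr(wrong_header[0]))
print("volume" in wrong_header)

# Right: the "utf-16" codec reads the BOM, picks the byte order and drops it.
decoded = raw.decode("utf-16")
rows = list(csv.DictReader(io.StringIO(decoded)))
print(rows[0]["volume"])
```

With the correct codec the header fields come back intact, so a lookup such as row_dict["volume"] succeeds.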

Saving to a file should be handled by the Python script itself rather than delegated to the shell (Bash, PowerShell).

My proposal is to add an output flag (-o, --output):

python -m basana.external.bitstamp.tools.download_bars -c BTC/USD -p 1d -s 2014-01-01 -e 2021-01-31 -o bitstamp_btcusd_day.csv
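One way such a flag could work is sketched below. This is not basana's actual implementation: build_parser, open_output and write_bars are hypothetical names, and UTF-8 is an assumed choice of output encoding. The point is that the script opens the file with an explicit encoding, so the result no longer depends on how the shell handles redirection.

```python
import argparse
import contextlib
import csv
import sys

COLUMNS = ["datetime", "open", "high", "low", "close", "volume"]


def build_parser():
    # Hypothetical parser; only the proposed -o/--output flag is shown.
    parser = argparse.ArgumentParser(description="Download bars (sketch)")
    parser.add_argument(
        "-o", "--output", default=None,
        help="Write the CSV to this file instead of stdout.",
    )
    return parser


@contextlib.contextmanager
def open_output(path):
    # Fall back to stdout when no path is given, keeping '>' usable.
    if path is None:
        yield sys.stdout
    else:
        # Explicit encoding: the file is UTF-8 regardless of the shell.
        with open(path, "w", encoding="utf-8", newline="") as f:
            yield f


def write_bars(rows, path=None):
    with open_output(path) as out:
        writer = csv.writer(out)
        writer.writerow(COLUMNS)
        writer.writerows(rows)


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Made-up row, standing in for downloaded data.
    write_bars([["2014-01-01 00:00:00", "1", "2", "0.5", "1.5", "10"]], args.output)
```

Keeping stdout as the default means existing invocations that rely on redirection would continue to work.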

Any opinion?

@femtotrader could you please share the output of the locale command and also one of those CSV files? Besides adding -o/--output, I'd like to support those CSV files, as long as they have a BOM.
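Supporting files that carry a BOM could look roughly like the sketch below. It is an illustration only: sniff_encoding and open_csv are hypothetical helpers, not part of basana, and falling back to UTF-8 when no BOM is found is an assumption.

```python
import codecs

# UTF-32 is checked before UTF-16 because the UTF-32 LE BOM starts with
# the same two bytes as the UTF-16 LE BOM.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def sniff_encoding(head, default="utf-8"):
    """Return a codec name based on the BOM at the start of head."""
    for bom, encoding in _BOMS:
        if head.startswith(bom):
            # These codecs consume the BOM while decoding.
            return encoding
    return default


def open_csv(path):
    """Open path in text mode with an encoding chosen from its BOM."""
    with open(path, "rb") as f:
        head = f.read(4)  # the longest BOM is 4 bytes
    return open(path, "r", encoding=sniff_encoding(head), newline="")
```

The returned file object can be passed straight to csv.DictReader. Files without a BOM are read as the default encoding, so UTF-16 content without a BOM would still be misread.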

Here is the output of the locale command:

LANG=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_ALL=

File is available at https://we.tl/t-2hEwoIbeDz

I was not expecting that. I was expecting to see a wide-character encoding, but in any case I managed to reproduce the issue. The next version will have this fixed. Thanks a lot for the bug report.