man-group/arctic

DateRange does not work for data with hour/minute/second timestamps

ichobits opened this issue · 3 comments

Arctic Version

# Version: 1.79.4

Arctic Store

# VersionStore

Platform and version

Windows 10 / Ubuntu 20.04 LTS
conda env
Python > 3.6

Description of problem and/or code sample that reproduces the issue

  1. DateRange works with daily data.
    My data:

time_key,open,close,high,low,volume,turnover,pe_ratio,turnover_rate,last_close
2019-10-15 00:00:00,329.4,328.8,331.6,327.6,14202519,4679498788.0,34.595,0.00149,328.2
2019-10-16 00:00:00,330.0,331.0,332.0,328.0,13953191,4604688877.0,34.827,0.00146,328.8
2019-10-17 00:00:00,332.8,331.0,332.8,328.4,10339120,3422301916.0,34.827,0.00108,331.0
2019-10-18 00:00:00,332.0,331.0,334.2,330.2,9904468,3288989931.0,34.827,0.00104,331.0
2019-10-21 00:00:00,329.6,324.8,330.2,324.8,13947162,4557080047.0,33.571,0.00146,331.0
2019-10-22 00:00:00,325.0,327.6,327.8,324.8,10448427,3410907451.0,33.86,0.00109,324.8
2019-10-23 00:00:00,324.8,320.0,325.8,319.6,19855257,6383619077.0,33.074,0.00208,327.6
2019-10-24 00:00:00,319.0,319.0,320.6,316.6,18472498,5883604804.0,32.971,0.00193,320.0

code:

import pandas as pd
from arctic import Arctic
df = pd.read_csv("tencent.csv", parse_dates=['time_key'], index_col="time_key")
a = Arctic('localhost')
a.initialize_library('vstore')
lib = a['vstore']
lib.write('test', df)
lib.read('test').data

from arctic.date import DateRange
lib.read('test', date_range=DateRange('2020-01-01', '2020-02-01')).data

  2. DateRange does not work with minute-level data.
    My data:
    date,open,high,low,close,amount,volume
    2020-07-15 21:01:00,3744.0,3750.0,3739.0,3748.0,1.908387740908503e-39,14264
    2020-07-15 21:02:00,3747.0,3749.0,3744.0,3746.0,1.9095213913661417e-39,5491
    2020-07-15 21:03:00,3745.0,3746.0,3741.0,3745.0,1.911097852138507e-39,8051
    2020-07-15 21:04:00,3745.0,3745.0,3740.0,3743.0,1.9103145262969496e-39,5677
    2020-07-15 21:05:00,3743.0,3746.0,3743.0,3745.0,1.9106340223468156e-39,3377
    2020-07-15 21:06:00,3745.0,3747.0,3744.0,3744.0,1.9117424494320966e-39,3388
    2020-07-15 21:07:00,3745.0,3747.0,3744.0,3747.0,1.9117928961768123e-39,2246
    2020-07-15 21:08:00,3747.0,3748.0,3746.0,3748.0,1.9121404181959648e-39,2127
    2020-07-15 21:09:00,3747.0,3755.0,3747.0,3752.0,1.9137407010422238e-39,9474

df3 = pd.read_csv("df.csv",parse_dates=['date'],index_col="date")
lib.write('test3', df3)
lib.read('test3').data
lib.read('test3', date_range=DateRange('2020-08-01', '2020-12-01')).data

error:
lib.read('test', date_range=DateRange('2020-08-01', '2020-12-01')).data
Traceback (most recent call last):
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\series.py", line 1001, in __setitem__
self._set_with_engine(key, value)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\series.py", line 1034, in _set_with_engine
loc = self.index._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 413, in pandas._libs.index.DatetimeEngine.get_loc
File "pandas\_libs\index.pyx", line 420, in pandas._libs.index.DatetimeEngine.get_loc
TypeError: 'slice(numpy.datetime64('2020-08-01T00:00:00.000000'), numpy.datetime64('2020-12-01T00:00:00.000000'), None)' is an invalid key
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "pandas\_libs\index.pyx", line 444, in pandas._libs.index.DatetimeEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1032, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1039, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 1596240000000000000
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\indexes\base.py", line 2898, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 413, in pandas._libs.index.DatetimeEngine.get_loc
File "pandas\_libs\index.pyx", line 446, in pandas._libs.index.DatetimeEngine.get_loc
KeyError: Timestamp('2020-08-01 00:00:00')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\indexes\datetimes.py", line 625, in get_loc
return Index.get_loc(self, key, method, tolerance)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\indexes\base.py", line 2900, in get_loc
raise KeyError(key) from err
KeyError: Timestamp('2020-08-01 00:00:00')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\IPython\core\interactiveshell.py", line 3343, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
lib.read('test', date_range=DateRange('2020-08-01', '2020-12-01')).data
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\arctic\store\version_store.py", line 369, in read
date_range=date_range, read_preference=read_preference, **kwargs)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\arctic\store\version_store.py", line 453, in _do_read
data = handler.read(self._arctic_lib, version, symbol, from_version=from_version, **kwargs)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\arctic\store\_pandas_ndarray_store.py", line 202, in read
item = super(PandasDataFrameStore, self).read(arctic_lib, version, symbol, **kwargs)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\arctic\store\_pandas_ndarray_store.py", line 109, in read
item = self._daterange(item, date_range)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\arctic\store\_pandas_ndarray_store.py", line 101, in _daterange
mask[start:end] = 1.0
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\series.py", line 1027, in __setitem__
self._set_with(key, value)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\series.py", line 1041, in _set_with
indexer = self.index._convert_slice_indexer(key, kind="getitem")
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\indexes\base.py", line 3193, in _convert_slice_indexer
indexer = self.slice_indexer(start, stop, step, kind=kind)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\indexes\datetimes.py", line 718, in slice_indexer
return Index.slice_indexer(self, start, end, step, kind=kind)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\indexes\base.py", line 4969, in slice_indexer
start_slice, end_slice = self.slice_locs(start, end, step=step, kind=kind)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\indexes\base.py", line 5172, in slice_locs
start_slice = self.get_slice_bound(start, "left", kind)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\indexes\base.py", line 5092, in get_slice_bound
raise err
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\indexes\base.py", line 5086, in get_slice_bound
slc = self.get_loc(label)
File "C:\Users\asus\anaconda3\envs\quan\lib\site-packages\pandas\core\indexes\datetimes.py", line 627, in get_loc
raise KeyError(orig_key) from err
KeyError: numpy.datetime64('2020-08-01T00:00:00.000000')
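
The last arctic frame in the traceback fails at `mask[start:end] = 1.0`, which is a label slice on a pandas Series. The sketch below is a pandas-only stand-in for that step, not arctic's actual code: `range_mask` is a hypothetical helper, and the example assumes the stored index is not sorted, which this report does not confirm. Under that assumption, a slice whose bounds are absent from the index raises a KeyError like the one above, while the same slice on a sorted index simply selects nothing.

```python
import pandas as pd


def range_mask(index, start, end):
    """Flag rows whose timestamp falls in [start, end] using a label slice.

    Hypothetical helper that imitates the `mask[start:end] = 1.0` step
    seen in the traceback; it is not taken from arctic.
    """
    mask = pd.Series(0.0, index=index)
    mask[pd.Timestamp(start):pd.Timestamp(end)] = 1.0
    return mask.values.astype(bool)


# Assumption: minute bars that were stored out of chronological order.
unsorted_idx = pd.to_datetime(
    ["2020-07-15 21:03:00", "2020-07-15 21:01:00", "2020-07-15 21:02:00"]
)

# On an unsorted index, bounds that are not present cannot be located.
try:
    range_mask(unsorted_idx, "2020-08-01", "2020-12-01")
    unsorted_failed = False
except KeyError:
    unsorted_failed = True

# On a sorted index the same query succeeds and matches no rows.
sorted_idx = unsorted_idx.sort_values()
empty_mask = range_mask(sorted_idx, "2020-08-01", "2020-12-01")

# A range that overlaps the data selects the expected rows.
hit_mask = range_mask(sorted_idx, "2020-07-15 21:01:00", "2020-07-15 21:02:00")
```

If that assumption holds for `df3`, checking `df3.index.is_monotonic_increasing` and calling `df3.sort_index()` before `lib.write` may avoid the error.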

Will take a look

@ichobits I am not sure I understand the issue properly. Do you mean that the date_range query works for the first example but not for the second?

I'm getting a similar error. Has this been solved yet?