Frequent log messages to run `list_symbols` on some of my libraries
Closed this issue · 10 comments
Describe the bug
How can I get rid of this log messaging?
See here for more information: https://docs.arcticdb.io/technical/on_disk_storage/#symbol-list-caching
To resolve, run `list_symbols` through to completion frequently.
Note: This warning will only appear once.
20240920 15:34:01.207517 3812 E arcticdb.root | E_ASSERTION_FAILURE Cannot write string symbol name, existing symbols are numeric
20240920 15:34:01.515011 3812 W arcticdb.symbol | Ignoring error while trying to compact the symbol list: E_ASSERTION_FAILURE Cannot write string symbol name, existing symbols are numeric
I already did what is suggested in the message running:
symbol_list = library.list_symbols()
on every read and write operation.
Also, this log message did not appear just once for me, but on every read or write operation.
# My current code executes this on every read and write op
# Is this why the log appears multiple times?
ac = self.__init_db()
...
ac.get_library(lib, create_if_missing=False)
Is there some way to suppress this log message?
Steps/Code to Reproduce
See here for more information: https://docs.arcticdb.io/technical/on_disk_storage/#symbol-list-caching
To resolve, run `list_symbols` through to completion frequently.
Note: This warning will only appear once.
20240920 15:34:01.207517 3812 E arcticdb.root | E_ASSERTION_FAILURE Cannot write string symbol name, existing symbols are numeric
20240920 15:34:01.515011 3812 W arcticdb.symbol | Ignoring error while trying to compact the symbol list: E_ASSERTION_FAILURE Cannot write string symbol name, existing symbols are numeric
Expected Results
Unnecessary log messaging
OS, Python Version and ArcticDB Version
Python: 3.11.0 | packaged by conda-forge | (main, Oct 25 2022, 06:12:32) [MSC v.1929 64 bit (AMD64)]
OS: Windows-10-10.0.22621-SP0
ArcticDB: 4.5.0
Backend storage used
AWS S3
Additional Context
No response
Thanks for reporting this. The issue is, I think, explained by the `E_ASSERTION_FAILURE Cannot write string symbol name, existing symbols are numeric` error.
We have an under-documented feature here, which is that you can use integers as symbol names, e.g.,
lib.write(9, df)
lib.read(9).data
However, you can't currently mix these with string symbol names as you'll get this issue.
If that is the problem here, then the work-around is to
- create your symbols with all string names, `lib.write(str(symbol), df)`,
- delete any symbols with integer names.
- I would also call `lib.reload_symbol_list()` to make sure that you have a consistent symbol list.
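A minimal sketch of the first step of that work-around, assuming `lib` is an ArcticDB `Library`. The helper names `as_symbol_name` and `write_with_string_name` are hypothetical and not part of ArcticDB; they only make sure that every symbol name is a string before `lib.write` is called.

```python
def as_symbol_name(symbol):
    """Return `symbol` as a string name, converting plain ints only."""
    if isinstance(symbol, str):
        return symbol  # already a string: nothing to do
    # bool is a subclass of int in Python; reject it rather than write "True"
    if isinstance(symbol, int) and not isinstance(symbol, bool):
        return str(symbol)
    raise TypeError(f"unsupported symbol name type: {type(symbol).__name__}")


def write_with_string_name(lib, symbol, data):
    """Write `data` under a string symbol name.

    `lib` is assumed to be an ArcticDB Library; only its `write` method is used.
    """
    return lib.write(as_symbol_name(symbol), data)
```

With this, `write_with_string_name(lib, 9, df)` stores the data under the symbol `"9"` rather than the integer `9`, so the library never ends up with a mix of string and integer names.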
In terms of fixing this issue, I think there are two options.
Either
- deprecate integer symbol name support and,
- change symbol-list compaction so this error is thrown (not just a log message),
or,
- fix symbol-list compaction to work for both ints and strings.
Thank you @jamesmunro, I will try out your suggested work-around
I've tested the issue a bit wider, and with the `lmdb` backend, `list_symbols` throws, so it doesn't support any number of int-named symbols.
import numpy as np
import arcticdb as adb
lib = adb.Arctic('lmdb://test1').create_library('test')
lib.write(9, np.array([1,2,3]))
lib.list_symbols()
InternalException Traceback (most recent call last)
[<ipython-input-19-c96ae63b6644>](https://localhost:8080/#) in <cell line: 5>()
3 lib = adb.Arctic('lmdb://test1').create_library('test')
4 lib.write(9, np.array([1,2,3]))
----> 5 lib.list_symbols()
1 frames
[/usr/local/lib/python3.10/dist-packages/arcticdb/version_store/library.py](https://localhost:8080/#) in list_symbols(self, snapshot_name, regex)
1470 Symbols in the library.
1471 """
-> 1472 return self._nvs.list_symbols(snapshot=snapshot_name, regex=regex)
1473
1474 def has_symbol(self, symbol: str, as_of: Optional[AsOf] = None) -> bool:
[/usr/local/lib/python3.10/dist-packages/arcticdb/version_store/_store.py](https://localhost:8080/#) in list_symbols(self, all_symbols, snapshot, regex, prefix, use_symbol_list)
2137 log.warning("Cannot use symbol list with all_symbols=True as it only stores undeleted symbols")
2138 use_symbol_list = False
-> 2139 return list(self.version_store.list_streams(snapshot, regex, prefix, use_symbol_list, all_symbols))
2140
2141 def compact_symbol_list(self) -> int:
InternalException: std::bad_variant_access(std::get: wrong index for variant)
lib
Library(Arctic(config=S3(endpoint=s3_name_endpoint_name, bucket=my_bucket_name)), path=equities, storage=s3_storage)
Calling this:
lib.reload_symbol_list()
gave me:
arcticdb_ext.exceptions.InternalException: E_ASSERTION_FAILURE Read invalid serialized key
In my AWS S3 Console I went through all my symbols and could not find any int or float symbol names. However, I have many names like "sp500_index", "sp500_index_monthly", "euribor_1_month", "fed_funds_6_month_cont", "u.s._midwest_domestic_hot-rolled_coil_steel_commodity_future_cont_month_1".
They are all strings, though I did not explicitly call `lib.write(str(symbol), df)`.
I assumed that since the names are strings before the write operation, they would also be stored as strings in ArcticDB after the write operation.
Could the `u.s.` in "u.s._midwest_domestic_hot-rolled_coil_steel_commodity_future_cont_month_1" have caused the issue?
Hi @philsv. That error suggests a different issue.
arcticdb_ext.exceptions.InternalException: E_ASSERTION_FAILURE Read invalid serialized key
It is not recognizing the object in S3 as an ArcticDB object; the `key` is the first part of the object name. Have you separately written objects to the S3 bucket? That would explain an unrecognised object.
As a work-around, I think at this point it's easiest to remake the library: `create_library`, then copy over the symbols. For the latest versions that would be: `lib.write(symbol, lib.read(symbol).data)`.
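The copy step could be sketched as below. This is an illustration under assumptions, not a tested migration tool: `copy_latest_versions` is a hypothetical helper, `src_lib` and `dst_lib` are assumed to be ArcticDB `Library` objects, and only the latest version's data is copied (older versions are not). If `list_symbols` itself fails on the damaged library, the symbol names would have to come from elsewhere.

```python
def copy_latest_versions(src_lib, dst_lib):
    """Copy the latest version of every symbol from src_lib to dst_lib.

    Returns (copied, failed): a list of copied symbol names and a dict
    mapping each symbol that could not be copied to the exception raised.
    """
    copied, failed = [], {}
    for symbol in src_lib.list_symbols():
        try:
            # Same call as suggested above, but reading from the old
            # library and writing to the new one.
            dst_lib.write(symbol, src_lib.read(symbol).data)
            copied.append(symbol)
        except Exception as exc:  # e.g. a symbol with missing objects
            failed[symbol] = exc
    return copied, failed
```

Collecting failures instead of stopping at the first one means a single unreadable symbol does not block the rest of the copy, and the `failed` dict shows which symbols need a closer look.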
If you would like us to help you more with this issue then can you please send us a list of objects in the library?
e.g.
Find the full storage name of the library:
aws s3 ls 's3://<BUCKET>/<LIBRARY>'
then take the full library path printed and list all the items under `/vref/`, e.g.
aws s3 ls 's3://<BUCKET>/<LIBRARY>1727270002981265920/vref/'
You can send the results to arcticdb@man.com.
RE: `lib.write(str(symbol), df)`: if `isinstance(symbol, str)` then there is no need. It is only needed if `isinstance(symbol, int)`.
@jamesmunro I think what caused the error `arcticdb_ext.exceptions.InternalException: E_ASSERTION_FAILURE Read invalid serialized key` on my side was using `lib.reload_symbol_list()`.
Just a moment ago I was recreating the library and this error popped up:
Traceback (most recent call last):
File "c:\Users\user\anaconda3\envs\py11\Lib\site-packages\arcticdb\version_store\library.py", line 1070, in read
return self._nvs.read(
^^^^^^^^^^^^^^^
File "c:\Users\user\anaconda3\envs\py11\Lib\site-packages\arcticdb\version_store\_store.py", line 1725, in read
read_result = self._read_dataframe(symbol, version_query, read_query, read_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\user\anaconda3\envs\py11\Lib\site-packages\arcticdb\version_store\_store.py", line 1799, in _read_dataframe
return ReadResult(*self.version_store.read_dataframe_version(symbol, version_query, read_query, read_options))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
arcticdb_ext.storage.NoDataFoundException: When trying to read version 0 of symbol `usd_cash_crude_palm_oil_electronic_commodity_future_cont_1_cme`, failed to read key i:usd_cash_crude_palm_oil_electronic_commodity_future_cont_1_cme:0:0xde40806327158334@1702432447748890800[1274659200000000000,1366588800000000001]: Not found: Composite: i:usd_cash_crude_palm_oil_electronic_commodity_future_cont_1_cme:0:0xde40806327158334@1702432447748890800[1274659200000000000,1366588800000000001],
In terminal:
20240925 16:43:33.846556 13032 W arcticdb.storage | Failed to find segment for key 'i:usd_cash_crude_palm_oil_electronic_commodity_future_cont_1_cme:0:0xde40806327158334@1702432447748890800[1274659200000000000,1366588800000000001]' : No response body.
I think that might have been the issue.
Fortunately this is just an old dataset we currently are not using. But I can't really tell what was the exact issue with the dataset.
For all the other symbols in the library, no issues.
Have you separately written objects to the S3 bucket?
If you mean loading the data in batches, no, I did not. But I update the datasets very frequently (weekly, daily).
`Not found: Composite: i:` is saying that you're missing an index object. That shouldn't be possible normally, as it's written by ArcticDB before the object that refers to it. This would suggest that it's been removed (and not by ArcticDB), or there is a bug here.
These errors you're getting all point to either missing or malformed objects in the S3 bucket. I think we would need to understand your environment and setup better to get to the bottom of this.
- You're using AWS S3?
- Are you using any kind of proxy on top of S3?
- Are you reading and writing to the same bucket? - i.e. are you using replicated buckets?
- Are you able to provide a script that replicates the errors?
You can also get a detailed log with `ARCTICDB_AWS_LogLevel_int=6`, see: https://docs.arcticdb.io/latest/runtime_config/#logging-configuration. I wouldn't post the result of that here as it may contain information you don't want to share.
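One way to set that variable from Python is shown below, as a sketch. It assumes the setting is read from the process environment and that it has to be in place before `arcticdb` is imported; the linked runtime configuration page is the place to check both points.

```python
import os

# Assumption: the variable must be set before arcticdb is imported and used.
os.environ["ARCTICDB_AWS_LogLevel_int"] = "6"

# import arcticdb as adb  # import only after the variable has been set
```

Setting the variable in the shell before starting Python has the same effect and avoids any dependency on import order.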
Closing as no reply