"Datatype coercion is not allowed" when creating session with custom timeseries array
warrickball opened this issue · 3 comments
Here's a script that creates a basic timeseries of Gaussian noise in a 2×1000 array.
#!/usr/bin/env python3
import numpy as np
import pbjam
n = 1000
data = np.zeros((2,n))
data[0] = np.arange(n, dtype=float)
data[1] = np.random.randn(n)
s = pbjam.session(ID='mwe', numax=(100, 1), teff=(5000, 100), bp_rp=(0.7, 0.005), dnu=(5, 0.1), timeseries=data)
It fails with the following traceback:
Traceback (most recent call last):
File "/home/wball/try/pbjam/mwe.py", line 11, in <module>
s = pbjam.session(ID='mwe', numax=(100, 1), teff=(5000, 100), bp_rp=(0.7, 0.005), timeseries=data)
File "/home/wball/pypi/PBjam/pbjam/session.py", line 572, in __init__
_format_col(vardf, timeseries, 'timeseries')
File "/home/wball/pypi/PBjam/pbjam/session.py", line 289, in _format_col
vardf[key] = [_arr_to_lk(x, y, vardf['ID'][0], key)]
File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 3163, in __setitem__
self._set_item(key, value)
File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 3239, in _set_item
value = self._sanitize_column(key, value)
File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 3899, in _sanitize_column
value = maybe_convert_platform(value)
File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 112, in maybe_convert_platform
values = construct_1d_object_array_from_listlike(values)
File "/home/wball/.local/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1638, in construct_1d_object_array_from_listlike
result[:] = values
File "/home/wball/.local/lib/python3.9/site-packages/astropy/table/table.py", line 853, in __array__
raise ValueError('Datatype coercion is not allowed')
ValueError: Datatype coercion is not allowed
I had a brief look around. The problem isn't the conversion of the timeseries into a Lightkurve object but rather when adding this to the vardf dataframe.
I just did a git pull, so I'm using the top of master (commit 0c5591a). If any other versions are relevant, they are:
- Python 3.9.4
- NumPy 1.20.1
- Pandas 1.2.0
- Astropy 4.2
- Lightkurve 2.0.9
@nielsenmb couldn't reproduce this in Python 3.7, and neither can I with Python 3.7.4. I do, however, hit it with Python 3.8.2 and:
- NumPy 1.20.3
- Pandas 1.2.4
- Astropy 4.2.1
- Lightkurve 2.0.9
Creating the LightCurve object seems to be fine, so I had a closer look at why assigning a timeseries downloaded via Lightkurve works but passing a custom timeseries doesn't. Mimicking the code in PBjam, I tried this:
import numpy as np
import pandas as pd
import lightkurve as lk
import pbjam
n = 1000
data = np.zeros((2,n))
data[0] = np.arange(n, dtype=float)/720
data[1] = np.random.randn(n)
df = pd.DataFrame({'ID': np.array(['test']).reshape((-1,1)).flatten()})
df['timeseries'] = [lk.LightCurve(time=data[0], flux=data[1])]
which is similar to the code path followed for custom timeseries and reproduces the "Datatype coercion is not allowed" error message.
If I change the last line to this, which tries to be more like the path for downloaded objects (i.e. when you pass a string identifier), it appears to work:
df.at[0, 'timeseries'] = lk.LightCurve(time=data[0], flux=data[1], targetid='test')
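For what it's worth, here is a NumPy-only sketch of what I suspect is the difference between the two assignments. It assumes the list assignment ends up in the `result[:] = values` step shown in the last pandas frame of the traceback, and that `.at` writes to a single position instead. `RefusesCoercion` is a hypothetical stand-in for the LightCurve object, not anything from Lightkurve or Astropy; all it does is raise from `__array__`, the way the astropy frame in the traceback does.

```python
import numpy as np


class RefusesCoercion:
    """Hypothetical stand-in: any attempt to convert it to an array fails."""

    def __array__(self, *args, **kwargs):
        raise ValueError('Datatype coercion is not allowed')


obj = RefusesCoercion()

# Slice assignment from a list, as in the last pandas frame of the traceback.
# NumPy inspects each element of the list and may call its __array__ method.
result = np.empty(1, dtype=object)
try:
    result[:] = [obj]
    slice_error = None
except ValueError as err:
    slice_error = str(err)

# Assignment to a single position of an object array stores the object itself.
cell = np.empty(1, dtype=object)
cell[0] = obj

print('slice assignment error:', slice_error)
print('element assignment kept the object:', cell[0] is obj)
```

If that assumption holds, it would explain why wrapping the object in a list fails while setting a single cell works, but I haven't traced the pandas internals for the `.at` path.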
vardf.at[0, key] = _arr_to_lk(x, y, vardf['ID'][0], key)
returns an error for me on Python 3.7, but
vardf[key] = object()
vardf.at[0, key] = _arr_to_lk(x, y, vardf['ID'][0], key)
seems to work.