TaskCSVPandas loses the index in the output csv file
wj-c opened this issue · 3 comments
wj-c commented
An example:
import d6tflow
import pandas as pd
import numpy as np

class test(d6tflow.tasks.TaskCSVPandas):
    def run(self):
        data = np.random.randn(2, 2)
        data = pd.DataFrame(data, index=['a', 'b'], columns=['c', 'd'])
        self.save(data)

d6tflow.run(test())
The actual data in the output csv file is:
c,d
1.5490553923182304,-0.3279984496021263
0.7946535471877705,0.5790784973358706
However, the expected data in the output file should be:
,c,d
a,0.20200720089562157,0.5134288778567592
b,2.918867471040273,-0.5393324706416279
The index is lost when using TaskCSVPandas to save data.
wj-c commented
Is the repository actively maintained?
d6tdev commented
Yes, that's a feature. You can use self.save(data, index=True) if you want to keep the index. Or better still, use TaskPqPandas instead of TaskCSVPandas.
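For reference, here is a minimal pandas-only sketch of what the index flag changes. It assumes TaskCSVPandas passes extra keyword arguments such as index=True through to pandas `DataFrame.to_csv`, which the thread suggests but does not show. The fixed values stand in for the random data from the report.

```python
import io

import pandas as pd

# Same shape as the frame in the report, with fixed values instead of random ones.
df = pd.DataFrame([[1.0, 2.0], [3.0, 4.0]], index=['a', 'b'], columns=['c', 'd'])

# index=False drops the row labels; the header line is "c,d".
without_index = df.to_csv(index=False)

# index=True keeps the row labels; the header line is ",c,d".
with_index = df.to_csv(index=True)

# Reading the file back needs index_col=0 to restore the labels as the index.
restored = pd.read_csv(io.StringIO(with_index), index_col=0)

print(without_index)
print(with_index)
print(list(restored.index))
```

If the index carries meaning, writing it out is only half of the round trip: whatever loads the csv also has to be told which column holds it, as with index_col=0 above.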
d6tdev commented
And yes, it is actively maintained.