dataframe committed to body does not retain column names
Closed this issue · 1 comment
chriswhong commented
With this code, I observed that the output dataset has field1, field2, etc. in its structure, even though the source dataset and dataframe had defined column names.
load("http.star", "http")
load("encoding/csv.star", "csv")
load("dataframe.star", "dataframe")
ds = dataset.latest()
# get unique female names that start with V
# download CSV from the web, parse it and assign to a dataframe
csvDownloadUrl = "https://data.cityofnewyork.us/api/views/25th-nujf/rows.csv?accessType=DOWNLOAD"
res = http.get(csvDownloadUrl).body()
foo = dataframe.parse_csv(res)
# filter for first names that start with 'V'
foo = foo[[x.startswith('V') for x in foo["Child's First Name"]]]
# filter for female
foo = foo[[x == 'FEMALE' for x in foo["Gender"]]]
# get unique
# namesOnly = foo["Child's First Name"]
# print(namesOnly.unique())
# foo = foo[foo[0]]
ds.body = foo
# print(ds.body)
dataset.commit(ds)
dustmop commented
Should be fixed by qri-io/qri#1923. However, the new behavior has some known bugs; for example, column descriptions are not retained.
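To check whether a committed body kept its headers, one option is to compare the column titles in the resulting structure against the generated field1, field2, ... pattern reported above. The following is a minimal sketch in plain Python, not qri code: the helper name `looks_like_placeholder_names` is hypothetical, and it assumes the column titles have already been collected into a list of strings by whatever means is available.

```python
import re

# Hypothetical helper, not part of qri. Returns True when every title
# follows the generated pattern field1, field2, ... in positional order,
# which suggests the original column names were dropped.
def looks_like_placeholder_names(titles):
    if not titles:
        return False
    return all(
        re.fullmatch(r"field[1-9]\d*", title) is not None
        and int(title[len("field"):]) == position
        for position, title in enumerate(titles, start=1)
    )

# Titles as described in the report: generated names versus real headers.
print(looks_like_placeholder_names(["field1", "field2", "field3"]))
print(looks_like_placeholder_names(["Gender", "Child's First Name"]))
```

The check is deliberately strict about ordering, so a dataset that legitimately has a single column called field2 is not flagged.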