altair-viz/altair-transform

error with transform_fold

williehallock802 opened this issue · 7 comments

import pandas as pd
import numpy as np
import altair as alt

data = { 'ColA': {('A', 'A-1'): 'w',
                 ('A', 'A-2'): 'w',
                 ('A', 'A-3'): 'w',
                 ('B', 'B-1'): 'q',
                 ('B', 'B-2'): 'q',
                 ('B', 'B-3'): 'r',
                 ('C', 'C-1'): 'w',
                 ('C', 'C-2'): 'q',
                 ('C', 'C-3'): 'q',
                 ('C', 'C-4'): 'r'},
        'ColB': {('A', 'A-1'): 'r',
                 ('A', 'A-2'): 'w',
                 ('A', 'A-3'): 'w',
                 ('B', 'B-1'): 'q',
                 ('B', 'B-2'): 'q',
                 ('B', 'B-3'): 'e',
                 ('C', 'C-1'): 'e',
                 ('C', 'C-2'): 'q',
                 ('C', 'C-3'): 'r',
                 ('C', 'C-4'): 'w'} 
        }
                 
df = pd.DataFrame(data).reset_index(drop=True)

mychart = alt.Chart(df).transform_fold(
    ['ColA', 'ColB'], as_=['column', 'value']
).mark_bar().encode(
    x=alt.X('value:N', sort=['r', 'q', 'e', 'w']),
    y=alt.Y('count():Q', scale=alt.Scale(domain=[0, len(df.index)])),
    column='column:N'
)

from altair_transform import extract_data
data = extract_data(mychart)
data.head()

Running this generates the following error:

altair-transform/altair_transform/core/fold.py in visit_fold(transform, df)
      9     transform = transform.to_dict()
     10     fold = transform["fold"]
---> 11     var_name, value_name = transform._get("as", ("key", "value"))
     12     value_vars = [c for c in df.columns if c in fold]
     13     id_vars = [c for c in df.columns if c not in fold]


AttributeError: 'dict' object has no attribute '_get'
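
The failure can be isolated without Altair at all. The snippet below is only a sketch: the dict is a hand-written stand-in for what transform.to_dict() presumably returns for this chart, not output captured from the library. It shows that a plain dict has no _get method, while the built-in get accepts the same arguments.

```python
# Hand-written stand-in (an assumption) for the dict produced by transform.to_dict().
transform = {"fold": ["ColA", "ColB"], "as": ["column", "value"]}

try:
    # Plain dicts do not have a _get method, so this line raises.
    transform._get("as", ("key", "value"))
    message = None
except AttributeError as err:
    message = str(err)

print(message)

# The built-in dict.get takes the same (key, default) arguments and works.
names = transform.get("as", ("key", "value"))
print(names)
```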

Thanks, I'll try to take a look.

Same issue here.
Removing the underscore before get (that is, calling transform.get instead of transform._get) solves the issue, and the example then outputs:

  column value
0   ColA     w
1   ColA     w
2   ColA     w
3   ColA     q
4   ColA     q

But maybe what's missing is an isinstance check like the one in data.py:35:

        if isinstance(context, dict):
            datasets = context.get('datasets', {})
        else:
            datasets = context._get('datasets', {})

That would work. This is an instance of general confusion throughout the codebase about whether inputs are dicts or schema objects. I went through a while ago and tried to address most of it, but this is one of the instances I missed (there may be others).

I think rather than an isinstance check each time we need to get an attribute, it would be better to normalize inputs so that we know what they are, and know what methods can be used on them.
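
As a rough illustration of what normalizing inputs could look like, here is a minimal sketch. The helper name as_plain_dict and the FakeFoldTransform class are hypothetical and do not come from altair-transform; the only thing assumed about schema objects is that they expose a to_dict() method, as the traceback above suggests.

```python
from typing import Any, Dict


def as_plain_dict(obj: Any) -> Dict[str, Any]:
    """Hypothetical helper: return obj as a plain dict.

    Dicts pass through unchanged; anything with a callable to_dict()
    (assumed here to be how schema objects behave) is converted once.
    """
    if isinstance(obj, dict):
        return obj
    to_dict = getattr(obj, "to_dict", None)
    if callable(to_dict):
        return to_dict()
    raise TypeError(f"cannot normalize {type(obj).__name__} to a dict")


class FakeFoldTransform:
    """Stand-in for a schema object; it only mimics to_dict()."""

    def __init__(self, **kwds: Any) -> None:
        self._kwds = kwds

    def to_dict(self) -> Dict[str, Any]:
        return dict(self._kwds)


# After normalization, the same dict API works for both kinds of input.
for source in ({"fold": ["ColA"]}, FakeFoldTransform(fold=["ColA"])):
    spec = as_plain_dict(source)
    print(spec.get("as", ("key", "value")))
```

Normalizing once at the entry point means the rest of a function can rely on dict methods, rather than branching on the input type at every attribute lookup.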

Are you interested in working on this?

So looking at this, we've already pre-converted the input to a dict, so just using transform.get() directly should be sufficient. This was not caught because there is no test of the fold transform (I'm certain I wrote the test when I wrote the code, but I'm not sure what happened to it).

I don't think there's much else to do here. :o)

TODO list:

  • fix the bug
  • add test coverage for fold transform
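
Both items might look roughly like the sketch below. It is not the code that was committed: fold_dict is a hypothetical standalone function that copies the four lines visible in the traceback (with get in place of _get), and the final melt call is an assumption, since the rest of visit_fold is not shown in this thread. The test is a plain function that could be collected by pytest.

```python
import pandas as pd


def fold_dict(transform: dict, df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical standalone version of the fold logic for a dict spec."""
    fold = transform["fold"]
    # The fix: a plain dict supports get(), not _get().
    var_name, value_name = transform.get("as", ("key", "value"))
    value_vars = [c for c in df.columns if c in fold]
    id_vars = [c for c in df.columns if c not in fold]
    # Assumption: the folded columns are reshaped to long form with melt.
    return df.melt(
        id_vars=id_vars,
        value_vars=value_vars,
        var_name=var_name,
        value_name=value_name,
    )


def test_fold_named_and_default_columns():
    df = pd.DataFrame({"ColA": ["w", "q"], "ColB": ["r", "e"], "id": [1, 2]})

    # Explicit output names, as in the chart from the original report.
    out = fold_dict({"fold": ["ColA", "ColB"], "as": ["column", "value"]}, df)
    assert list(out.columns) == ["id", "column", "value"]
    assert out["column"].tolist() == ["ColA", "ColA", "ColB", "ColB"]
    assert out["value"].tolist() == ["w", "q", "r", "e"]

    # With "as" omitted, the default names ("key", "value") apply.
    out = fold_dict({"fold": ["ColA", "ColB"]}, df)
    assert list(out.columns) == ["id", "key", "value"]


test_fold_named_and_default_columns()
```

Covering both the explicit and the default "as" names exercises the exact line that raised the AttributeError.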

Thank you for your offer. Unfortunately, I didn't know enough about tests to be helpful, sorry.
Now that I can look at your solution, it will help me next time.