matthewwardrop/formulaic

output="sparse" requires df.index=row_number otherwise `ValueError: row index exceeds matrix dimensions`

seanv507 opened this issue · 2 comments

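import pandas as pd
import formulaic

# Single-row frame whose index label (47870) differs from its row number (0)
df = pd.DataFrame({"a": ["41544"], "y": [0]})
df.index = [47870]
# On 0.3.4 this raises `ValueError: row index exceeds matrix dimensions`
y, X = formulaic.model_matrix(
    "y ~ a",
    df,
    ensure_full_rank=False,
    output="sparse",
)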

Using formulaic 0.3.4.

Either removing `output="sparse"` or not setting `df.index = [47870]` (i.e. keeping the default index) makes the error go away.
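Based on that observation, a possible workaround on the affected version is to give the frame a plain row-number index before building the matrix and to keep the original labels on the side. This is only a sketch: `reset_for_sparse` is a hypothetical helper, not part of formulaic, and it assumes that a default `RangeIndex` is enough to avoid the error, as the report suggests.

```python
import pandas as pd


def reset_for_sparse(df):
    """Return a copy of df with a 0..n-1 index, plus the original index.

    Hypothetical helper (not part of formulaic): the original index is
    returned so the labels can be re-associated with the rows afterwards.
    """
    original_index = df.index
    df_reset = df.reset_index(drop=True)  # default RangeIndex: 0, 1, 2, ...
    return df_reset, original_index


df = pd.DataFrame({"a": ["41544"], "y": [0]})
df.index = [47870]

df_reset, original_index = reset_for_sparse(df)

# df_reset can then be passed to the call from the report, e.g.
#   formulaic.model_matrix("y ~ a", df_reset,
#                          ensure_full_rank=False, output="sparse")
# Row i of the sparse result corresponds to original_index[i].
```

Since a sparse matrix carries no index of its own, holding on to `original_index` is what lets rows be mapped back to the original labels.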

Thanks for reporting!

@seanv507 Sorry for the delay! 0.4.0 has now been released :).