NameError: name 'filter_by' is not defined
Use mask() instead:
from dfply import *
# "diamonds" is a DataFrame supplied by dfply for testing purposes
diamonds >> mask(X.color == 'E')
This is one of the few instances where the Python API is different to the R API. The reason is a clash with an existing Python API in a different package.
I have been using dfply for years, and it works perfectly every time.
It looks like you are using the wrong syntax.
- Try some of the examples in the dfply readme at: https://github.com/kieferk/dfply
- Try upgrading from Python 3.0 to Python 3.6.
- Try this example:
from dfply import *
# diamonds is a built-in DataFrame. Prints the first three rows.
diamonds >> head(3)
# Demo of different types of select: by position (1), by symbolic name (X.price), and by a list of names ('x', 'y'). Prints the first two rows of the selected columns.
diamonds >> select(1, X.price, ['x', 'y']) >> head(2)
Years! Has it really been around that long? Crazy.
If none of the functions work for you, some other strange issue is going on. And for the record, yes, as @sharpe5 says, filter is the name of a built-in function in Python, so I switched the name to mask.
The reason it's not working is that the dfply version in the pip repository is from Aug 2017, whereas the filter_by alias was added only five months ago.
Therefore, the package on pip should probably be updated.
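Until then, a rough way to check which build you have and to fall back to mask when filter_by is missing might look like the sketch below. It uses only the standard library and assumes Python 3.8+ for importlib.metadata. The helper names installed_version and pick_filter_verb are made up for illustration and are not part of dfply.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def pick_filter_verb(module):
    """Return module.filter_by if this build exposes it, otherwise module.mask.

    Hypothetical helper: it only inspects attributes on the module object passed in.
    """
    verb = getattr(module, "filter_by", None)
    if verb is None:
        # Older builds (such as the Aug 2017 one on pip) only provide mask.
        verb = getattr(module, "mask")
    return verb


if __name__ == "__main__":
    # Shows the version pip installed, or None when dfply is absent.
    print("dfply version:", installed_version("dfply"))
```

With dfply installed, something like `import dfply; row_filter = pick_filter_verb(dfply)` followed by `diamonds >> row_filter(X.color == 'E')` should then work on either build.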
Can we close this issue for now?