de_pyspark_tips = Data Engineer PySpark Tips
First, generate the wheel file for installation with:
python setup.py bdist_wheel
Then import the module and call its transformations:
import de_pyspark_tips.df_transformations as ET
castedDF = ET.cast_columns_types_by_schema(df, list_schema, True)
filterDF = ET.choose_last_row_modify_by_ids(castedDF, ['id'], ['data_transaction'])
- Python 3.7
- Java 1.8
- Spark 3.2
- PySpark >=3.1,<3.3
pip install "pyspark >=3.2,<3.3"
pytest -v -s