An open-source Python library for non-linear piecewise symbolic regression based on genetic programming.
- Free software: MIT license
- Documentation: https://pstree.readthedocs.io.
Piece-wise non-linear regression is a long-standing problem in machine learning. Without prior information, it is extremely difficult for users to determine the correct partition scheme and the non-linear model to fit within each partition. To address this issue, we proposed the piece-wise non-linear regression tree (PS-Tree), an automated piece-wise non-linear regression method that combines decision trees with genetic programming. With this framework, our method can produce an explainable model with high accuracy in a short period of time.
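The partition-then-fit idea can be illustrated with scikit-learn alone. The sketch below is not the PS-Tree algorithm: it uses a shallow decision tree to partition the input space and fits an ordinary linear model in each leaf, whereas PS-Tree evolves non-linear features with genetic programming. The helper names `fit_piecewise` and `predict_piecewise` and the toy data are hypothetical and exist only for this illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor


def fit_piecewise(X, y, max_leaf_nodes=4):
    """Partition the inputs with a shallow tree, then fit one linear model per leaf."""
    tree = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes, random_state=0)
    tree.fit(X, y)
    leaf_ids = tree.apply(X)  # index of the leaf each training sample falls into
    models = {}
    for leaf in np.unique(leaf_ids):
        mask = leaf_ids == leaf
        models[leaf] = LinearRegression().fit(X[mask], y[mask])
    return tree, models


def predict_piecewise(tree, models, X):
    """Route each sample to its leaf and predict with that leaf's local model."""
    leaf_ids = tree.apply(X)
    y_pred = np.empty(X.shape[0])
    for leaf, model in models.items():
        mask = leaf_ids == leaf
        if mask.any():
            y_pred[mask] = model.predict(X[mask])
    return y_pred


# Toy target (made up for this sketch): two linear pieces with a jump at x = 0.
rng = np.random.default_rng(0)
X_demo = rng.uniform(-1, 1, size=(200, 1))
x = X_demo[:, 0]
y_demo = np.where(x < 0, 2 * x + 1, -x + 5)

tree, models = fit_piecewise(X_demo, y_demo, max_leaf_nodes=2)
y_hat = predict_piecewise(tree, models, X_demo)
print("number of pieces:", len(models))
print("max abs error:", np.max(np.abs(y_hat - y_demo)))
```

A single global linear model cannot fit this target, while two local models recover it. Choosing the partition and the local model form automatically is the part PS-Tree is designed to handle.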
pip install -U pstree
- A fully automated piece-wise non-linear regression tool
- A fast genetic programming based symbolic regression tool
An example of usage:
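from sklearn.datasets import load_diabetes
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
# Assumption: both classes are exposed by pstree.cluster_gp_sklearn; adjust if your version differs.
from pstree.cluster_gp_sklearn import PSTreeRegressor, GPRegressor

# Hold out 20% of the diabetes dataset for testing.
X, y = load_diabetes(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
r = PSTreeRegressor(regr_class=GPRegressor, tree_class=DecisionTreeRegressor,
                    height_limit=6, n_pop=25, n_gen=100,
                    basic_primitive='optimal', size_objective=True)
r.fit(x_train, y_train)
# Report the test R^2 score, then print the fitted model.
print(r2_score(y_test, r.predict(x_test)))
print(r.model())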
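To put the printed R² score in context, it can help to score simple baselines on the same split. The following is a minimal sketch using only scikit-learn; the choice of baselines and the `max_depth=3` setting are assumptions made for illustration, not part of PS-Tree.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Same dataset and split as in the usage example.
X, y = load_diabetes(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Illustrative baselines: a global linear model and a shallow piecewise-constant tree.
baselines = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(max_depth=3, random_state=0),
}

baseline_scores = {}
for name, model in baselines.items():
    model.fit(x_train, y_train)
    baseline_scores[name] = r2_score(y_test, model.predict(x_test))
    print(f"{name}: R2 = {baseline_scores[name]:.3f}")
```

Because the split uses the same `random_state`, these scores are directly comparable with the one printed by the PS-Tree example.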
Experimental results on SRBench:
@article{zhang2022ps,
title={PS-Tree: A piecewise symbolic regression tree},
author={Zhang, Hengzhe and Zhou, Aimin and Qian, Hong and Zhang, Hu},
journal={Swarm and Evolutionary Computation},
volume={71},
pages={101061},
year={2022},
publisher={Elsevier}
}
- I would like to thank Qi-Hao Huang from Guangzhou University for pointing out that "minimize" in formula (4) of the paper should be "maximize", which is what the code implements (https://github.com/hengzhe-zhang/PS-Tree/blob/master/pstree/cluster_gp_sklearn.py#L320-L346).
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.