Plug-in based feature engineering operations that transform raw data into new data that better represents the underlying features, so that they improve the performance of a predictive model.
The package implements modular components for feature engineering and can be extended by installing plugins. There are three types of plugins:
- Input plugins: load the data to be processed
- Operations plugins: perform feature engineering operations on loaded data
- Output plugins: save the results of the feature engineering operations
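The three plugin roles can be pictured as stages of a small pipeline. The sketch below is only an illustration under stated assumptions: the class names `ListInput`, `DiffOperation`, and `ListOutput`, and the method names `load`, `process`, and `store`, are hypothetical and are not taken from the feature-eng source.

```python
from typing import List

Row = List[float]


class ListInput:
    """Hypothetical input plugin: serves an in-memory dataset."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows

    def load(self) -> List[Row]:
        # Copy the rows so later stages cannot mutate the source data.
        return [list(r) for r in self.rows]


class DiffOperation:
    """Hypothetical operations plugin: appends the tick-to-tick difference of column 0."""

    def process(self, data: List[Row]) -> List[Row]:
        out = []
        prev = None
        for row in data:
            diff = 0.0 if prev is None else row[0] - prev
            out.append(row + [diff])
            prev = row[0]
        return out


class ListOutput:
    """Hypothetical output plugin: keeps the results in memory."""

    def __init__(self) -> None:
        self.saved: List[Row] = []

    def store(self, data: List[Row]) -> None:
        self.saved = data


def run_pipeline(inp: ListInput, op: DiffOperation, out: ListOutput) -> None:
    # Input plugin -> operations plugin -> output plugin.
    out.store(op.process(inp.load()))


sink = ListOutput()
run_pipeline(ListInput([[1.0], [3.0], [6.0]]), DiffOperation(), sink)
print(sink.saved)
```

Keeping the stages separate is what would let one stage be swapped (for example, a different output format) without touching the other two.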
The package includes some pre-installed configurable plugins:
- Heuristic training signal generator
- MSSA decomposer
- MSSA predictor
- CSV file input and output plugins
Usable both from the command line and as a library of class methods.
To install the package via pip, use the following command:
pip install -i https://test.pypi.org/simple/ feature-eng
Alternatively, the package can be installed by cloning the GitHub repo and installing it manually, as in the following instructions.
TODO: Install kieferk/pymssa via github clone and python setup.py install
- Clone the GitHub repo:
git clone https://github.com/harveybc/feature-eng
- Change to the repo folder:
cd feature-eng
- Install requirements.
pip install -r requirements.txt
- Install the Python package (also installs the feature_eng console command)
python setup.py install
- Add the repo folder to the environment variable PYTHONPATH
- (Optional) Perform tests
python setup.py test
- (Optional) Generate Sphinx Documentation
python setup.py docs
- Clone pymssa from GitHub:
git clone https://github.com/harveybc/pymssa
- Change to the pymssa directory:
cd pymssa
- Install pymssa:
python setup.py install
feature_eng is implemented as a console command:
feature_eng --help
- --list_plugins: Shows a list of available plugins.
- --core_plugin <ops_plugin_name>: Feature engineering core operations plugin to process an input dataset.
- --input_plugin <input_plugin_name>: Input dataset importing plugin. Defaults to csv_input.
- --output_plugin <output_plugin_name>: Output dataset exporting plugin. Defaults to csv_output.
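As a rough picture of how the options above fit together, the following sketch rebuilds them with `argparse`. It is not the actual feature_eng parser: only the option names and the two defaults come from the list above, while the help strings and the choice of `argparse` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the documented options (not the real one)."""
    parser = argparse.ArgumentParser(prog="feature_eng")
    parser.add_argument("--list_plugins", action="store_true",
                        help="Show a list of available plugins.")
    parser.add_argument("--core_plugin", metavar="<ops_plugin_name>",
                        help="Core operations plugin used to process the input dataset.")
    parser.add_argument("--input_plugin", metavar="<input_plugin_name>",
                        default="csv_input",
                        help="Input dataset importing plugin.")
    parser.add_argument("--output_plugin", metavar="<output_plugin_name>",
                        default="csv_output",
                        help="Output dataset exporting plugin.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--core_plugin", "heuristic_ts"])
print(args.core_plugin, args.input_plugin, args.output_plugin)
```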
The following examples show both the class method and command line uses for one module. For examples of other plugins, please see the specific module's documentation.
feature_eng --list_plugins
feature_eng --core_plugin heuristic_ts --input_file "tests/data/test_input.csv"
The following example shows how to configure and execute the core plugin.
from feature_eng.feature_eng import FeatureEng
# configure parameters (same variable names as command-line parameters)
class Conf:
    def __init__(self):
        self.core_plugin = "heuristic_ts"
        self.input_file = "tests/data/test_input.csv"
# initialize instance of the Conf configuration class
conf = Conf()
# initialize and execute the core plugin, loading the dataset with the default feature_eng
# input plugin (load_csv), and saving the results using the default output plugin (store_csv).
fe = FeatureEng(conf)
All the plugin modules and their CLI commands are installed with the feature-eng package. The following sections describe each module briefly and link to each module's basic documentation.
Additional detailed Sphinx documentation for all modules can be generated in HTML format with the optional step 7 of the installation process. It documents the classes and methods of all modules in the feature-eng package.
Generates an ideal training signal for trading: the buy signal is EMA_fast shifted forward a number of ticks minus the current EMA_slow.
See heuristic_ts Readme for detailed description and usage instructions.
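One way to read the description above is `signal[t] = EMA_fast[t + forward] - EMA_slow[t]`. The sketch below implements that reading in plain Python. It is an interpretation, not the plugin's code: the function names, the parameter names `fast`, `slow`, and `forward`, and the smoothing factor `2 / (period + 1)` are all assumptions.

```python
from typing import List


def ema(values: List[float], period: int) -> List[float]:
    """Exponential moving average of a non-empty series, seeded with its first value."""
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1.0 - alpha) * out[-1])
    return out


def heuristic_signal(close: List[float], fast: int, slow: int,
                     forward: int) -> List[float]:
    """Future fast EMA minus current slow EMA, per tick."""
    ema_fast = ema(close, fast)
    ema_slow = ema(close, slow)
    # The last `forward` ticks have no future EMA_fast value, so they are dropped.
    return [ema_fast[t + forward] - ema_slow[t]
            for t in range(len(close) - forward)]


prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
signal = heuristic_signal(prices, fast=2, slow=4, forward=2)
print(signal)
```

Because the signal looks ahead by `forward` ticks, it can only be computed on historical data, which is why it serves as a training target rather than a live indicator.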
Performs MSSA decomposition and saves an output dataset containing either a configurable number of components per feature or the sum of a configurable number of components.
See MSSA Decomposer Readme for detailed description and usage instructions.
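The two output modes mentioned above (separate components or their sum) can be illustrated without running MSSA at all. The sketch below assumes the components of one feature are already available as lists; the function name `select_components`, its parameters, and the sample numbers are hypothetical and are not real MSSA output.

```python
from typing import List

Series = List[float]


def select_components(components: List[Series], n: int,
                      as_sum: bool) -> List[Series]:
    """Keep the first n components of one feature, or collapse them into their sum."""
    kept = components[:n]
    if not as_sum:
        return kept
    # Sum the kept components tick by tick into a single series.
    return [[sum(vals) for vals in zip(*kept)]]


# Made-up components for a single feature, for illustration only.
comps = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5], [0.1, -0.1, 0.1]]
separate = select_components(comps, 2, as_sum=False)
summed = select_components(comps, 2, as_sum=True)
print(separate)
print(summed)
```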
Performs MSSA prediction for a configurable number of forward ticks and saves an output dataset containing the prediction for a configurable number of channels or their sum.
See MSSA Predictor Readme for detailed description and usage instructions.
There are two ways to create a plugin. The first installs the plugin from an external Python package using setuptools and is useful for testing your plugins. The second adds a new pre-installed plugin to the feature-eng package by making a pull request to my repo so I can review and merge it. Both methods are described in the following sections.
The following procedure creates a plugin as a Python package with setuptools, installs it, verifies that it is installed, and uses it.
- Create a new package with the same directory structure as the standardizer plugin example
- Edit the setup.py or setup.cfg and add your package name as a feature_eng plugin (with a corresponding plugin name) in the entry_points section as follows:
setup( ... entry_points={'feature_eng.plugins': '<PLUGIN_NAME> = <YOUR_PACKAGE_NAME>'}, ... )
- Install your package as usual
python setup.py install
- Verify that your plugin was registered
feature_eng --list_plugins
Check that <PLUGIN_NAME> appears in the list of installed plugins.
- Use your newly installed plugin
feature_eng --core_plugin <PLUGIN_NAME> --plugin_option1 --plugin_option2 ...
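To check the registration step from Python rather than from the CLI, the entry points in the `feature_eng.plugins` group can be listed with the standard library. This is a sketch under assumptions: the helper name `list_plugin_names` is hypothetical, and whether feature_eng itself discovers plugins through `importlib.metadata` or through another mechanism is not stated here.

```python
import sys
from importlib import metadata
from typing import List


def list_plugin_names(group: str = "feature_eng.plugins") -> List[str]:
    """Return the sorted names of the entry points registered under `group`."""
    if sys.version_info >= (3, 10):
        eps = metadata.entry_points(group=group)
    else:
        # Before Python 3.10, entry_points() returns a dict keyed by group name.
        eps = metadata.entry_points().get(group, [])
    return sorted(ep.name for ep in eps)


# Prints an empty list when no package has registered a plugin in this group.
print(list_plugin_names())
```

After installing a plugin package, `<PLUGIN_NAME>` from its `entry_points` declaration should appear in the returned list.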
The following procedure describes how to contribute to the feature_eng repository by creating a new plugin to be included among the pre-installed plugins.
- Fork the feature_eng repository via the GitHub homepage
- Clone your fork into a local directory using GitHub Desktop or the command line
- Create a new branch named after the new plugin using GitHub Desktop and select it
- Change to the feature_eng fork directory
- Create the new module implementation inside the plugins directory, following the structure of the existing plugins
- Create the new module tests inside the tests directory, following the structure of the existing tests
- Make a commit and push to save your changes to GitHub
- Make a Pull Request to the master branch of my feature_eng repo so I can review the changes and merge them with my existing code.
More detailed collaboration instructions are coming soon.