zoofs (Zoo Feature Selection)
zoofs is a Python library for performing feature selection using a variety of nature-inspired wrapper algorithms. The algorithms range from swarm intelligence to physics-based to evolutionary.
It's an easy-to-use, flexible, and powerful tool for reducing your feature set.
Installation
Using pip
Use the package manager to install zoofs.
pip install zoofs
Available Algorithms
| Algorithm Name | Class Name | Description |
|---|---|---|
| Particle Swarm Algorithm | ParticleSwarmOptimization | Utilizes swarm behaviour |
| Grey Wolf Algorithm | GreyWolfOptimization | Utilizes wolf hunting behaviour |
| Dragon Fly Algorithm | DragonFlyOptimization | Utilizes dragonfly swarm behaviour |
| Genetic Algorithm | GeneticOptimization | Utilizes genetic mutation behaviour |
| Gravitational Algorithm | GravitationalOptimization | Utilizes Newton's gravitational behaviour |
Usage
Define your own objective function for optimization !
```python
from sklearn.metrics import log_loss

# define your own objective function: make sure the function receives four
# parameters besides the model, fits your model and returns the objective value!
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm!
from zoofs import ParticleSwarmOptimization

# create object of algorithm
algo_object = ParticleSwarmOptimization(objective_function_topass, n_iteration=20,
                                        population_size=20, minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)

# plot your results
algo_object.plot_history()
```
Suggestions for Usage
Since the available algorithms are wrapper algorithms, it is better to use ML models that train quickly, e.g. LightGBM or CatBoost.
Choose a sufficiently large 'population_size', as it determines the extent of the algorithm's exploration and exploitation.
Ensure that your ML model has its hyperparameters optimized before passing it to zoofs algorithms.
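As a mental model, every wrapper algorithm above repeatedly evaluates the objective on candidate feature subsets and keeps the best subset found. A stdlib-only sketch of that loop, with random search standing in for the swarm logic (the toy objective is an assumption for illustration):

```python
import random

random.seed(0)

def objective(features):
    # Hypothetical stand-in for a model-based objective (lower is better):
    # heavily penalize missing the informative features 0 and 2,
    # and lightly penalize larger subsets.
    signal = {0, 2}
    missed = len(signal - set(features))
    return missed * 10 + len(features)

best_score, best_subset = float("inf"), None
for _ in range(200):  # roughly population_size * n_iteration evaluations
    subset = [i for i in range(5) if random.random() < 0.5]
    score = objective(subset)
    if score < best_score:
        best_score, best_subset = score, subset

print(sorted(best_subset))  # a subset containing the informative features 0 and 2
```

The real algorithms differ in how they propose the next candidate subsets (swarm positions, wolf packs, genes, masses), but the evaluate-and-keep-the-best loop is the same, which is why a fast-training model matters so much.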
(Figure: objective score plot across iterations)
Algorithms
Particle Swarm Algorithm
class zoofs.ParticleSwarmOptimization(objective_function, n_iteration=50, population_size=50, minimize=True, c1=2, c2=2, w=0.9)
Parameters
objective_function : user-defined function with the signature 'func(model, X_train, y_train, X_valid, y_valid)'.
The function must return a value to be minimized/maximized.
n_iteration : int, default=50
Number of times the algorithm will run
population_size : int, default=50
Total size of the population
minimize : bool, default=True
Defines whether the objective value is to be minimized or maximized
c1 : float, default=2
Cognitive acceleration coefficient
c2 : float, default=2
Social acceleration coefficient
w : float, default=0.9
Inertia weight
X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
Training input samples to be used for the machine learning model
y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,)
The target values (class labels in classification, real numbers in regression).
X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
Validation input samples
y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,)
The validation target values.
verbose : bool, default=True
Print results for iterations
Returns
best_feature_list : array-like
Final best set of features
plot_history()
Plot results across iterations
Example
```python
from sklearn.metrics import log_loss

# define your own objective function: make sure the function receives four
# parameters besides the model, fits your model and returns the objective value!
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm!
from zoofs import ParticleSwarmOptimization

# create object of algorithm
algo_object = ParticleSwarmOptimization(objective_function_topass, n_iteration=20,
                                        population_size=20, minimize=True,
                                        c1=2, c2=2, w=0.9)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)

# plot your results
algo_object.plot_history()
```
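The c1, c2, and w defaults above follow the standard particle swarm velocity update (inertia plus cognitive and social pulls). A continuous 1-D sketch of that update, illustrative only (zoofs itself searches over binary feature masks):

```python
import random

random.seed(1)

def pso_step(x, v, pbest, gbest, w=0.9, c1=2, c2=2):
    # standard PSO velocity update: inertia + cognitive pull + social pull
    r1, r2 = random.random(), random.random()
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    return x + v, v

# minimize f(x) = x**2 with five 1-D particles
f = lambda x: x * x
xs = [random.uniform(-10, 10) for _ in range(5)]
vs = [0.0] * 5
pbests = xs[:]
gbest = min(pbests, key=f)
start_best = f(gbest)

for _ in range(30):
    for i in range(5):
        xs[i], vs[i] = pso_step(xs[i], vs[i], pbests[i], gbest)
        if f(xs[i]) < f(pbests[i]):
            pbests[i] = xs[i]
    gbest = min(pbests, key=f)

print(f(gbest) <= start_best)  # the best score found never gets worse: True
```

A larger w favors exploration (particles keep their momentum), while larger c1/c2 pull particles harder toward known good positions; this is the exploration/exploitation trade-off the usage suggestions refer to.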
Grey Wolf Algorithm
class zoofs.GreyWolfOptimization(objective_function, n_iteration=50, population_size=50, minimize=True)
Parameters
objective_function : user-defined function with the signature 'func(model, X_train, y_train, X_valid, y_valid)'.
The function must return a value to be minimized/maximized.
n_iteration : int, default=50
Number of times the algorithm will run
population_size : int, default=50
Total size of the population
minimize : bool, default=True
Defines whether the objective value is to be minimized or maximized
X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
Training input samples to be used for the machine learning model
y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,)
The target values (class labels in classification, real numbers in regression).
X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
Validation input samples
y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,)
The validation target values.
method : {1, 2}, default=1
Choose between the two methods of grey wolf optimization
verbose : bool, default=True
Print results for iterations
Returns
best_feature_list : array-like
Final best set of features
plot_history()
Plot results across iterations
Example
```python
from sklearn.metrics import log_loss

# define your own objective function: make sure the function receives four
# parameters besides the model, fits your model and returns the objective value!
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm!
from zoofs import GreyWolfOptimization

# create object of algorithm
algo_object = GreyWolfOptimization(objective_function_topass, n_iteration=20,
                                   population_size=20, minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, method=1, verbose=True)

# plot your results
algo_object.plot_history()
```
Dragon Fly Algorithm
class zoofs.DragonFlyOptimization(objective_function, n_iteration=50, population_size=50, minimize=True)
Parameters
objective_function : user-defined function with the signature 'func(model, X_train, y_train, X_valid, y_valid)'.
The function must return a value to be minimized/maximized.
n_iteration : int, default=50
Number of times the algorithm will run
population_size : int, default=50
Total size of the population
minimize : bool, default=True
Defines whether the objective value is to be minimized or maximized
method : str
Choose between the three methods of Dragon Fly optimization
verbose : bool, default=True
Print results for iterations
Returns
best_feature_list : array-like
Final best set of features
plot_history()
Plot results across iterations
Example
```python
from sklearn.metrics import log_loss

# define your own objective function: make sure the function receives four
# parameters besides the model, fits your model and returns the objective value!
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm!
from zoofs import DragonFlyOptimization

# create object of algorithm
algo_object = DragonFlyOptimization(objective_function_topass, n_iteration=20,
                                    population_size=20, minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, method='sinusoidal', verbose=True)

# plot your results
algo_object.plot_history()
```
Genetic Algorithm
class zoofs.GeneticOptimization(objective_function, n_iteration=20, population_size=20, selective_pressure=2, elitism=2, mutation_rate=0.05, minimize=True)
Parameters
objective_function : user-defined function with the signature 'func(model, X_train, y_train, X_valid, y_valid)'.
The function must return a value to be minimized/maximized.
n_iteration : int, default=20
Number of times the algorithm will run
population_size : int, default=20
Total size of the population
selective_pressure : int, default=2
Measure of reproductive opportunities for each organism in the population
elitism : int, default=2
Number of top individuals to be considered as elites
mutation_rate : float, default=0.05
Rate of mutation in the population's genes
minimize : bool, default=True
Defines whether the objective value is to be minimized or maximized
X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
Training input samples to be used for the machine learning model
y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,)
The target values (class labels in classification, real numbers in regression).
X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
Validation input samples
y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,)
The validation target values.
verbose : bool, default=True
Print results for iterations
Returns
best_feature_list : array-like
Final best set of features
plot_history()
Plot results across iterations
Example
```python
from sklearn.metrics import log_loss

# define your own objective function: make sure the function receives four
# parameters besides the model, fits your model and returns the objective value!
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm!
from zoofs import GeneticOptimization

# create object of algorithm
algo_object = GeneticOptimization(objective_function_topass, n_iteration=20,
                                  population_size=20, selective_pressure=2,
                                  elitism=2, mutation_rate=0.05, minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)

# plot your results
algo_object.plot_history()
```
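The selective_pressure, elitism, and mutation_rate knobs can be illustrated with a small stand-alone sketch on feature-mask "genes" (stdlib only, not zoofs internals; the toy fitness function and the rank-power parent selection are assumptions for illustration):

```python
import random

random.seed(42)

N_FEATURES, POP, ELITISM, MUTATION_RATE, PRESSURE = 8, 10, 2, 0.05, 2

def fitness(mask):
    # Toy objective (lower is better): features 1 and 4 are informative,
    # every extra selected feature costs a little.
    return 10 * sum(1 for i in (1, 4) if not mask[i]) + sum(mask)

pop = [[random.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(POP)]
start_best = min(map(fitness, pop))

for _ in range(40):
    pop.sort(key=fitness)                            # best individuals first
    elites = [m[:] for m in pop[:ELITISM]]           # elitism: carry the best over unchanged
    children = []
    while len(children) < POP - ELITISM:
        # selective pressure: a power of a uniform draw biases picks toward top ranks
        pa = pop[int(POP * random.random() ** PRESSURE)]
        pb = pop[int(POP * random.random() ** PRESSURE)]
        cut = random.randrange(1, N_FEATURES)        # one-point crossover
        child = pa[:cut] + pb[cut:]
        # mutation: flip each bit with probability MUTATION_RATE
        child = [b ^ (random.random() < MUTATION_RATE) for b in child]
        children.append(child)
    pop = elites + children

best = min(pop, key=fitness)
print([i for i, b in enumerate(best) if b])  # indices of the selected features
```

Because the elites are carried over untouched, the best fitness never regresses between generations, while mutation keeps injecting diversity so the search does not stall.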
Gravitational Algorithm
class zoofs.GravitationalOptimization(objective_function, n_iteration=50, population_size=50, g0=100, eps=0.5, minimize=True)
Parameters
objective_function : user-defined function with the signature 'func(model, X_train, y_train, X_valid, y_valid)'.
The function must return a value to be minimized/maximized.
n_iteration : int, default=50
Number of times the algorithm will run
population_size : int, default=50
Total size of the population
g0 : float, default=100
Gravitational strength constant
eps : float, default=0.5
Distance constant
minimize : bool, default=True
Defines whether the objective value is to be minimized or maximized
X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
Training input samples to be used for the machine learning model
y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,)
The target values (class labels in classification, real numbers in regression).
X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
Validation input samples
y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,)
The validation target values.
verbose : bool, default=True
Print results for iterations
Returns
best_feature_list : array-like
Final best set of features
plot_history()
Plot results across iterations
Example
```python
from sklearn.metrics import log_loss

# define your own objective function: make sure the function receives four
# parameters besides the model, fits your model and returns the objective value!
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm!
from zoofs import GravitationalOptimization

# create object of algorithm
algo_object = GravitationalOptimization(objective_function_topass, n_iteration=50,
                                        population_size=50, g0=100, eps=0.5,
                                        minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)

# plot your results
algo_object.plot_history()
```
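g0 and eps enter the Newton-style attraction that gravitational search algorithms compute between agents, with eps keeping the force finite when two agents share a position. A minimal sketch of that role (illustrative; not necessarily zoofs's exact formula):

```python
def grav_force(m1, m2, r, g0=100, eps=0.5):
    """Newton-style attraction between two agents of masses m1 and m2.

    eps keeps the denominator nonzero, so the force stays finite
    even when the distance r between the agents is zero.
    """
    return g0 * m1 * m2 / (r + eps)

print(grav_force(1.0, 1.0, 0.0))  # finite at zero distance: 200.0
print(grav_force(1.0, 1.0, 9.5))  # force decays with distance: 10.0
```

A larger g0 makes agents pull each other harder (faster but coarser convergence), while a larger eps dampens the attraction between nearby agents.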
Support zoofs
The development of zoofs relies completely on contributions.
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.