Primary language: Python. License: MIT.

pypso

pypso is a Python package that provides naive implementations of classic particle swarm optimization algorithms. The purpose of this project is to demonstrate one of the many ways to create a reusable library for scientific computing. This library demonstrates some common development tools and practices for working with Python, including:

  • Version control
  • Unit tests and code coverage
  • Static typing
  • Linting

Other important tools for developing Python packages, such as automatic HTML documentation generation, continuous integration, and continuous deployment, are not discussed here.

Getting Started

To install the package, first clone the repository:

git clone https://github.com/rmill040/pypso.git

Then change into the directory and run the setup.py script:

cd pypso
python setup.py install

Alternatively, install the package in "editable" mode, which lets you edit the source and have the changes take effect in the installed package without reinstalling:

pip install -e .

Note: During development, the editable install will be your best friend.

After installation, running the pypso_version script should print a message similar to the following:

pypso_version
pypso version = 0.0.2 successfully imported!
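
The same check can be approximated from inside Python. The sketch below is an assumption about one way to do it, not the contents of the pypso_version script: it reads the installed package metadata with the standard-library importlib.metadata module (Python 3.8+), and installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pypso")
    print(f"pypso version = {found}" if found else "pypso is not installed")
```

Returning None instead of raising keeps the check usable in scripts that only need to know whether the install step succeeded.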

To see particle swarm optimization applied to machine learning problems, run the examples/simple_ml.py script. Partial output is shown below:

--------------------------------------------------
EXAMPLE 1 - CONTINUOUS PSO WITHOUT CONSTRAINT
--------------------------------------------------
[2019-11-08 19:35:24,612] INFO - initializing swarm
[2019-11-08 19:35:24,762] INFO - new swarm best: 1/100 - 0.016674594366048234
[2019-11-08 19:35:24,894] INFO - new swarm best: 2/100 - 0.013860261085566372
[2019-11-08 19:35:25,026] INFO - new swarm best: 3/100 - 0.011204481792717158
[2019-11-08 19:35:25,164] INFO - new swarm best: 4/100 - 0.009896411394746618
[2019-11-08 19:35:25,296] INFO - new swarm best: 5/100 - 0.008297658686115983
[2019-11-08 19:35:25,557] INFO - new swarm best: 7/100 - 0.007901273717034085
[2019-11-08 19:35:25,950] INFO - new swarm best: 10/100 - 0.007663442735584836
[2019-11-08 19:35:26,475] INFO - new swarm best: 14/100 - 0.007438824586438297
[2019-11-08 19:35:26,611] INFO - new swarm best: 15/100 - 0.007399186089530163
[2019-11-08 19:35:27,001] INFO - new swarm best: 18/100 - 0.007240632101897404
[2019-11-08 19:35:27,284] INFO - new swarm best: 20/100 - 0.007227419269594582
[2019-11-08 19:35:27,802] INFO - new swarm best: 24/100 - 0.007121716611172779
[2019-11-08 19:35:28,200] INFO - new swarm best: 27/100 - 0.007108503778869957
[2019-11-08 19:35:28,459] INFO - new swarm best: 29/100 - 0.007095290946567356
[2019-11-08 19:35:28,721] INFO - optimization converged: 30/100 - stopping criteria below tolerance

Linear solution:
0.23 + -1.9*f1 + -0.19*f2 + -1.4*f3 + 0.24*f4 +
-1.08*f5 + -2.03*f6 + -2.38*f7 + -1.94*f8 + 0.11*f9 +
-0.54*f10 + -1.87*f11 + 2.19*f12 + -1.68*f13 + -0.38*f14 +
-1.05*f15 + -2.39*f16 + 0.98*f17 + -0.35*f18 + 0.9*f19 +
-1.46*f20 + 0.48*f21 + -1.85*f22 + -0.74*f23 + -0.12*f24 +
0.27*f25 + -2.42*f26 + -1.68*f27 + 1.66*f28 + -1.81*f29 +
0.47*f30

Sanity check:
	all weights within bounds? True

Comparison to sklearn:
	sklearn logistic regression AUC = 0.978
	PSO logistic regression AUC     = 0.9929
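
For readers new to the algorithm, the loop behind a log like the one above can be sketched in a few lines. This is a minimal, generic global-best PSO written for illustration under stated assumptions; it is not the pypso API. The function name minimize_pso, its parameters, and the sphere objective are all hypothetical, and the sketch runs a fixed number of iterations rather than stopping on a tolerance.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def minimize_pso(
    objective: Callable[[Sequence[float]], float],
    bounds: Bounds,
    n_particles: int = 20,
    n_iterations: int = 100,
    inertia: float = 0.7,
    cognitive: float = 1.5,
    social: float = 1.5,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimize `objective` inside the box `bounds` (hypothetical helper)."""
    rng = random.Random(seed)
    n_dims = len(bounds)
    # Random starting positions inside the bounds; velocities start at zero.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    velocities = [[0.0] * n_dims for _ in range(n_particles)]
    # Each particle remembers the best position it has visited so far.
    best_positions = [p[:] for p in positions]
    best_scores = [objective(p) for p in positions]
    # The swarm best is the best of all personal bests.
    g = min(range(n_particles), key=best_scores.__getitem__)
    swarm_best, swarm_score = best_positions[g][:], best_scores[g]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Blend momentum, pull toward personal best, pull toward swarm best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (best_positions[i][d] - positions[i][d])
                    + social * r2 * (swarm_best[d] - positions[i][d])
                )
                # Clip so every coordinate stays within its bounds.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            score = objective(positions[i])
            if score < best_scores[i]:
                best_positions[i], best_scores[i] = positions[i][:], score
                if score < swarm_score:
                    swarm_best, swarm_score = positions[i][:], score
    return swarm_best, swarm_score


def sphere(x: Sequence[float]) -> float:
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f = minimize_pso(sphere, bounds=[(-5.0, 5.0)] * 3)
print(best_x, best_f)
```

Clipping positions after each update is one simple way to respect box constraints; it is what makes a check like "all weights within bounds" hold by construction in this sketch.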

Package Structure

The pypso package has the following structure (using the tree command):

pypso
├── LICENSE
├── MANIFEST.in
├── README.md
├── examples
│   └── simple_ml.py
├── pypso
│   ├── __init__.py
│   ├── _version.py
│   ├── base
│   │   ├── __init__.py
│   │   └── _models.py
│   ├── optimizers
│   │   ├── __init__.py
│   │   ├── _bpso.py
│   │   └── _cpso.py
│   ├── tests
│   │   ├── __init__.py
│   │   ├── base
│   │   │   ├── __init__.py
│   │   │   └── test_models.py
│   │   ├── optimizers
│   │   │   ├── __init__.py
│   │   │   ├── test_bpso.py
│   │   │   └── test_cpso.py
│   │   └── utils
│   │       ├── __init__.py
│   │       ├── test_plot.py
│   │       └── test_wrapper.py
│   └── utils
│       ├── __init__.py
│       ├── _plot.py
│       └── _wrapper.py
├── requirements.txt
├── scripts
│   └── pso_version
├── setup.cfg
├── setup.py
└── versioneer.py

Files in the root directory:

  • LICENSE: Selected by user during repository creation and automatically generated by GitHub.
  • MANIFEST.in: Controls which files are included in or excluded from the source distribution (sdist). See this reference for more details.
  • README.md: The main landing page of the GitHub repository. This is what you are reading now.
  • examples: A directory that contains example scripts and notebooks using the library.
  • requirements.txt: Contains dependency information for the library. Entries can pin specific versions of dependencies, e.g., pandas==0.25.
  • scripts: A directory that contains relevant scripts for your library.
  • setup.cfg: A configuration template file providing details on settings for various tools such as linting, static type checking, unit tests, etc.
  • setup.py: The main install file.
  • versioneer.py: Tool for managing a recorded version number in distutils-based Python projects.

Note: for the setup.* files, see the setuptools documentation for more details.

Version Control

Version control is an essential part of software development. Python has a handy tool called versioneer that propagates tagged version numbers (using semantic versioning) throughout your repository without manual effort. To get versioneer up and running, check out the following two files in the repository:

From here, after we commit the code, we can tag the commit

git tag -a vX.Y.Z

This creates an annotated tag named vX.Y.Z (the v prefix stands for version). For details on tagging, see the GitHub documentation.
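
To make the vX.Y.Z convention concrete, here is a small sketch that parses such a tag into its semantic-versioning parts and bumps one of them. It is illustrative only and is not how versioneer works internally; Version, parse_tag, and bump are hypothetical names.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


_TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def parse_tag(tag: str) -> Version:
    """Parse a tag such as 'v0.0.2' into its numeric components."""
    match = _TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a vX.Y.Z tag: {tag!r}")
    return Version(*(int(part) for part in match.groups()))


def bump(version: Version, part: str) -> Version:
    """Increment one component and reset the less significant ones to zero."""
    if part == "major":
        return Version(version.major + 1, 0, 0)
    if part == "minor":
        return Version(version.major, version.minor + 1, 0)
    if part == "patch":
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown part: {part!r}")


current = parse_tag("v0.0.2")
print(bump(current, "patch"))
```

Because Version is a tuple of integers, comparisons order releases numerically, so v0.10.0 sorts after v0.9.1, which plain string comparison would get wrong.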

Unit Tests and Code Coverage

Unit testing checks small, isolated pieces of a codebase independently, so that a large, complex project can be verified part by part. One of the most popular unit testing frameworks for Python is pytest. To run all unit tests along with coverage statistics:

# Ensure you are in the root directory
pytest
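
As an illustration of what pytest collects, the sketch below shows a tiny function and two tests for it. The helper clip_to_bounds is hypothetical and not taken from pypso; in a real project the function would live in the package and the tests in a test_*.py file, which pytest discovers by name.

```python
from typing import List, Sequence, Tuple


def clip_to_bounds(
    position: Sequence[float], bounds: Sequence[Tuple[float, float]]
) -> List[float]:
    """Clamp each coordinate into its (lower, upper) interval (hypothetical helper)."""
    if len(position) != len(bounds):
        raise ValueError("position and bounds must have the same length")
    return [min(hi, max(lo, x)) for x, (lo, hi) in zip(position, bounds)]


# pytest collects functions whose names start with `test_` and treats a
# failing `assert` as a test failure.
def test_clip_leaves_interior_points_unchanged() -> None:
    bounds = [(-1.0, 1.0), (-1.0, 1.0)]
    assert clip_to_bounds([0.5, -0.5], bounds) == [0.5, -0.5]


def test_clip_moves_outliers_to_nearest_bound() -> None:
    bounds = [(-1.0, 1.0), (-1.0, 1.0)]
    assert clip_to_bounds([2.0, -3.0], bounds) == [1.0, -1.0]
```

Each test exercises one behavior, so a failure points directly at the case that broke.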

Static Typing

Static type checking is usually associated with compiled languages, but many dynamic languages now have tools that check types before the code runs. A popular one for Python is mypy. To run static type checking:

# Ensure you are in the root directory
mypy pypso
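
The sketch below shows the kind of annotated code mypy analyzes. The functions best_score and improvement are hypothetical examples, not part of pypso; the comments describe what mypy would object to if the code were written differently.

```python
from typing import List, Optional


def best_score(scores: List[float]) -> Optional[float]:
    """Return the smallest score, or None when there are no scores yet."""
    return min(scores) if scores else None


def improvement(previous: float, scores: List[float]) -> float:
    """Return how much the best score improved on `previous`."""
    best = best_score(scores)
    # `best` has type Optional[float]. Without this check, mypy rejects
    # `previous - best` because `best` might be None.
    if best is None:
        return 0.0
    return previous - best


print(improvement(1.0, [0.5, 0.25]))

# mypy would also reject a call such as improvement("1.0", [0.5]) because a
# str is passed where a float is expected; Python itself would only fail
# when the subtraction runs.
```

The annotations do not change runtime behavior; they only give the checker enough information to catch these mistakes before the code is executed.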

Linting

Linting checks source code against a style guide and flags likely mistakes. Python has many linters; one of the most popular is flake8. To run linting checks:

# Ensure you are in the root directory
flake8 --show-source

For details about warnings and error codes, check out the flake8 documentation.
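
As a rough illustration of what those codes look like in practice, the sketch below keeps a flagged version in comments and shows a cleaned-up equivalent as real code. The mean function is a hypothetical example, and the listed codes are a sample of what flake8 reports for that snippet with default settings, not an exhaustive list.

```python
# A version that flake8 would flag, kept as comments so this block stays clean:
#
#   import os                  # F401: 'os' imported but unused
#   def mean(values):
#       total=sum(values)      # E225: missing whitespace around operator
#       unused = 0             # F841: local variable assigned but never used
#       return total / len(values)
#
# The cleaned-up equivalent:
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    """Return the arithmetic mean of a non-empty sequence."""
    total = sum(values)
    return total / len(values)


print(mean([1.0, 2.0, 3.0]))
```

The E-prefixed codes come from style checks and the F-prefixed codes from checks for likely bugs, which is why a single flake8 run covers both formatting and unused names.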