PyDataStructs

About

The PyDataStructs project aims to be a Python package for various data structures and algorithms (including their parallel implementations).
We are also working on providing a C++ backend via the Python C-API for high-performance use cases.

Why PyDataStructs?

Single package for all your data structures and algorithms
Consistent and Clean Interface - The APIs we have provided are consistent with each other, clean, and easy to use. We make sure of that before adding any new data structure or algorithm.
Well Tested - We thoroughly test our code before making any new addition to PyDataStructs. 99 percent of the lines of our code are already covered by tests.

Installation

If you are using Anaconda/Mamba, you can set up your development environment by executing the following commands:

conda env create --file environment.yml
conda activate pyds-env

You can install the library by running the following command:

python scripts/build/install.py

For development purposes, i.e., if you intend to be a contributor, run:

python scripts/build/develop.py

Make sure you change your working directory to pydatastructs before executing any of the above commands. Also, your Python version should be at least 3.8.
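As a quick sanity check before installing, the version requirement above can be verified from Python itself. This is only a minimal sketch: the helper name python_is_supported and the constant MINIMUM_PYTHON are hypothetical and are not part of PyDataStructs.

```python
import sys

# Minimum version taken from the installation note above.
MINIMUM_PYTHON = (3, 8)


def python_is_supported(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version meets the minimum.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only the major and minor components.
    return tuple(version_info[:2]) >= tuple(minimum)
```

Calling python_is_supported() with no arguments checks the interpreter that is currently running.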

Testing

To test your patch locally, follow the steps given below:

Install pytest-cov. Skip this step if you already have the package.
Run python3 -m pytest --doctest-modules --cov=./ --cov-report=html, then open htmlcov/index.html in your browser to view the coverage report. Try to ensure that your patch does not decrease coverage by more than 1%.
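The coverage guideline above can be expressed as a small check. The sketch below assumes that "1%" means one percentage point of line coverage, and that you read the before and after figures from the HTML report yourself; the function name coverage_drop_ok is hypothetical.

```python
def coverage_drop_ok(baseline_percent, patched_percent, max_drop=1.0):
    """Return True if coverage fell by at most max_drop percentage points.

    Assumption: both arguments are line-coverage percentages (0 to 100),
    measured before and after applying the patch.
    """
    return (baseline_percent - patched_percent) <= max_drop
```

An increase in coverage always passes, since the difference is then negative.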

For a good visualisation of the different data structures and algorithms, refer to the following websites:

You can use the examples given in the following book as tests for your code:

https://opendatastructures.org/ods-python.pdf

Lightweight testing (without benchmarks)

Make sure you have activated the conda environment: pyds-env and your working directory is ../pydatastructs.

In the terminal, run: python -c "from pydatastructs.utils.testing_util import test; test()".

This will run all the test files, except benchmark tests. This should be used if benchmark tests are computationally too heavy to be run on your local machine.

Why do we use Python?

Python is an interpreted language, so programs written in it generally run slower than equivalent C++ programs.
We still chose Python because development moves much faster, and it is much easier to try out different software designs and APIs when prototyping them takes little time.
However, keeping users' needs in mind, we are also working on a C++ backend, which should come together quickly because we only need to translate already tested code rather than write it from scratch.

How to contribute?

Follow the steps given below,

Fork, https://github.com/codezonediitj/pydatastructs/
Execute, git clone https://github.com/codezonediitj/pydatastructs/
Change your working directory to ../pydatastructs.
Execute, git remote add origin_user https://github.com/<your-github-username>/pydatastructs/
Execute, git checkout -b <your-new-branch-for-working>.
Make changes to the code.
Add your name and email to the AUTHORS, if you wish to.
Execute, git add . to stage your changes.
Execute, git commit -m "your-commit-message".
Execute, git push origin_user <your-current-branch>.
Open a pull request (PR).

That's it, 11 easy steps for your first contribution. For future contributions, just follow steps 5 to 11. Before starting work, always check out master and pull the recent changes from the remote origin, and then follow steps 5 to 11.
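The routine for a repeat contribution can be sketched as an ordered list of git invocations. The helper below is purely illustrative and not part of the project: its name and parameters are hypothetical, it assumes the remotes are named origin and origin_user as in the steps above, and it only builds the argument lists without running anything.

```python
def future_contribution_commands(branch, commit_message, remote="origin_user"):
    """Return the git commands for a repeat contribution, in order.

    Each command is an argument list, so the commit message needs no
    shell quoting. The editing step itself happens between the
    'checkout -b' and 'add' commands.
    """
    return [
        ["git", "checkout", "master"],
        ["git", "pull", "origin", "master"],    # sync with the main repository
        ["git", "checkout", "-b", branch],      # new working branch
        ["git", "add", "."],                    # stage your changes
        ["git", "commit", "-m", commit_message],
        ["git", "push", remote, branch],        # push to your fork
    ]
```

Each inner list could be passed to subprocess.run, or simply typed into a terminal by hand.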

See you soon with your first PR.

It is recommended to go through the following links before you start working.

Guidelines

We recommend that you join our Discord channel to discuss anything related to the project.

Please follow the rules and guidelines given below,

Follow the numpydoc docstring guide.
If you are planning to contribute a new data structure, first raise an issue to discuss the API rather than directly opening a PR. Please go through Plan of Action for Adding New Data Structures.
For first-time contributors, we recommend not taking on a complex data structure; rather, start with a beginner or easy one.
We don't assign issues to any individual. Instead, issues are taken on a first-come, first-served basis: anyone willing to work on an issue can comment on the thread to say that they are working on it and then raise a PR for it. Once one contributor has shown interest, later comments claiming the same issue won't be considered.
Any open PR must be updated after it has been reviewed. If it is stalled for more than 4 days, it will be labeled Please take over, meaning that anyone willing to continue that PR can start working on it.
PRs that are not related to the project or don't follow the guidelines will be labeled Could Close, meaning that the PR is not necessary at the moment.
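For reference, a function documented in the numpydoc style mentioned in the guidelines above might look like the following. The function linear_search is a made-up example and not part of the PyDataStructs API; only the docstring layout (Parameters, Returns, Examples sections with underlines) is the point.

```python
def linear_search(items, target):
    """
    Return the index of the first occurrence of a value.

    Parameters
    ----------
    items : list
        Sequence to search.
    target : object
        Value to look for.

    Returns
    -------
    int or None
        Index of the first match, or None if ``target`` is absent.

    Examples
    --------
    >>> linear_search([4, 7, 9], 7)
    1
    >>> linear_search([4, 7, 9], 5) is None
    True
    """
    for index, item in enumerate(items):
        if item == target:
            return index
    return None
```

Because the Examples section uses doctest syntax, it is also exercised by the --doctest-modules option of the pytest command given in the Testing section.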

Your pull requests must follow the rules below to pass the code quality tests:

There should not be any trailing whitespace on any line of code.
Each .py file should end with exactly one newline.
Comparisons involving True, False, and None should be done by reference (using is, is not) and not by value (==, !=).
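To get a feel for these three rules before pushing, they can be approximated with a few lines of Python. This is a rough sketch and not the project's actual code quality test: the function name find_style_issues is hypothetical, and the comparison check is a simple regular-expression heuristic that can also flag matches inside strings or comments.

```python
import re

# Matches "== None", "!= True", and similar value comparisons (heuristic).
_VALUE_COMPARISON = re.compile(r"(==|!=)\s*(True|False|None)\b")


def find_style_issues(source):
    """Return a list of (line_number, message) pairs for a source string."""
    issues = []
    lines = source.split("\n")
    for number, line in enumerate(lines, start=1):
        if line != line.rstrip():
            issues.append((number, "trailing whitespace"))
        if _VALUE_COMPARISON.search(line):
            issues.append((number, "use 'is' / 'is not' for True, False, None"))
    # Exactly one newline at the end: present, but not doubled.
    if not source.endswith("\n") or source.endswith("\n\n"):
        issues.append((len(lines), "file must end with exactly one newline"))
    return issues
```

For example, writing `if node is None:` passes the comparison check, while `if node == None:` is reported.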

Keep contributing!!

Thanks to these wonderful people ✨✨:

codezonediitj/pydatastructs