Deploy recommendation engines with Edge Computing


EnvisEdge
Envision the Edge like never before...

EnvisEdge allows users to simulate an edge computing environment to test their ideas and models before putting them in place on the edge. It takes care of all the complex stuff such as diversity across operating systems, computation power and communication mediums, allowing you to focus on the idea rather than the setup.

EnvisEdge allows researchers, developers and data scientists to experiment, test their hypotheses and produce production-ready code without direct access to edge devices. This creates a path for global research and growth in the domains of federated learning and edge computing.

Key features 🌟

  1. Gives global or remote teams a platform to run and test their systems and models prior to deployment.
  2. Run, train and test FL algorithms and ML models.
  3. Set up an environment of your choice with arbitrary hardware constraints such as RAM, CPU and more.
  4. Experience the edge on the cloud and on your own devices.

Repo Structure 🏢

NimbleEdge/EnvisEdge
├── CONTRIBUTING.md                         <-- Please go through the contributing guidelines before starting 🤓
├── README.md                               <-- You are here 📌
├── datasets                                <-- Sample datasets
├── docs                                    <-- Tutorials and walkthroughs 🧐
├── experiments                             <-- Recommendation models used by our services
├── fedrec                                  <-- The magic happens here 😜
│   ├── communication_interfaces            <-- Modules for communication interfaces, e.g. Kafka
│   ├── data_models                         <-- Data models used for communication, with their serializers and deserializers
│   ├── modules                             <-- All the modules related to transformers, embeddings etc.
│   ├── multiprocessing                     <-- Modules to run parallel worker jobs
│   ├── optimization                        <-- Modules related to torch optimizers, gradient descent etc.
│   ├── python_executors                    <-- Contains worker modules, e.g. trainer and aggregator
│   ├── serialization                       <-- Serialization interfaces for data models
│   ├── user_modules                        <-- Envis modules for wrapping torch modules for users
│   └── utilities                           <-- Helper modules
├── fl_strategies                           <-- Federated learning algorithms for our services.
├── notebooks                               <-- Jupyter Notebook examples
├── scala-core                              <-- Backbone of EnvisEdge
├── scripts                                 <-- Bash scripts for creating and removing Kafka topics
└── tests                                   <-- tests

QuickStart

Update the config file of the model you are going to use (can be found here) with your logging directory:

log_dir:
  PATH: <path to your logging directory>
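A logging path that does not exist or is not writable may only show up as an error once a worker starts, so it can help to check it beforehand. The sketch below is a hypothetical helper (`ensure_log_dir` is not part of EnvisEdge) that creates the directory if needed and confirms that a file can be written to it.

```python
import tempfile
from pathlib import Path


def ensure_log_dir(path: str) -> Path:
    """Create the logging directory if needed and verify it is writable.

    Hypothetical helper for illustration; not part of EnvisEdge.
    """
    log_dir = Path(path).expanduser().resolve()
    log_dir.mkdir(parents=True, exist_ok=True)
    # Creating a throwaway file is the most direct writability check;
    # the file is removed automatically when the context exits.
    with tempfile.TemporaryFile(dir=log_dir):
        pass
    return log_dir


if __name__ == "__main__":
    # Demo against a temporary location so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        print(ensure_log_dir(f"{tmp}/envisedge_logs"))
```

The returned absolute path is what you would paste into the `PATH` field above.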

Download Kafka from here 👈, then start ZooKeeper followed by the Kafka server. Run the following commands from the Kafka directory, each in its own terminal:

bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
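Before creating topics, you may want to confirm that both services are actually listening. The following is a minimal sketch that only checks whether a TCP connection succeeds; it assumes the stock default ports (2181 for ZooKeeper, 9092 for Kafka) and a local install, and `is_port_open` is an illustrative name rather than an EnvisEdge function.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed defaults from zookeeper.properties and server.properties;
    # adjust the ports if you changed those files.
    for name, port in (("ZooKeeper", 2181), ("Kafka", 9092)):
        state = "reachable" if is_port_open("127.0.0.1", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A reachable port only shows that something is listening there; it does not prove the broker is healthy.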

Create Kafka topics for the job executor:

$ cd scripts
$ bash add_topics.sh
Enter path to kafka Directory : <Enter the path to the Kafka directory>
kafka url: <Enter the URL on which Kafka is listening, e.g. 127.0.0.1 if you are running it on localhost>
Creating Topics...

Install the dependencies in a virtual environment:

mkdir env
cd env
virtualenv envisedge
source envisedge/bin/activate
cd ..  # back to the repository root, assumed to contain requirements.txt
pip3 install -r requirements.txt
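If you are unsure whether the environment is active when you install or run the later steps, a quick check from Python can help. This sketch relies only on the standard `sys` module; `in_virtual_environment` is an illustrative name.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter:", sys.executable)
```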

Download the federated dataset:

$ bash download.sh -f
Enter global data path : <Enter the path you want your dataset to be saved>
Enter model : <Enter the config file of the model to update with the dataset path>
Downloading femnist dataset...

Run data preprocessing with preprocess_data.py. Using the downloaded dataset, this step prepares a client_id mapping in the dataset, which is then sent to the Python workers for training the model.

$ python preprocess_data.py --config configs/regression.yml
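To make the idea of a client_id mapping concrete, here is a small sketch that groups sample indices by the client that owns them. It illustrates the concept only: the actual format produced by preprocess_data.py is not shown in this README, and `build_client_mapping` and the toy client names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def build_client_mapping(
    samples: Iterable[Tuple[Hashable, object]],
) -> Dict[Hashable, List[int]]:
    """Group sample indices by the client that owns each sample.

    `samples` yields (client_id, payload) pairs; the payload is ignored.
    Hypothetical helper; not the EnvisEdge implementation.
    """
    mapping: Dict[Hashable, List[int]] = defaultdict(list)
    for index, (client_id, _payload) in enumerate(samples):
        mapping[client_id].append(index)
    return dict(mapping)


if __name__ == "__main__":
    toy = [("writer_a", "img0"), ("writer_b", "img1"), ("writer_a", "img2")]
    print(build_client_mapping(toy))
    # {'writer_a': [0, 2], 'writer_b': [1]}
```

With such a mapping, each worker can be handed only the samples that belong to the client it simulates.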

To start the multiprocessing executor run the following command:

$ python executor.py --config configs/regression.yml

To see how training is done, run the following command:

$ python tests/integration_tests/integration_test.py --config configs/regression.yml
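For orientation, the repository structure lists trainer and aggregator workers. The sketch below shows a generic federated-averaging step of the kind an aggregator typically performs: client weights are averaged in proportion to each client's sample count. It is a plain-Python illustration under that assumption, not EnvisEdge's actual aggregator, and all names in it are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Weights = Dict[str, List[float]]


def federated_average(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its sample count.

    `updates` holds (weights, num_samples) pairs, one per client.
    Generic FedAvg-style sketch; not the EnvisEdge implementation.
    """
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    first_weights, _ = updates[0]
    averaged: Weights = {
        name: [0.0] * len(values) for name, values in first_weights.items()
    }
    for weights, n in updates:
        share = n / total  # this client's fraction of all samples
        for name, values in weights.items():
            for i, value in enumerate(values):
                averaged[name][i] += share * value
    return averaged


if __name__ == "__main__":
    client_a = ({"w": [1.0, 2.0]}, 10)
    client_b = ({"w": [3.0, 4.0]}, 30)
    print(federated_average([client_a, client_b]))
    # {'w': [2.5, 3.5]}
```

Weighting by sample count keeps clients with little data from pulling the global model as strongly as clients with a lot.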

Demos and Tutorials

You may find all the EnvisEdge related demos and tutorials here.

You may also find the official documentation here.

Start Contributing

  1. Before you begin, please read our CONTRIBUTOR'S GUIDELINES.
  2. Introduce yourself in the #introduction channel on Discord (most of the talks and discussions happen here).
  3. Look for an open issue that interests you, such as good first issue, python, scala, documentation and more. Use the labels feature to filter issues by label.
  4. Star, Fork, and Clone the repo.
  5. Get down to business. Do your work.
  6. Push to your fork.
  7. Send a pull request to NimbleEdge/EnvisEdge.

This project follows the all-contributors specification. Contributions of any kind are welcome!

License

Apache License 2.0