A library for answering questions using data you cannot see

PySyft
A library for computing on data you do not own and cannot see



PySyft is a Python library for secure and private Deep Learning. PySyft decouples private data from model training, using Federated Learning, Differential Privacy, and Encrypted Computation (like Multi-Party Computation (MPC) and Homomorphic Encryption (HE)) within the main Deep Learning frameworks like PyTorch and TensorFlow. Join the movement on Slack.

FAQ 0.2.x ➡️ 0.3.x

We have compiled a list of FAQs relating to the change from 0.2.x to 0.3.x+ here.

Important note about PySyft 0.2.x: The PySyft 0.2.x codebase is now in its own branch here, but OpenMined will not offer official support for this version range. If you're getting started with PySyft for the first time, please ignore this message and read on!

PySyft in Detail

A more detailed explanation of PySyft can be found in the white paper on Arxiv.

PySyft has also been explained in videos on YouTube.

Pre-Installation

PySyft is available on PyPI and Conda.

We recommend that you install PySyft within a virtual environment like Conda, due to its ease of use. If you are using Windows, we suggest installing Anaconda and using the Anaconda Prompt to work from the command line.

$ conda create -n pysyft python=3.8
$ conda activate pysyft
$ conda install jupyter notebook

Version Support

We support Linux, macOS and Windows and the following Python and Torch versions.

Python: 3.6, 3.7, 3.8
Torch: 1.5, 1.6, 1.7
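As a quick sanity check before installing, the versions listed above can be compared against the current environment. The following is only a sketch: the helper names (major_minor, check_versions) are hypothetical, and it assumes that every listed Python version works with every listed Torch version, which the list itself does not state explicitly.

```python
import sys
from typing import Optional, Tuple

# Versions taken from the list above. Assumption: any listed Python
# version may be combined with any listed Torch version.
SUPPORTED_PYTHON = {(3, 6), (3, 7), (3, 8)}
SUPPORTED_TORCH = {(1, 5), (1, 6), (1, 7)}


def major_minor(version: str) -> Tuple[int, int]:
    """Reduce a version string such as '1.7.1+cu110' to (1, 7)."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def check_versions(python_version: Tuple[int, int],
                   torch_version: Optional[str] = None) -> bool:
    """Return True if the given versions fall inside the listed ranges.

    torch_version may be None when Torch is not installed yet; in that
    case only the Python version is checked.
    """
    if tuple(python_version) not in SUPPORTED_PYTHON:
        return False
    if torch_version is not None:
        return major_minor(torch_version) in SUPPORTED_TORCH
    return True


if __name__ == "__main__":
    try:
        import torch  # optional: only inspected if already installed
        found_torch = str(torch.__version__)
    except ImportError:
        found_torch = None
    ok = check_versions(sys.version_info[:2], found_torch)
    print("Environment is in the listed range." if ok
          else "Environment is outside the listed range.")
```

Running the script inside the activated Conda environment reports whether the interpreter (and Torch, if present) falls in the listed range.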

Installation

Pip

$ pip install syft

This auto-installs PyTorch and the other dependencies needed to run the examples and tutorials. For more information on building from source, see the contribution guide here.
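To confirm that the pip install succeeded, one option is to ask the packaging metadata which version of the syft distribution is present. This is a minimal sketch, not an official verification step: installed_version is a hypothetical helper, and importlib.metadata is only part of the standard library from Python 3.8 onward.

```python
from importlib import metadata  # standard library in Python 3.8+
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("syft")
    if version is None:
        print("syft is not installed in this environment")
    else:
        print(f"syft {version} is installed")
```

Because the helper only reads metadata, it does not import syft itself, so it also works in an environment where the install is broken or incomplete.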

Documentation

The latest official documentation is hosted here: https://pysyft.readthedocs.io/

Duet Examples

To try the examples, launch Jupyter Notebook and navigate to the examples/duet folder.

$ jupyter notebook

WebRTC Signaling Server

To facilitate peer-to-peer connections through firewalls, we use WebRTC and a signaling server. Once the connection is established, no traffic is sent to this server.

If you want to run your own signaling server, simply run the command:

$ syft-network

Then update your Duet notebooks to use the new server: network_url=http://localhost:5000
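Before pointing a notebook at a self-hosted signaling server, it can help to check that something is actually listening at that address. The sketch below uses only the standard library and does not touch the Duet API; host_and_port and is_reachable are hypothetical helper names, and a successful TCP connection only shows that the port is open, not that the process behind it is syft-network.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def host_and_port(network_url: str) -> Tuple[str, int]:
    """Split a URL such as 'http://localhost:5000' into (host, port)."""
    parts = urlsplit(network_url)
    if parts.hostname is None:
        raise ValueError(f"no host found in {network_url!r}")
    # Fall back to the scheme's default port when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def is_reachable(network_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(network_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    url = "http://localhost:5000"  # address used in the text above
    state = "reachable" if is_reachable(url) else "not reachable"
    print(f"Signaling server at {url} is {state}")
```

If the check reports that the server is not reachable, confirm that syft-network is still running and that the port in network_url matches the one it listens on.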

Try out the Tutorials

A comprehensive list of Duet Examples can be found here.

These tutorials cover how to operate common network types over the Duet API.

High-level Architecture

Start Contributing

The guide for contributors can be found here. It covers all that you need to know to start contributing code to PySyft today.

Also, join the rapidly growing community of 7000+ on Slack. The Slack community is very friendly and quick to answer questions about the use and development of PySyft!

Troubleshooting

The latest version of PySyft is 0.3.0; however, this software is still in beta. If you find a bug, please file it in the GitHub issues.

Organizational Contributions

We are very grateful for contributions to PySyft from the following organizations!

Udacity, coMind, Arkhn, Dropout Labs

Support

For support in using this library, please join the #lib_pysyft Slack channel. Click here to join our Slack community!

Disclaimer

This software is in early beta. Use at your own risk.

License

Apache License 2.0

Description

Most software libraries let you compute over the information you own and see inside of machines you control. However, this means that you cannot compute on information without first obtaining (at least partial) ownership of that information. It also means that you cannot compute using machines without first obtaining control over those machines. This is very limiting to human collaboration in all areas of life and systematically drives the centralization of data, because you cannot work with a bunch of data without first putting it all in one (central) place.

The Syft ecosystem seeks to change this system, allowing you to write software which can compute over information you do not own on machines you do not have (general) control over. This includes not only servers in the cloud, but also personal desktops, laptops, mobile phones, websites, and edge devices. Wherever you want your data to live under your ownership, the Syft ecosystem exists to help keep it there while allowing it to be used for computation.

This library is the centerpiece of the Syft ecosystem. It has two primary purposes. You can either use PySyft to:

  1. Dynamic: Directly compute over data you cannot see.
  2. Static: Create static graphs of computation which can be deployed/scaled at a later date on different compute.
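The difference between the two modes can be illustrated with a toy model in plain Python. This is a conceptual sketch only and does not use or mirror the PySyft API: Worker, Pointer, run_plan and SUM_OF_SQUARES are hypothetical names invented for the illustration. In the dynamic mode each operation is sent to the data holder immediately and the caller only ever holds a pointer; in the static mode the computation is first written down as a list of steps and executed later, possibly on a different worker.

```python
import operator
from typing import Callable, Dict, List, Tuple


class Worker:
    """Toy stand-in for a machine holding data the caller cannot read."""

    def __init__(self) -> None:
        self._store: Dict[int, float] = {}
        self._next_id = 0

    def put(self, value: float) -> "Pointer":
        """Store a value and hand back only a reference to it."""
        obj_id = self._next_id
        self._next_id += 1
        self._store[obj_id] = value
        return Pointer(self, obj_id)

    def apply(self, op: Callable, *ptrs: "Pointer") -> "Pointer":
        """Run op on stored values; the result also stays on the worker."""
        args = [self._store[p.obj_id] for p in ptrs]
        return self.put(op(*args))

    def release(self, ptr: "Pointer") -> float:
        """Owner-side decision to reveal a value (no policy in this toy)."""
        return self._store[ptr.obj_id]


class Pointer:
    """Reference to a value on a Worker; carries no data itself."""

    def __init__(self, worker: Worker, obj_id: int) -> None:
        self.worker = worker
        self.obj_id = obj_id

    def __add__(self, other: "Pointer") -> "Pointer":
        return self.worker.apply(operator.add, self, other)

    def __mul__(self, other: "Pointer") -> "Pointer":
        return self.worker.apply(operator.mul, self, other)


# A static plan: (operation, input names, output name) for each step.
Step = Tuple[Callable, Tuple[str, ...], str]
SUM_OF_SQUARES: List[Step] = [
    (operator.mul, ("x", "x"), "xx"),
    (operator.mul, ("y", "y"), "yy"),
    (operator.add, ("xx", "yy"), "out"),
]


def run_plan(steps: List[Step], worker: Worker,
             inputs: Dict[str, Pointer]) -> Pointer:
    """Execute a previously defined plan on a worker; return the last output."""
    if not steps:
        raise ValueError("plan has no steps")
    env = dict(inputs)
    for op, arg_names, out_name in steps:
        env[out_name] = worker.apply(op, *(env[name] for name in arg_names))
    return env[steps[-1][2]]


if __name__ == "__main__":
    # Dynamic: each operator call executes on the worker right away.
    alice = Worker()
    x, y = alice.put(3.0), alice.put(4.0)
    z = x * x + y * y
    print(alice.release(z))  # 25.0

    # Static: the same computation, defined once and run later elsewhere.
    bob = Worker()
    out = run_plan(SUM_OF_SQUARES, bob,
                   {"x": bob.put(1.0), "y": bob.put(2.0)})
    print(bob.release(out))  # 5.0
```

In both cases the caller never sees the raw inputs; only the worker that owns the data can release a result.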

The Syft ecosystem includes libraries which allow for communication with and computation over a variety of runtimes:

  • KotlinSyft (Android)
  • SwiftSyft (iOS)
  • syft.js (Web & Mobile)
  • PySyft (Python)

However, the Syft ecosystem only focuses on consistent object serialization/deserialization, core abstractions, and algorithm design/execution across these languages. These libraries alone will not connect you with data in the real world. The Syft ecosystem is supported by the Grid ecosystem, which focuses on the deployment, scalability, and other additional concerns around running real-world systems to compute over and process data (such as data compliance web applications).

Syft is the library that defines objects, abstractions, and algorithms. Grid is the platform which lets you deploy them within a real institution (or on the open internet, but we don't yet recommend this). The Grid ecosystem includes:

  • GridNetwork - think of this like DNS for private data. It helps you find remote data assets so that you can compute on them.
  • PyGrid - This is the gateway to an organization's data, responsible for permissions, load balancing, and governance.
  • GridNode - This is an individual node within an organization's data center, running executions requested by external parties.
  • GridMonitor - This is a UI which allows an institution to introspect and control their PyGrid node and the GridNodes it manages.

Want to Use PySyft?

If you would like to become a user of PySyft, please progress to our User Documentation.

Want to Develop PySyft?

If you would like to become a developer of PySyft, please see our Contributor Documentation. This documentation will help you set up your development environment, give you a roadmap for learning the codebase, and help you find your first project to contribute.

Note

This project has been set up using PyScaffold. For details and usage information on PyScaffold see https://pyscaffold.org/.