SyferText: A Python repository from chindimaga

SyferText

SyferText is a library for privacy-preserving Natural Language Processing in Python. It leverages PySyft to perform Federated Learning and Encrypted Computations (Multi-Party Computation (MPC)) on text data. The two main usage scenarios of SyferText are:

🔥 Secure plaintext pre-processing: Enables pre-processing of text located on a remote machine without breaking data privacy.
🚀 Secure pipeline deployment: Starting from version 0.1.0, SyferText will be able to bundle the complete pipeline of pre-processing components and trained PySyft models, and securely deploy it to PyGrid.

For a more detailed introduction to SyferText, watch 🎥 OpenMined AMA with Alan Aboudib, available on YouTube.

Installation

In order to install and start using SyferText, you first have to install git-lfs by following this short guide.

Then go ahead and install our experimental language model, which we adapted from spaCy's en_core_web_lg model. This should take a few minutes since the model is larger than 800 MB.

$ pip install git+git://github.com/Nilanshrajput/syfertext_en_core_web_lg@master

If you had already installed syfertext_en_core_web_lg prior to installing git-lfs, please do the following:

1. Uninstall syfertext_en_core_web_lg.
2. Install git-lfs.
3. Reinstall syfertext_en_core_web_lg.

Now you can go ahead and install SyferText:

$ git clone https://github.com/OpenMined/SyferText.git
$ cd SyferText
$ python setup.py install

That's it, you are good to go!
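
To confirm that both installation steps above succeeded, a quick check along the following lines may help. This is only a minimal sketch and not part of SyferText itself: the helper `check_installation` is hypothetical, and it assumes that the importable module names match the package names used in the install commands (`syfertext` and `syfertext_en_core_web_lg`).

```python
import importlib.util

# Assumption: the importable module names are the same as the package
# names that appear in the installation commands above.
REQUIRED_MODULES = ("syfertext", "syfertext_en_core_web_lg")


def check_installation(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if Python can locate it.

    Hypothetical helper for illustration; it only looks the modules up and
    does not import or execute them.
    """
    status = {}
    for name in modules:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for malformed or half-initialized modules;
            # treat those cases as "not installed".
            status[name] = False
    return status


if __name__ == "__main__":
    for name, found in check_installation().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If the language model is reported as missing, the uninstall and reinstall procedure described above is the first thing to try.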

Getting Started

SyferText can be used to work with datasets residing on a local machine (or a local worker, as we call it in PySyft), as well as with private datasets on remote workers. Here is a list of tutorials that you can follow to become more familiar with SyferText:

Code Examples	Use Cases
1. Tokenizing local strings	1. Training a sentiment classifier on multiple private datasets
2. Tokenizing remote strings
3. Using the SimpleTagger

More tutorials are coming soon. Stay tuned!

Our Team

SyferText is created and maintained by the NLP team at OpenMined and by volunteer contributors from all around the world. Here are the current members of the core NLP team. The team is growing!

Alan Aboudib (Team Lead / Author)	Nilansh Rajput (OM NLP team / Core Dev)	Jatin Prakash (OM NLP team / Core Dev)	Sachin Kumar (OM NLP team / Core Dev)
Bachir Chihani (OM NLP team / Core Dev)	Márcio Porto (OM NLP team / Core Dev)	Antonio Lopardo (OM NLP team / Documentation)

Events

(October 26th, 2019) DevFest2019, Reading, UK.

Demo on remote blind tokenization with SyferText.

(March 19th, 2020) GDG Meetup, Reading, UK. (Cancelled due to COVID-19)

Demo on sentiment analysis with SyferText on multiple private datasets.

(May 13th, 2020): OpenMined AMA. (Cancelled due to COVID-19)
(June 17th, 2020): OpenMined AMA.

SyferText vision and encrypted sentiment analyzer demo.

(June 18th, 2020): The Federated Learning Conference.

Introduction to SyferText.

(July 8th, 2020): OpenMined Paris Meetup.

SyferText vision and encrypted sentiment analyzer demo.

(July 29th, 2020): MLH Fellowship Talk.

About SyferText and my Open Source Contribution Experience with OpenMined.

(September 16th, 2020 at 5:30PM GMT): OpenMined AMA. (Cancelled)

Introducing SyferText 0.1.0

News

To get news about feature and tutorial releases, follow:

Alan Aboudib: @twitter

and join the #lib_syfertext channel on Slack.

Support

For support in using this library, please join the #lib_syfertext Slack channel. If you’d like to follow code changes to the library, please join the #code_syfertext Slack channel. Click here to join our Slack community!

Contributors ✨

CONTRIBUTORS.md

This project follows the all-contributors specification. Contributions of any kind are welcome!

Call for Partners

We at the NLP team are eager to learn about real-world use cases around which new features in SyferText could be built.

If you think that SyferText, in its current state or with additional features, could be useful to your research or company, please contact us as indicated in the Contact Us section below, and let us discuss how we can help.

Contact Us

You can reach out to us by contacting Alan on one of the following channels:

LinkedIn | Slack | Twitter

License

Apache License 2.0
