Programmatically Build and Manage Training Data
- Snorkel website
- Snorkel tutorials
- Snorkel documentation
- Snorkel community forum
- Snorkel mailing list
- Snorkel Twitter
The quickest way to familiarize yourself with the Snorkel library is to walk through the Get Started page on the Snorkel website, followed by the full-length tutorials in the Snorkel tutorials repository. These tutorials demonstrate a variety of tasks, domains, labeling techniques, and integrations that can serve as templates as you apply Snorkel to your own applications.
Snorkel requires Python 3.6 or later. To install Snorkel, we recommend using pip:
pip install snorkel
or conda:
conda install snorkel -c conda-forge
For information on installing from source and contributing to Snorkel, see our contributing guidelines.
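After installing, it can help to confirm that the interpreter meets the Python 3.6 requirement and that the snorkel package can be found. The following is a minimal sketch that uses only the standard library; the helper name check_environment and its parameters are hypothetical and not part of Snorkel.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum Python version stated in the install notes


def check_environment(package="snorkel", min_python=MIN_PYTHON):
    """Hypothetical helper: report whether Python is new enough and the package is findable."""
    # Compare only (major, minor) against the required minimum.
    python_ok = sys.version_info[:2] >= min_python
    # find_spec returns None when a top-level package cannot be located.
    installed = importlib.util.find_spec(package) is not None
    return {"python_ok": python_ok, "installed": installed}


if __name__ == "__main__":
    status = check_environment()
    print("Python >= 3.6:", status["python_ok"])
    print("snorkel importable:", status["installed"])
```

If the second check reports False, the most likely cause is that the package was installed into a different environment than the one running the script.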
Details on installing with conda
The following example commands give some more color on installing with conda. These commands assume that your conda installation is Python 3.6, and that you want to use a virtual environment called snorkel-env.
# [OPTIONAL] Create and activate a virtual environment called "snorkel-env"
conda create --yes -n snorkel-env python=3.6
conda activate snorkel-env
# We specify PyTorch here to ensure compatibility, but it may not be necessary.
conda install pytorch==1.1.0 -c pytorch
conda install snorkel==0.9.0 -c conda-forge
A quick note for Windows users
If you're using Windows, we highly recommend using Docker (you can find an example in our tutorials repo) or the Windows Subsystem for Linux. We've done limited testing on Windows, so if you want to contribute instructions or improvements, feel free to open a PR!
We use GitHub Issues for posting bugs and feature requests — anything code-related. Just make sure you search for related issues first and use our Issues templates. We may ask for contributions if a prompt fix doesn't fit into the immediate roadmap of the core development team.
We welcome contributions from the Snorkel community! This is likely the fastest way to get a change you'd like to see into the library.
Small contributions can be made directly in a pull request (PR).
If you would like to contribute a larger feature, we recommend first creating an issue with a proposed design for discussion.
For ideas about what to work on, we've labeled specific issues as help wanted.
To set up a development environment for contributing back to Snorkel, see our contributing guidelines. All PRs must pass the continuous integration tests and receive approval from a member of the Snorkel development team before they will be merged.
For broader Q&A, discussions about using Snorkel, tutorial requests, etc., use the Snorkel community forum hosted on Spectrum. We hope this will be a venue for you to interact with other Snorkel users — please don't be shy about posting!
To stay up-to-date on Snorkel-related announcements (e.g. version releases, upcoming workshops), subscribe to the Snorkel mailing list. We promise to respect your inboxes — communication will be sparse!
Follow us on Twitter @SnorkelML.