# Multi-Domain Expert Layers

## Environment Setup
To set up the development environment, run `make setup_dev`. This will set up the pre-commit hooks.
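
As a quick sanity check that the hooks were installed, you can invoke them once by hand. This is a minimal sketch assuming `make setup_dev` installs the standard `pre-commit` tool:

```bash
# Set up the development environment (installs the pre-commit hooks)
make setup_dev

# Optional: run every configured hook once over the whole repo
pre-commit run --all-files
```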
## Creating an Expert Dataset

First, make sure you have followed the Environment Setup guidelines above.

To create an expert dataset using the Pile data (Gao et al., 2020), follow these steps:
- Download the Pile shard 1 data: `./scripts/get_pile_shard1_data.sh`
- To set the domain, edit the variable `SUBSET_NAME` in `scripts/create_domain_pile_mix.sh`. It should be set to a valid value of the Pile's `pile_set_name` field; a list of valid values can be found below.
- Run the above script to process the dataset.
- Authenticate into Hugging Face: `export HF_ACCESS_TOKEN={YOUR HUGGINGFACE TOKEN}`
- Set the dataset name in `scripts/upload_to_hf.sh`.
- Run the above script to upload the processed dataset to the Hugging Face Hub.

An end-to-end sketch of these steps is given after the list of subset names below.
Valid values for `pile_set_name`:

- Pile-CC
- PubMed Central
- Books3
- OpenWebText2
- ArXiv
- Github
- FreeLaw
- Stack Exchange
- USPTO Backgrounds
- PubMed Abstracts
- Gutenberg (PG-19)
- OpenSubtitles
- Wikipedia (en)
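
Putting the steps together, a hypothetical end-to-end run might look as follows. The script names come from the steps above; the `PubMed Abstracts` choice is purely illustrative, and any subset name from the list works:

```bash
# 1. Download the Pile shard 1 data
./scripts/get_pile_shard1_data.sh

# 2. Edit scripts/create_domain_pile_mix.sh and set the domain, e.g.
#      SUBSET_NAME="PubMed Abstracts"   # any valid pile_set_name value
#    then run the script to process the dataset
./scripts/create_domain_pile_mix.sh

# 3. Authenticate into Hugging Face and upload the processed dataset
#    (set the dataset name inside upload_to_hf.sh first)
export HF_ACCESS_TOKEN={YOUR HUGGINGFACE TOKEN}
./scripts/upload_to_hf.sh
```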
## Training

- Clone this repo and follow the Environment Setup instructions.
- Set up Hugging Face authentication: `export HUGGING_FACE_HUB_TOKEN=[FILL ME]`
- Set up W&B authentication: `export WANDB_API_KEY=[FILL ME]`
- Edit the variable `DATASET` in `src/mdel/train.sh` to match a valid dataset name on the MDEL Hugging Face organization.
- Run the above script in background mode to start the training (see the sketch after this list): `./train.sh &`
- The trained model should be uploaded to the MDEL Hugging Face organization.
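
For concreteness, here is a minimal launch sketch under the assumptions above; the tokens are placeholders to fill in, and `DATASET` must already be edited in `src/mdel/train.sh`:

```bash
# Authentication (fill in your own tokens)
export HUGGING_FACE_HUB_TOKEN=[FILL ME]
export WANDB_API_KEY=[FILL ME]

# Launch training in the background from the script's directory
cd src/mdel
./train.sh &

jobs   # confirm the background job is running
```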
## Citation

Gao, L., Biderman, S., Black, S., Golding, L., Hoppe, T., Foster, C., ... & Leahy, C. (2020). The Pile: An 800GB dataset of diverse text for language modeling. arXiv preprint arXiv:2101.00027.