The package can be installed directly from the GitHub repository or from a local clone.
The base installation, `mavebay`, avoids additional packages such as matplotlib, arviz, and logomaker.
The full installation, `mavebay[examples]`, which is suitable for running the demos, is also provided.
- Create the virtual environment.
- Install the base or full `mavebay` package.
python -m venv test_mavebay
source test_mavebay/bin/activate
pip install git+https://github.com/mahdikooshkbaghi/mavebay
# OR
pip install "mavebay[examples] @ git+https://github.com/mahdikooshkbaghi/mavebay"
- Clone the repo.
- Create the virtual environment.
- Install the base or full `mavebay` package.
git clone git@github.com:mahdikooshkbaghi/mavebay.git
cd mavebay
python -m venv test_mavebay
source test_mavebay/bin/activate
pip install .
# OR
pip install ".[examples]"
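To see which installation ended up in the active environment, a small check along these lines may help. It is a minimal sketch: it only looks for the three optional packages named above, and the actual extras list of `mavebay[examples]` may contain more. The function name `missing_optional_packages` is hypothetical and not part of mavebay.

```python
from importlib.util import find_spec

# Optional packages named in this README; the real extras list may be longer.
OPTIONAL_PACKAGES = ("matplotlib", "arviz", "logomaker")


def missing_optional_packages(packages=OPTIONAL_PACKAGES):
    """Return the names of packages that cannot be found in this environment."""
    return [name for name in packages if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_optional_packages()
    if missing:
        print("Missing optional packages:", ", ".join(missing))
    else:
        print("All optional packages listed here are installed.")
```

An empty result suggests the environment can run the demos; a non-empty one suggests that only the base installation is present.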
An example script for the global epistasis (GE) measurement process is provided in the example folder.
The following command runs the demo:
python global_epistasis_demo.py -n [NUM_SAMPLES] \
-lr [LEARNING_RATE] \
-m [METHOD] \
-ds [DATA] \
-k [INTERACTION_ORDER] \
-d [DEVICE] \
-i [INIT_LOC_FN] \
-p [PROGRESS_BAR]
All arguments have default values, which are defined in the script.
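As a rough illustration of how the flags in the command above might be declared, here is an `argparse` sketch. It is not the code from `global_epistasis_demo.py`: the long option names, the `str_to_bool` helper, and every default marked "placeholder" are assumptions. Only the defaults for the dataset, interaction order, device, initialization, and progress bar come from this README.

```python
import argparse


def str_to_bool(value):
    """Hypothetical helper: interpret common truthy strings as True."""
    return value.lower() in ("1", "true", "yes", "on")


def build_parser():
    parser = argparse.ArgumentParser(description="Global epistasis demo (sketch)")
    # Defaults marked "placeholder" are not stated in this README.
    parser.add_argument("-n", "--num-samples", type=int, default=1000)  # placeholder
    parser.add_argument("-lr", "--learning-rate", type=float, default=1e-3)  # placeholder
    parser.add_argument("-m", "--method", choices=["mcmc", "svi"], default="svi")  # placeholder
    parser.add_argument("-ds", "--data", choices=["abeta", "tdp43", "mpsa"], default="abeta")
    parser.add_argument("-k", "--interaction-order", type=int, default=1)
    parser.add_argument("-d", "--device", choices=["cpu", "gpu"], default="cpu")
    parser.add_argument("-i", "--init-loc-fn", default="feasible")
    parser.add_argument("-p", "--progress-bar", type=str_to_bool, default=True)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Restricting `-m`, `-ds`, and `-d` with `choices` makes a mistyped value fail at parse time rather than partway through inference.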
- `NUM_SAMPLES`: number of samples for `mcmc` or number of steps for `svi`.
- `LEARNING_RATE`: the learning rate for the optimizer in the `svi` method.
- `METHOD`: method of inference: `mcmc` or `svi`.
- `DATA`: dataset to use for the inference. Descriptions of the datasets are given in the MAVE-NN manuscript.
  - `abeta`: DMS data for Aβ (default dataset).
  - `tdp43`: DMS data for TDP-43.
  - `mpsa`: MPSA data for 5' splice sites.
- `INTERACTION_ORDER`: `k=1` (default) corresponds to the additive GP map, `k=2` to the pairwise GP map, and so on.
- `DEVICE`: `cpu` (default) or `gpu`.
- `INIT_LOC_FN`: initialization for the `svi` sampling. The default is `feasible`. Others can be assigned based on the `numpyro` documentation.
- `PROGRESS_BAR`: enable (default) or disable the progress bar of the inference.
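One plausible way to turn an `INIT_LOC_FN` value such as `feasible` into a numpyro initialization strategy is sketched below. It assumes the short name maps to the `init_to_<name>` functions exported by `numpyro.infer` (for example `init_to_feasible` or `init_to_median`). That mapping and both helper names are assumptions, and the demo script may resolve the value differently.

```python
import importlib


def init_strategy_name(short_name):
    """Build the assumed numpyro function name, e.g. 'feasible' -> 'init_to_feasible'."""
    return f"init_to_{short_name}"


def resolve_init_strategy(short_name="feasible"):
    """Look up the strategy in numpyro.infer (imported lazily; requires numpyro)."""
    infer = importlib.import_module("numpyro.infer")
    name = init_strategy_name(short_name)
    if not hasattr(infer, name):
        raise ValueError(f"numpyro.infer has no initialization strategy named {name!r}")
    return getattr(infer, name)
```

The returned function could then be passed as the `init_loc_fn` argument of a numpyro autoguide such as `AutoNormal`.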
Bayesian Version of MAVENN1.0
- The SVI is working both on the abeta and TDP-43 additive inferences.
- The MCMC on the abeta is working only on small samples.
- Need to figure out the batch and plate.
- Need to modify `setup.py` to add the additional Python requirements for the example folder. Check the numpyro GitHub repo for hints.
- Implement the skewed-T noise model, similar to the one we have in MAVENN.
- K-th order interaction GP map implementation.
  - K=1: effectively an additive model; it is working.
  - K=2: effectively a pairwise model; it works on the mpsa data.
- Measurement process agnostic (MPA) implementation.
- Need to add different initialization strategies for SVI sampling, with the default being `init_to_feasible`.
- Saving the model:
  - SVI
  - MCMC
- Make the ppc smooth.
  - As suspected, the number of samples from the posterior was not enough to make `phi_to_yhat` smooth. Increasing it fixed the issue.
- Move the MAVENN heatmap and pairwise heatmap plotting into utils functions.
- Information metrics calculation.
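To give a feel for how the K-th order interaction GP map in the list above grows with K, the sketch below counts its parameters. It assumes one-hot encoded sequences of length `seq_length` over an alphabet of size `alphabet_size`, a constant term, and no gauge fixing. The function name is hypothetical and the code is not taken from mavebay.

```python
from math import comb


def num_gp_map_parameters(seq_length, alphabet_size, order):
    """Count parameters of a GP map with interactions up to `order` (constant term included)."""
    # For each interaction order j: choose j positions, then one character per position.
    return sum(comb(seq_length, j) * alphabet_size**j for j in range(order + 1))


if __name__ == "__main__":
    # Hypothetical example: length-10 sequences over a 4-letter alphabet.
    print(num_gp_map_parameters(10, 4, 1))  # additive: 1 + 10*4 = 41
    print(num_gp_map_parameters(10, 4, 2))  # pairwise: 41 + 45*16 = 761
```

The jump from 41 to 761 parameters in this toy case gives a sense of why higher-order maps are more demanding to fit than the additive model.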