raptor_data_simulation

Primary language: C++. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).

Raptor utility repository

This repository contains small utility applications.

git clone --recurse-submodules https://github.com/eseiler/raptor_data_simulation
cd raptor_data_simulation
mkdir build
cd build
cmake ..
make -j2 install

There is a script at src/simulate.sh that simulates a dataset.
Variables in upper case can be changed.
BINARY_DIR should be the absolute path to the build/bin directory.
OUT_DIR should be the absolute path to the output directory.
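
Since both paths must be absolute, a small helper can resolve a relative directory before the value is copied into the script. This is only a sketch: `to_absolute_dir` is a hypothetical function and not part of this repository, and it assumes the directory already exists.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository): prints the absolute
# path of an existing directory, or fails if the directory does not exist.
to_absolute_dir() {
    local dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Change into the directory in a subshell so the caller's working
    # directory is left untouched, then print the resulting path.
    (cd -- "$dir" && pwd)
}

# Possible usage, run from the repository root after building:
#   BINARY_DIR=$(to_absolute_dir build/bin)
#   OUT_DIR=$(to_absolute_dir path/to/output)
```

The printed values can then be assigned to BINARY_DIR and OUT_DIR in src/simulate.sh.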

LENGTH % BIN_NUMBER should be 0
(LENGTH / BIN_NUMBER) % HAPLOTYPE_COUNT should be 0
READ_COUNT % BIN_NUMBER should be 0
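The easiest way to achieve this is to set LENGTH, BIN_NUMBER, and READ_COUNT to powers of two, with LENGTH and READ_COUNT at least as large as BIN_NUMBER. HAPLOTYPE_COUNT then also needs to be a power of two no larger than LENGTH / BIN_NUMBER.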
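
The three divisibility constraints above can be checked before starting a simulation. The following is a minimal sketch, not part of the repository: the function `check_simulation_parameters` is hypothetical, and the example values are assumptions chosen as powers of two.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository): returns 0 if the
# constraints on LENGTH, BIN_NUMBER, HAPLOTYPE_COUNT and READ_COUNT hold.
check_simulation_parameters() {
    local length=$1 bin_number=$2 haplotype_count=$3 read_count=$4

    # Guard against division by zero in the checks below.
    if (( bin_number <= 0 || haplotype_count <= 0 )); then
        echo "BIN_NUMBER and HAPLOTYPE_COUNT must be positive" >&2
        return 1
    fi
    if (( length % bin_number != 0 )); then
        echo "LENGTH % BIN_NUMBER should be 0" >&2
        return 1
    fi
    if (( (length / bin_number) % haplotype_count != 0 )); then
        echo "(LENGTH / BIN_NUMBER) % HAPLOTYPE_COUNT should be 0" >&2
        return 1
    fi
    if (( read_count % bin_number != 0 )); then
        echo "READ_COUNT % BIN_NUMBER should be 0" >&2
        return 1
    fi
    return 0
}

# Example values (assumptions): LENGTH=2^32, BIN_NUMBER=2^14,
# HAPLOTYPE_COUNT=2^4, READ_COUNT=2^20.
check_simulation_parameters 4294967296 16384 16 1048576 && echo "parameters OK"
```

With these values, LENGTH / BIN_NUMBER is 2^18, which is divisible by 16, so all three checks pass.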

For example, with

OUT_DIR=/some/path
BIN_NUMBER=16384
ERRORS=2
READ_LENGTHS="100 150 250"

the result will look like this:

/some/path/
└── 16384
    ├── bins
    ├── info
    ├── reads_e2_100
    ├── reads_e2_150
    └── reads_e2_250
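
The layout above suggests one directory per read length, named after the number of errors and the read length. The sketch below lists the expected directories for a given configuration. The naming pattern `reads_e<ERRORS>_<READ_LENGTH>` is inferred from this example only, and `expected_directories` is a hypothetical function that is not part of the repository.

```shell
#!/usr/bin/env bash

# Same example settings as above.
OUT_DIR=/some/path
BIN_NUMBER=16384
ERRORS=2
READ_LENGTHS="100 150 250"

# Hypothetical helper (not part of the repository): prints the directories
# that the example layout implies for the current settings.
expected_directories() {
    local base="$OUT_DIR/$BIN_NUMBER"
    echo "$base/bins"
    echo "$base/info"
    local read_length
    # READ_LENGTHS is a space-separated list, so it is left unquoted
    # on purpose to split it into individual lengths.
    for read_length in $READ_LENGTHS; do
        echo "$base/reads_e${ERRORS}_${read_length}"
    done
}

expected_directories
```

Comparing this list against the actual contents of OUT_DIR is a quick way to see whether a simulation run produced every expected directory.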