Train own model using own dataset

Question

Opened this issue a year ago · 0 comments

We generated screening data from in vitro experiments. In total we have 196 target sequences, and single-end sequencing produced a single FASTQ file.

I am trying to use the files under sample_training_codes/ to train a model, but I ran into two problems:

1. Since I have multiple target sequences and corresponding references, how should I configure sample.ini?
2. I added these lines

    with open("sample_mutation_pattern_dict.pickle", "wb") as o:
        pickle.dump(m_pattern_dict, o)
    with open("sample_mutation_rate_dict.pickle", "wb") as o:
        pickle.dump(m_rate_dict, o)

to sample_training_codes/main.py so that two pickle files (sample_mutation_pattern_dict.pickle and sample_mutation_rate_dict.pickle) are saved. However, when I use sample_models/pickles/pickle_to_csv.py to convert the two pickle files to CSV, the script requires the file "/Users/hideto/Downloads/pickles/EGFP_mutation_rate_dict.pickle", which is not included in this repo.
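
As a possible workaround for the hardcoded path, here is a minimal sketch of a converter that takes the pickle path as an argument. It is not the repo's pickle_to_csv.py. The function name pickle_dict_to_csv and the two-column key/value layout are my own assumptions, and I do not know whether the downstream code expects this CSV layout or whether the dictionary values are simple scalars.

```python
import csv
import pickle


def pickle_dict_to_csv(pickle_path, csv_path):
    """Write a pickled dict to a two-column CSV (key, value).

    Hypothetical helper: assumes the pickle holds a plain dict.
    Nested values are written via their default string form.
    """
    with open(pickle_path, "rb") as f:
        data = pickle.load(f)

    if not isinstance(data, dict):
        raise TypeError(
            f"expected a dict in {pickle_path}, got {type(data).__name__}"
        )

    # newline="" is what the csv module recommends for files it writes.
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["key", "value"])
        for key, value in data.items():
            writer.writerow([key, value])

    # Return the number of data rows written, for a quick sanity check.
    return len(data)
```

For the files saved above, this would be called as pickle_dict_to_csv("sample_mutation_rate_dict.pickle", "sample_mutation_rate_dict.csv"), and likewise for the pattern dictionary.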