parallelization over chromosomes

Question

subwaystation opened this issue 4 years ago · 4 comments

nf-core/pangenome feature request

Hi there!

Describe the solution you'd like

I want to be able to start the pipeline with a folder of FASTAs as an input. All current steps should be run on each of the FASTAs. This helps

- Add a new input parameter --input-folder
- Make sure that --input and --input-folder can't be set at the same time
- Ensure that the output in the results folder reflects the naming of the input FASTA file
- Add tests
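
To make the requested behaviour concrete, here is a minimal sketch in plain Python (not Nextflow, and not code from the pipeline). It assumes the proposed --input-folder option as described above; the function names and the list of FASTA suffixes are hypothetical and only for illustration.

```python
from pathlib import Path

# Assumed set of FASTA file endings; adjust to whatever the pipeline accepts.
FASTA_SUFFIXES = (".fa", ".fasta", ".fna", ".fa.gz", ".fasta.gz", ".fna.gz")


def strip_suffix(name):
    """Drop the FASTA suffix so results can be named after the input file."""
    for suffix in sorted(FASTA_SUFFIXES, key=len, reverse=True):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def resolve_inputs(input_file=None, input_folder=None):
    """Return {output_name: fasta_path} for exactly one of the two options."""
    # Reject both "neither set" and "both set".
    if (input_file is None) == (input_folder is None):
        raise ValueError("set exactly one of --input or --input-folder")
    if input_file is not None:
        paths = [Path(input_file)]
    else:
        paths = sorted(
            p for p in Path(input_folder).iterdir()
            if p.name.endswith(FASTA_SUFFIXES)
        )
    outputs = {}
    for path in paths:
        name = strip_suffix(path.name)
        if name in outputs:
            # e.g. chr1.fa and chr1.fasta would write to the same results folder
            raise ValueError(f"duplicate output name: {name}")
        outputs[name] = path
    return outputs
```

Each key of the returned mapping would become the name of one results subfolder, and each value would be run through the existing steps independently.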

Answer 1 · 2022-03-03T13:36:26.000Z

We won't parallelize over chromosomes, but over disconnected components, which usually are chromosomes.
Closing this. The disconnected components will be tracked by another issue.

Answer 2 · 2023-03-07T08:44:14.000Z

Markdown linting is failing

To keep the code consistent with lots of contributors, we run automated code consistency checks.
To fix this CI test, please run:

- Install markdownlint-cli
  - On Mac: brew install markdownlint-cli
  - Everything else: Install npm then install markdownlint-cli (npm install -g markdownlint-cli)
- Fix the markdown errors
  - Automatically: markdownlint . --config .github/markdownlint.yml --fix
  - Manually resolve anything left from markdownlint . --config .github/markdownlint.yml

Once you push these changes the test should pass, and you can hide this comment 👍

We highly recommend setting up markdownlint in your code editor so that this formatting is done automatically on save. Ask about it on Slack for help!

Thanks again for your contribution!

Answer 3 · 2024-02-12T02:12:48.000Z

Any updates on it?

Answer 4 · 2024-02-12T08:05:57.000Z

There are 2 ways to parallelize:

- You run the pipeline in community detection mode with --communities. The idea is that similar and related sequences are clustered into the same community, and the graph construction can then be run in parallel for each community.
- You split your sequences manually into chromosomal communities by a given reference and execute nf-core/pangenome for each reference community.
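
For the manual route, a rough sketch of the splitting step might look like the following. It assumes you already have a mapping from each sequence name to its reference chromosome (how you obtain that mapping, for example by aligning against the reference, is not covered here). The function names and the "unassigned" label are hypothetical.

```python
from collections import defaultdict


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_community(records, assignment):
    """Group records by the community assigned to each sequence name."""
    groups = defaultdict(list)
    for header, seq in records:
        name = header.split()[0]  # sequence ID is the first word of the header
        # Sequences missing from the mapping go into a catch-all group.
        groups[assignment.get(name, "unassigned")].append((header, seq))
    return dict(groups)
```

Each group can then be written to its own FASTA file and passed to a separate nf-core/pangenome run, so the runs are independent of each other.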