MultiTax

Description

MultiTax-human is an extensive, high-resolution reference database of human-associated full-length 16S rRNA sequences, designed to improve the precision of taxonomic classification in human microbiome research and clinical applications. It integrates over 842,649 high-quality full-length 16S rRNA sequences from multiple public repositories to provide a detailed picture of the human microbiome. Validated across multiple body sites, MultiTax-human supports the identification and analysis of core microbial taxa, helping to advance understanding of the microbiome's influence on health and disease. The database also provides a user-friendly web interface for querying and data exploration.

Download

The MultiTax reference database and the specialized MultiTax-human database can be downloaded from Google Drive:

Usage Recommendations for Real Samples

Generate a region-specific amplicon database (V3-V4) from the full-length reference using usearch:

usearch11.0.667_i86linux64 -search_pcr2 merged_human_all_H120.fasta -fwdprimer CCTACGGGNGGCWGCAG -revprimer GACTACHVGGGTATCTAATCC -minamp 400 -maxamp 550 -strand both -fastaout merged_human_all_H120_v34_hits.fa
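
To make the in-silico PCR step above more concrete, here is a minimal Python sketch of how degenerate primers can be matched against a reference sequence. It is not how usearch is implemented: it assumes exact matching of IUPAC codes (no mismatches allowed), searches only the given strand, and counts the amplicon length including both primers. The function names (`primer_to_regex`, `reverse_complement`, `find_amplicon`) are hypothetical helpers introduced for illustration.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement table that also covers the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_to_regex(primer):
    """Turn a degenerate primer into a regex with one character class per code."""
    parts = []
    for code in primer:
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def reverse_complement(seq):
    """Reverse-complement a sequence, keeping IUPAC codes consistent."""
    return seq.translate(COMPLEMENT)[::-1]


def find_amplicon(seq, fwd, rev, minamp=400, maxamp=550):
    """Return the first primer-to-primer amplicon on this strand, or None.

    Assumption: length limits are applied to the amplicon including primers.
    """
    fwd_re = re.compile(primer_to_regex(fwd))
    # On the forward strand the reverse primer appears as its reverse complement.
    rev_re = re.compile(primer_to_regex(reverse_complement(rev)))
    fwd_match = fwd_re.search(seq)
    if fwd_match is None:
        return None
    rev_match = rev_re.search(seq, fwd_match.end())
    if rev_match is None:
        return None
    amplicon = seq[fwd_match.start():rev_match.end()]
    return amplicon if minamp <= len(amplicon) <= maxamp else None
```

With the primers from the command above, `find_amplicon(sequence, "CCTACGGGNGGCWGCAG", "GACTACHVGGGTATCTAATCC")` would return the V3-V4 segment of `sequence` when both primer sites are present and the product is 400 to 550 bases long. To mimic `-strand both`, the same function could be called a second time on `reverse_complement(sequence)`.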

Deduplicate data:

usearch11.0.667_i86linux64 -fastx_uniques example_sequences.fq -sizeout -relabel UniqueSeqs -fastaout example_sequences_uniq.fa
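
As an illustration of what the deduplication step produces, the following Python sketch collapses identical reads and labels each unique sequence with its abundance, following the `UniqueSeqs<N>;size=<count>;` pattern implied by `-relabel UniqueSeqs` and `-sizeout`. It is a simplified stand-in, not usearch itself: the `dereplicate` function and the toy reads are hypothetical, and sequences are compared as exact, case-sensitive strings.

```python
from collections import Counter


def dereplicate(sequences, prefix="UniqueSeqs"):
    """Collapse identical sequences and label them by decreasing abundance."""
    counts = Counter(sequences)
    # most_common() sorts by count, highest first; ties keep first-seen order.
    return [
        (f"{prefix}{rank};size={count};", seq)
        for rank, (seq, count) in enumerate(counts.most_common(), start=1)
    ]


# Toy input standing in for the reads in example_sequences.fq.
reads = ["ACGT", "ACGT", "TTGA", "ACGT", "GGCC", "TTGA"]
for label, seq in dereplicate(reads):
    # Write each unique sequence as a FASTA record.
    print(f">{label}\n{seq}")
```

The size annotations matter downstream because the denoising step uses abundance to decide which sequences are likely to be real and which are likely to be errors.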

Denoise to generate ZOTUs (zero-radius OTUs):

usearch11.0.667_i86linux64 -unoise3 example_sequences_uniq.fa -zotus example_sequences_zotus.fa

Align the ZOTUs against the amplicon database:

usearch11.0.667_i86linux64 -usearch_global example_sequences_zotus.fa -db merged_human_all_H120_v34_hits.fa -maxaccepts 0 -maxrejects 0 -strand both -top_hit_only -id 0 -blast6out example_matches.b6 -threads 75
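
Because the alignment command above uses `-id 0`, every ZOTU gets a top hit regardless of how similar it is, so a downstream identity filter may be useful. The sketch below reads the standard 12-column tab-separated BLAST6 layout and keeps hits at or above a chosen percent identity. The `read_top_hits` function, the 97% default threshold, and the two example rows are assumptions made for illustration, not part of MultiTax.

```python
import io

# Column names for the 12 tab-separated fields of BLAST6 output.
B6_FIELDS = [
    "query", "target", "pct_id", "aln_len", "mismatches", "gap_opens",
    "q_start", "q_end", "t_start", "t_end", "evalue", "bitscore",
]


def read_top_hits(handle, min_id=97.0):
    """Yield (query, target, pct_id) for hits at or above min_id percent identity."""
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        row = dict(zip(B6_FIELDS, line.split("\t")))
        pct_id = float(row["pct_id"])
        if pct_id >= min_id:
            yield row["query"], row["target"], pct_id


# Hypothetical rows; real input would come from open("example_matches.b6").
example = io.StringIO(
    "Zotu1\tref_A\t99.5\t420\t2\t0\t1\t420\t1\t420\t*\t*\n"
    "Zotu2\tref_B\t91.0\t418\t37\t1\t1\t418\t1\t419\t*\t*\n"
)
for query, target, pct_id in read_top_hits(example):
    print(query, target, pct_id)
```

The appropriate threshold depends on the taxonomic level of interest, so `min_id` is left as a parameter rather than fixed.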

License

This project is licensed under the MIT License.

Citation

Please cite the following publication when using MultiTax in your research:

Contact

For further inquiries or feedback, please contact us at zwbao1996@zju.edu.cn.