Compiling with uint64_t
ohickl opened this issue · 5 comments
Hi, I am trying to compile MetaCache for use with a large reference data set.
I am running it like this:
git clone https://github.com/muellan/metacache.git
cd metacache
mamba activate compile
LD_LIBRARY_PATH=${CONDA_PREFIX}/lib:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH
CPLUS_INCLUDE_PATH=${CONDA_PREFIX}/include:${CPLUS_INCLUDE_PATH}
export CPLUS_INCLUDE_PATH
make MACROS="-DMC_TARGET_ID_TYPE=uint64_t -DMC_WINDOW_ID_TYPE=uint64_t -DMC_KMER_TYPE=uint64_t"
I get the following error:
make release_dummy DIR=build_release ARTIFACT=metacache MACROS="-DMC_TARGET_ID_TYPE=uint64_t -DMC_WINDOW_ID_TYPE=uint64_t -DMC_KMER_TYPE=uint64_t"
make[1]: Entering directory '/mnt/data/local_tools/metacache'
.../miniconda3/envs/compile/bin/x86_64-conda-linux-gnu-c++ -DMC_TARGET_ID_TYPE=uint64_t -DMC_WINDOW_ID_TYPE=uint64_t -DMC_KMER_TYPE=uint64_t -std=c++14 -Wall -Wextra -Wpedantic -I/include -O3 -c src/building.cpp -o build_release/building.o
In file included from src/options.h:31,
from src/candidate_structs.h:28,
from src/candidate_generation.h:27,
from src/database.h:27,
from src/building.h:27,
from src/building.cpp:24:
src/taxonomy.h: In member function 'void mc::ranked_lineages_of_targets::update(mc::target_id)':
src/taxonomy.h:980:45: error: no matching function for call to 'min(unsigned int, long unsigned int)'
980 | const unsigned numThreads = std::min(4U, numNewTargets / (1U << 10));
| ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from .../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/bits/char_traits.h:39,
from .../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/string:40,
from src/bitmanip.h:29,
from src/dna_encoding.h:27,
from src/hash_dna.h:27,
from src/config.h:34,
from src/candidate_structs.h:27,
from src/candidate_generation.h:27,
from src/database.h:27,
from src/building.h:27,
from src/building.cpp:24:
.../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/bits/stl_algobase.h:230:5: note: candidate: 'template<class _Tp> constexpr const _Tp& std::min(const _Tp&, const _Tp&)'
230 | min(const _Tp& __a, const _Tp& __b)
| ^~~
.../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/bits/stl_algobase.h:230:5: note: template argument deduction/substitution failed:
In file included from src/options.h:31,
from src/candidate_structs.h:28,
from src/candidate_generation.h:27,
from src/database.h:27,
from src/building.h:27,
from src/building.cpp:24:
src/taxonomy.h:980:45: note: deduced conflicting types for parameter 'const _Tp' ('unsigned int' and 'long unsigned int')
980 | const unsigned numThreads = std::min(4U, numNewTargets / (1U << 10));
| ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from .../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/bits/char_traits.h:39,
from .../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/string:40,
from src/bitmanip.h:29,
from src/dna_encoding.h:27,
from src/hash_dna.h:27,
from src/config.h:34,
from src/candidate_structs.h:27,
from src/candidate_generation.h:27,
from src/database.h:27,
from src/building.h:27,
from src/building.cpp:24:
.../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/bits/stl_algobase.h:278:5: note: candidate: 'template<class _Tp, class _Compare> constexpr const _Tp& std::min(const _Tp&, const _Tp&, _Compare)'
278 | min(const _Tp& __a, const _Tp& __b, _Compare __comp)
| ^~~
.../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/bits/stl_algobase.h:278:5: note: template argument deduction/substitution failed:
In file included from src/options.h:31,
from src/candidate_structs.h:28,
from src/candidate_generation.h:27,
from src/database.h:27,
from src/building.h:27,
from src/building.cpp:24:
src/taxonomy.h:980:45: note: deduced conflicting types for parameter 'const _Tp' ('unsigned int' and 'long unsigned int')
980 | const unsigned numThreads = std::min(4U, numNewTargets / (1U << 10));
| ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from .../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/algorithm:62,
from src/../dep/hpc_helpers/include/cuda_helpers.cuh:7,
from src/dna_encoding.h:30,
from src/hash_dna.h:27,
from src/config.h:34,
from src/candidate_structs.h:27,
from src/candidate_generation.h:27,
from src/database.h:27,
from src/building.h:27,
from src/building.cpp:24:
.../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/bits/stl_algo.h:3449:5: note: candidate: 'template<class _Tp> constexpr _Tp std::min(std::initializer_list<_Tp>)'
3449 | min(initializer_list<_Tp> __l)
| ^~~
.../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/bits/stl_algo.h:3449:5: note: template argument deduction/substitution failed:
In file included from src/options.h:31,
from src/candidate_structs.h:28,
from src/candidate_generation.h:27,
from src/database.h:27,
from src/building.h:27,
from src/building.cpp:24:
src/taxonomy.h:980:45: note: mismatched types 'std::initializer_list<_Tp>' and 'unsigned int'
980 | const unsigned numThreads = std::min(4U, numNewTargets / (1U << 10));
| ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from .../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/algorithm:62,
from src/../dep/hpc_helpers/include/cuda_helpers.cuh:7,
from src/dna_encoding.h:30,
from src/hash_dna.h:27,
from src/config.h:34,
from src/candidate_structs.h:27,
from src/candidate_generation.h:27,
from src/database.h:27,
from src/building.h:27,
from src/building.cpp:24:
.../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/bits/stl_algo.h:3455:5: note: candidate: 'template<class _Tp, class _Compare> constexpr _Tp std::min(std::initializer_list<_Tp>, _Compare)'
3455 | min(initializer_list<_Tp> __l, _Compare __comp)
| ^~~
.../miniconda3/envs/compile/x86_64-conda-linux-gnu/include/c++/11.3.0/bits/stl_algo.h:3455:5: note: template argument deduction/substitution failed:
In file included from src/options.h:31,
from src/candidate_structs.h:28,
from src/candidate_generation.h:27,
from src/database.h:27,
from src/building.h:27,
from src/building.cpp:24:
src/taxonomy.h:980:45: note: mismatched types 'std::initializer_list<_Tp>' and 'unsigned int'
980 | const unsigned numThreads = std::min(4U, numNewTargets / (1U << 10));
| ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make[1]: *** [Makefile:215: build_release/building.o] Error 1
make[1]: Leaving directory '/mnt/data/local_tools/metacache'
make: *** [Makefile:137: release] Error 2
Just make, or e.g. make MACROS="-DMC_TARGET_ID_TYPE=uint32_t -DMC_WINDOW_ID_TYPE=uint32_t", works.
Am I calling it wrong somehow? make MACROS="-DMC_KMER_TYPE=uint64_t" from the example also fails in the same way.
Best
Oskar
Hi,
that was a bug which was luckily easy to fix.
If you update to the latest release everything should work fine.
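For readers hitting the same error: std::min takes both arguments as the same type T, so calling it with an unsigned int literal and a 64-bit id leaves the compiler unable to deduce T. Below is a minimal C++ sketch of two common ways to resolve that kind of call. It is not necessarily the change made in MetaCache; the target_id alias and the function names are hypothetical stand-ins.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for mc::target_id when built with
// -DMC_TARGET_ID_TYPE=uint64_t.
using target_id = std::uint64_t;

// std::min(4U, numNewTargets / (1U << 10)) does not compile here:
// the first argument is unsigned int, the second is a 64-bit type,
// and std::min(const T&, const T&) needs a single T for both.

// Option 1: name the template argument so both operands convert to target_id.
constexpr unsigned num_threads_explicit(target_id numNewTargets) {
    return static_cast<unsigned>(
        std::min<target_id>(4U, numNewTargets / (1U << 10)));
}

// Option 2: give the literal the id type so deduction sees one type.
constexpr unsigned num_threads_cast(target_id numNewTargets) {
    return static_cast<unsigned>(
        std::min(target_id{4}, numNewTargets / (1U << 10)));
}
```

Both variants cap the result at 4, so narrowing back to unsigned is safe. Which form reads better is a matter of taste; the explicit template argument keeps the original expression almost unchanged.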
Thanks, compilation works now!
The build output still reports window id type unsigned int 32 bits during a test database build, though, after compiling with make MACROS="-DMC_TARGET_ID_TYPE=uint64_t -DMC_WINDOW_ID_TYPE=uint64_t -DMC_KMER_TYPE=uint64_t":
Building new database '.../databases/metacache/mc_build_test/mc_build_test' from reference sequences.
Max locations per feature set to 254
Reading taxon names ... done.
Reading taxonomic node mergers ... done.
Reading taxonomic tree ... 2564271 taxa read.
Taxonomy applied to database.
------------------------------------------------
MetaCache version 2.3.1 (20230309)
database version 20200820
------------------------------------------------
sequence type mc::char_sequence
target id type unsigned long int 64 bits
target limit 18446744073709551615
------------------------------------------------
window id type unsigned int 32 bits
window limit 4294967295
window length 127
window stride 108
------------------------------------------------
sketcher type mc::single_function_unique_min_hasher<unsigned long, mc::same_size_hash<unsigned long> >
feature type unsigned long int 64 bits
feature hash mc::same_size_hash<unsigned long>
kmer size 20
kmer limit 32
sketch size 16
------------------------------------------------
bucket size type unsigned char 8 bits
max. locations 254
location limit 254
------------------------------------------------
Reading sequence to taxon mappings from .../mc_build_test/assembly_summary.txt
Reading sequence to taxon mappings from .../ncbi_taxonomy/assembly_summary_refseq.txt
Reading sequence to taxon mappings from .../ncbi_taxonomy/assembly_summary_refseq_historical.txt
Reading sequence to taxon mappings from .../ncbi_taxonomy/assembly_summary_genbank.txt
Reading sequence to taxon mappings from .../ncbi_taxonomy/assembly_summary_genbank_historical.txt
Processing reference sequences.
Added 29652 reference sequences in 415.539 s %
targets 29652
ranked targets 29652
taxa in tree 2564271
------------------------------------------------
buckets 964032481
bucket size max: 254 mean: 1.31966 +/- 3.95271 <> 42.6348
features 705592365
dead features 0
locations 931144669
------------------------------------------------
All targets are ranked.
Writing database to file ... Writing database metadata to file '.../databases/metacache/mc_build_test/mc_build_test.meta' ... done.
Writing database part to file '.../databases/metacache/mc_build_test/mc_build_test.cache0' ... done.
done.
Total build time: 625.363 s
Build command:
${p2mc}/metacache build ${p2d}/metacache/mc_build_test/mc_build_test \
${p2d}/mc_build_test \
-taxonomy ${p2taxdb} \
-kmerlen 20
Yeah, there was another bug, probably introduced with the GPU version. It is fixed now as well.
I didn't make another release for this, so just git pull the latest changes from the repository.
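As background on the build banner quoted above: the "bits" and "limit" lines follow directly from whichever integer type the build selected, so a 32-bit window id despite -DMC_WINDOW_ID_TYPE=uint64_t means the macro did not reach that type. Here is a rough sketch of how a macro-selected type and its reported values relate. The fallback default, the window_id alias, and the helper names are assumptions for illustration, not MetaCache's actual code.

```cpp
#include <cstdint>
#include <limits>

// Assumed pattern: a -D macro overrides a default type.
// The 32-bit default used here is a guess for illustration.
#ifndef MC_WINDOW_ID_TYPE
#define MC_WINDOW_ID_TYPE std::uint32_t
#endif
using window_id = MC_WINDOW_ID_TYPE;

// Number of value bits of an unsigned id type (the "bits" line).
template<class T>
constexpr int id_bits() { return std::numeric_limits<T>::digits; }

// Largest representable id (the "limit" line).
template<class T>
constexpr std::uint64_t id_limit() { return std::numeric_limits<T>::max(); }

// True if the id type has at least MinBits value bits.
template<class T, int MinBits>
constexpr bool wide_enough() { return id_bits<T>() >= MinBits; }

// Hypothetical guard: stops the build if the id type is narrower than needed.
static_assert(wide_enough<window_id, 32>(),
              "window id type narrower than 32 bits");
```

Raising the guard to 64 in a build that is meant to use 64-bit window ids would turn a silently ignored macro into a compile error instead of a surprise in the banner.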
Thanks, works!
I do have a few questions regarding database partitioning and parameters for very large reference data sets. Should I open a new issue for that?
Yes, a new issue would be better.