Alphabet conversion is not properly constrained
rrahn opened this issue · 5 comments
Does this problem persist on the current main?
- I have verified the issue on the current main
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
Explicit casting of a bitpacked proxy type to its actual value type causes a compile time error.
See #3264 for more details.
Expected Behavior
It compiles.
Steps To Reproduce
#include <vector>
#include <seqan3/alphabet/container/bitpacked_sequence.hpp>
#include <seqan3/alphabet/nucleotide/dna4.hpp>
using namespace seqan3::literals;
seqan3::bitpacked_sequence<seqan3::dna4> source{"ACGGTCAGGTTC"_dna4};
auto it = source.begin();
seqan3::dna4 val = seqan3::dna4{*it}; // compile error - due to explicit conversion
Environment
- Operating system: macOS 14.5
- SeqAn version: main
- Compiler: gcc 13.3
Anything else?
One problem is this code:
seqan3/include/seqan3/alphabet/detail/convert.hpp
Lines 32 to 44 in ca4d668
Here, we only require in_t
to be an alphabet type but this does not include that in_t
is also default constructible. However, inside of the initialization of the conversion table the default constructor of in_t
is called, causing the seen compile error.
To fix this the in_t
must be constrained with std::default_initializable
as well.
diff --git a/include/seqan3/alphabet/detail/convert.hpp b/include/seqan3/alphabet/detail/convert.hpp
index c3ebda2a5..8405d6a9a 100644
--- a/include/seqan3/alphabet/detail/convert.hpp
+++ b/include/seqan3/alphabet/detail/convert.hpp
@@ -30,6 +30,7 @@ namespace seqan3::detail
* \hideinitializer
*/
template <alphabet in_t, alphabet out_t>
+ requires std::default_initializable<in_t>
constexpr std::array<out_t, alphabet_size<in_t>> convert_through_char_representation
{
[]() constexpr {
diff --git a/include/seqan3/alphabet/nucleotide/nucleotide_base.hpp b/include/seqan3/alphabet/nucleotide/nucleotide_base.hpp
index f1c7bb054..5f97b42bc 100644
--- a/include/seqan3/alphabet/nucleotide/nucleotide_base.hpp
+++ b/include/seqan3/alphabet/nucleotide/nucleotide_base.hpp
@@ -77,6 +77,7 @@ public:
template <typename other_nucl_type>
requires (!std::same_as<nucleotide_base, other_nucl_type>)
&& (!std::same_as<derived_type, other_nucl_type>) && nucleotide_alphabet<other_nucl_type>
+ && std::default_initializable<other_nucl_type>
explicit constexpr nucleotide_base(other_nucl_type const & other) noexcept
{
static_cast<derived_type &>(*this) =
should work.
Unfortunately, it's not enough to constrain the array (actually, we don't have to), we also have to constrain the ctor.
We probably have this ctor in multiple alphabets...
Maybe there is a way to construct the array if in_t
is not default initializable?
Without confirming via debugging, I figure that this is the implicit conversion:
seqan3/include/seqan3/alphabet/detail/alphabet_proxy.hpp
Lines 173 to 201 in ca4d668
seqan::bitpacked_sequence::reference_proxy_type
inherits from alphabet_proxy
.
seqan3/include/seqan3/alphabet/nucleotide/nucleotide_base.hpp
Lines 77 to 84 in ca4d668
is the explicit ctor.
By constraining this one, we fall back to implicit conversion (Unless I'm mixing up conversion/casting rules).
I can confirm these findings. And correctly constraining the explicit constructor should do it. To fix this, I am planning to add a detail concept called convertable_through_char_representation
that should be used for these constructors.
Something like that would also work, though it's not that nice...
diff --git a/include/seqan3/alphabet/detail/alphabet_proxy.hpp b/include/seqan3/alphabet/detail/alphabet_proxy.hpp
index 7be8044a8..953a59cd4 100644
--- a/include/seqan3/alphabet/detail/alphabet_proxy.hpp
+++ b/include/seqan3/alphabet/detail/alphabet_proxy.hpp
@@ -133,6 +133,8 @@ public:
//!\brief The alphabet size.
static constexpr auto alphabet_size = seqan3::alphabet_size<alphabet_type>;
+ using alphabet_t = alphabet_type;
+
/*!\name Write functions
* \brief All of these call the emulated type's write functions and then delegate to
* the assignment operator which invokes derived behaviour.
diff --git a/include/seqan3/alphabet/detail/convert.hpp b/include/seqan3/alphabet/detail/convert.hpp
index c3ebda2a5..0c172be14 100644
--- a/include/seqan3/alphabet/detail/convert.hpp
+++ b/include/seqan3/alphabet/detail/convert.hpp
@@ -30,18 +30,19 @@ namespace seqan3::detail
* \hideinitializer
*/
template <alphabet in_t, alphabet out_t>
-constexpr std::array<out_t, alphabet_size<in_t>> convert_through_char_representation
-{
- []() constexpr {
+constexpr std::array<out_t, alphabet_size<in_t>> convert_through_char_representation{
+ []() constexpr
+ {
std::array<out_t, alphabet_size<in_t>> ret{};
- // for (decltype(alphabet_size<in_t>) i = 0; ...) causes indefinite compilation :(
- for (auto i = decltype(alphabet_size<in_t>){0}; i < alphabet_size<in_t>; ++i)
- assign_char_to(to_char(assign_rank_to(i, in_t{})), ret[i]);
+ static_assert(std::default_initializable<in_t> || requires { typename in_t::alphabet_t; });
+ using value_t = std::conditional_t<std::default_initializable<in_t>, in_t, typename in_t::alphabet_t>;
+
+ for (auto i = decltype(alphabet_size<value_t>){0}; i < alphabet_size<value_t>; ++i)
+ assign_char_to(to_char(assign_rank_to(i, value_t{})), ret[i]);
return ret;
- }()
-};
+ }()};
// clang-format on
} // namespace seqan3::detail
Sure, that seems valid. However, it feels too specific to seqan3. In general, any third party alphabet proxy should work as well but now we need to constraint that the proxy type exposes the alphabet_t type for which in_t
is a proxy for. And then we would also need to check whether that type is also default_initializable, since alphabets to not require this.