examples of backup policy where keys are greater than uint128_t
rmanaloto-tastytrade opened this issue · 2 comments
in the FAQ it mentions a backup policy/customization point can be setup for the lookup implementation. Do you have examples of this. Your template metaprogramming skills are way above mine, so I'm not quite sure how to pull it off. Seems like I need a specialization on the find/lookup method?
And there's a section in the FAQ also regarding setting the probability and cache size alignment. Do you have examples for this also?
This is currently, an example where i need the workaround (I'm working on binary encoder/decoder for the CME exchanges binary protocol (simple binary encoding/iLink3)
https://godbolt.org/z/GxTeabvdx
we chatted previously regarding your reflect library
This is the second part of where i am going w my implementation. I am using the reflect library to get me the total number of enums. I will then pair that with another compile time map where I can override the to_string output based on different systems. I just hit this snag so I just isolated it down to where it is having compilation issues.
Thanks for the info, the compilation issues are because the mask can't be found at compile-time for given enum strings as they are too long, there is a way to customize, but not on underlying element bases bur rather of entry set properties - https://godbolt.org/z/dnc4x8hTT.
Anyway, the way I'd approach this particular example is by just using last 8 characters and just grab as each message can be uniquely identified by that which can be checked at compile-time.
Long story short, the following is a simple example how it can be approached.
// get the short enum names
template<class T, std::size_t N = sizeof(std::uint64_t)> requires std::is_enum_v<T>
constexpr auto enumerators = [] {
std::array<std::pair<std::string_view, std::underlying_type_t<T>>, reflect::enumerators<T>.size()> enumerators{};
for (auto i = 0u; const auto& [enumerator, name] : reflect::enumerators<T>) {
const auto enum_name = std::string_view{name};
enumerators[i++] = std::pair{enum_name.substr(enum_name.size() - N), enumerator};
}
return enumerators;
}();
// input data
template<auto Size, std::size_t N = sizeof(std::uint64_t)> // may need different overloads depending on the input data
[[nodiscard]] constexpr auto data(const char (&str)[Size]) { return &str[Size - N - 1u]; }
// safe lookup
inline constexpr auto find = mph::find<enumerators<MessageIds>>;
static_assert(0 == *find(data("Negotiate500")));
static_assert(1 == *find(data("NegotiationResponse501")));
static_assert(2 == *find(data("NegotiationReject502")));
static_assert(3 == *find(data("Establish503")));
static_assert(4 == *find(data("EstablishmentAck504")));
static_assert(5 == *find(data("EstablishmentReject505")));
static_assert(6 == *find(data("Sequence506")));
static_assert(7 == *find(data("Terminate507")));
static_assert(8 == *find(data("RetransmitRequest508")));
static_assert(9 == *find(data("Retransmission509")));
static_assert(10 == *find(data("RetransmitReject510")));
static_assert(11 == *find(data("NotApplied513")));
static_assert(12 == *find(data("NewOrderSingle514")));
static_assert(13 == *find(data("OrderCancelReplaceRequest515")));
static_assert(14 == *find(data("OrderCancelRequest516")));
static_assert(15 == *find(data("MassQuote517")));
static_assert(16 == *find(data("PartyDetailsDefinitionRequest518")));
static_assert(17 == *find(data("PartyDetailsDefinitionRequestAck519")));
static_assert(18 == *find(data("BusinessReject521")));
static_assert(19 == *find(data("ExecutionReportNew522")));
static_assert(20 == *find(data("ExecutionReportReject523")));
static_assert(21 == *find(data("ExecutionReportElimination524")));
static_assert(22 == *find(data("ExecutionReportTradeOutright525")));
static_assert(23 == *find(data("ExecutionReportTradeSpread526")));
static_assert(24 == *find(data("ExecutionReportTradeSpreadLeg527")));
static_assert(25 == *find(data("QuoteCancel528")));
static_assert(26 == *find(data("OrderMassActionRequest529")));
static_assert(27 == *find(data("OrderMassStatusRequest530")));
static_assert(28 == *find(data("ExecutionReportModify531")));
static_assert(29 == *find(data("ExecutionReportStatus532")));
static_assert(30 == *find(data("OrderStatusRequest533")));
static_assert(31 == *find(data("ExecutionReportCancel534")));
static_assert(32 == *find(data("OrderCancelReject535")));
static_assert(33 == *find(data("OrderCancelReplaceReject536")));
static_assert(34 == *find(data("PartyDetailsListRequest537")));
static_assert(35 == *find(data("PartyDetailsListReport538")));
static_assert(36 == *find(data("ExecutionAck539")));
static_assert(37 == *find(data("RequestForQuote543")));
static_assert(38 == *find(data("NewOrderCross544")));
static_assert(39 == *find(data("MassQuoteAck545")));
static_assert(40 == *find(data("RequestForQuoteAck546")));
static_assert(41 == *find(data("ExecutionReportTradeAddendumOutright548")));
static_assert(42 == *find(data("ExecutionReportTradeAddendumSpread549")));
static_assert(43 == *find(data("ExecutionReportTradeAddendumSpreadLeg550")));
static_assert(44 == *find(data("SecurityDefinitionRequest560")));
static_assert(45 == *find(data("SecurityDefinitionResponse561")));
static_assert(46 == *find(data("OrderMassActionReport562")));
static_assert(47 == *find(data("QuoteCancelAck563")));
static_assert(48 == *find(data("ExecutionReportPendingCancel564")));
static_assert(49 == *find(data("ExecutionReportPendingReplace565")));
Produced assembly
$find(unsigned long): # @"$find(unsigned long)"
movabs rax, 434315889064543778
pext rax, rdi, rax
shl eax, 4
lea rcx, [rip + mph::v5_0_1::type_traits::constant_t<mph::v5_0_1::find$pext<enumerators<MessageIds, 8ul>>>::value]
cmp qword ptr [rax + rcx], rdi
sete al
ret
find:
/// lookup table
That can be further optimized as I think that last 4 characters are enough in such case. The data call may need specialization on the input used. If only valid keys (only enum names) will be used that can be optimized further with probability 100 for find or with lookup. If input strings are string literals pointers could be used instead for comparison but it depends on the use case.
So the lookup (unsafe) would look as follows:
$lookup(unsigned long): # @"$lookup(unsigned long)"
movabs rax, 434315889064543778
pext rax, rdi, rax
lea rcx, [rip + lookup]
movzx eax, byte ptr [rax + rcx]
ret
lookup:
.ascii "\000\000\000\000\000\000\000\t\000\000\b\000\000\000\001\000\000\000\000\000\000\000\n\000\000\000\020\021\000\000\000\000\000\000\000\000\000\000\022\000\000\000\032\000\000\000\031\000\000\000#\000\000\000\000\000\000\000\033\034\000\000\000$\000\000\000)\000\000*\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000+\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000,\000\000\000-\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\002\000\000\000\000\003\000\000\000\000\000\000\000\000\000\000\000\000\000\000\013\000\000\000\000\000\000\000\000\000\000\000\024\000\000\000\000\023\000\000\000\000\000\000\035\000\000\000\000\000\000\000\036\000\000\000\000\000\000\000\000\000\000\000\000\000%\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000.\000\000\000\000\000\000\000/\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\005\000\000\000\000\004\000\000\000\000\000\000\000\000\000\000\000\000\000\000\r\000\000\000\f\000\000\000\000\026\000\000\000\000\000\000\025\000\000\000\000\000\000\000\000\000\000\000 \000\000\000\000\000\000\000\037\000\000\000\000\000\000&\000\000\000\000'\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\0001\000\000\0000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\006\000\000\000\007\000\000\000\000\000\000\000\000\000\017\000\016\000\000\000\000\000\000\000\000\000\000\000\027\000\000\000\030\000\000\000\000\000\000\000\000\000\000\000!\000\000\000\"\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000("
Full example - https://godbolt.org/z/rrbfMrY66
If you are comfortable with last 4 characters that will make the lookup tables a bit smaller but it's also a bit more unsafe - https://godbolt.org/z/vsbTqaKoK
The following will the x86-64 assembly for gcc
$find(unsigned int):
mov edx, 252116992
pext edx, edi, edx
mov edx, edx
movzx eax, BYTE PTR mph::v5_0_1::type_traits::constant_t<mph::v5_0_1::find$pext<enumerators<MessageIds, 4ul> > >::value[4+rdx*8]
cmp DWORD PTR mph::v5_0_1::type_traits::constant_t<mph::v5_0_1::find$pext<enumerators<MessageIds, 4ul> > >::value[0+rdx*8], edi
mov edx, 0
cmovne eax, edx
ret
$lookup(unsigned int):
mov eax, 252116992
pext edi, edi, eax
mov edi, edi
movzx eax, BYTE PTR lookup[rdi]
ret
Full example - https://godbolt.org/z/83vGhY7M8
Thank you very much for the quick response. I'm reviewing this now and will give it a shot later today. I'll send back an update afterwards.
Have you considered exposing a runtime version of this library? It seems like most of the functionality is there to support a runtime version (with maybe a slightly slower performance).
One use case for a runtime version for trading is where the universe of securities to trade on an exchange/venue is not known at compile time, but at system startup. The full security list is usually loaded up via a config file or loading it up via a market data feed at the start of the session.
So having a universal api to handle both use cases would be super useful. We do the runtime mph for the security list using the cmph library. It works, but that library is pretty old and being able to take advantage of your pext logic for small strings (equity symbols should satisfy the uint128_t requirement) would help optimize our lookups.