lightningdevkit/rust-lightning

Some thoughts on payment splitting heuristics for multi part payments

renepickhardt opened this issue · 1 comment

This is based on some out-of-band communication with @TheBlueMatt:

In #1276 there was the demand to revisit the splitting heuristic of payments in LDK. Back then I argued that the splitting problem is exactly what would be solved through optimally reliable payment flows, which are proposed to be added in #1668. However, I think I have to withdraw or at least weaken some of my claims from back then, as I will lay out in this issue.

The following originated from the comments by @TheBlueMatt on the mailing list:

While it's much less capital-efficient, the ability to over-commit upfront and then only allow the recipient to claim a portion of the total committed funds would substantially reduce the impact of failed HTLCs on payment latency. Of course the extra round-trip to request the "unlock keys" for the correct set of HTLCs adds a chunk to total latency, so senders will have to be careful about deciding when to do this or not.

Still, now that we have onion messages, we should do (well, try) this! It's not super complicated to implement (like everything, it seems, the obvious implementation forgoes proof-of-payment, and like everything the obvious solution is PTLCs, I think). It's not clear to me how we get good data from trials, though; we'd need a sufficient set of the network to support this that we could actually test it, which is hard to get for a test.

Maybe someone (anyone?) wants to do some experiments doing simulations using real probing success
rates to figure out how successful this would be and propose a concrete sender strategy that would
improve success rates.

While digging into redundant overpayments I figured out several noteworthy things:

  1. Minimum-cost flows may not yield the maximum expected value to be delivered. A counterexample can be seen at: https://github.com/renepickhardt/Maximum-Expected-Value-Flows-For-Redundant-Overpayments/blob/main/img/counterExample.png
  2. Redundant overpayments can be understood as: "At least k out of n payment parts need to successfully reach the recipient". Thus the method of Bernoulli trials can be applied (as originally suggested in the probabilistic path finding paper for non-redundant payments); see the sketch after this list.
  3. While in the above-mentioned paper I made poor / strange assumptions about the cost function and channel size to develop the theory, I have since observed that the only necessary condition for a Bernoulli trial is that the success probability of each path has to be (roughly) the same. This can be controlled for by defining a target probability for the various paths and assigning the sats for each onion / path such that the estimated success probability reaches the target.
  4. These observations lead to a natural splitting heuristic, which seems reasonable (in particular, but not only, for redundant overpayments) and useful in simulations, but I don't have the infrastructure or resources to test it on mainnet.
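
For concreteness, here is a minimal sketch of the Bernoulli-trial view in Python, assuming every part has been tuned to (roughly) the same per-part success probability p; the function name and the example numbers are mine, not from the paper:

```python
from math import comb

def prob_at_least_k_of_n(n: int, k: int, p: float) -> float:
    """Probability that at least k of n independent payment parts arrive,
    when every part has (roughly) the same success probability p
    (Bernoulli trials / binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Example: send 8 onions where only 6 need to arrive, each tuned to p = 0.8:
print(prob_at_least_k_of_n(n=8, k=6, p=0.8))  # ~0.797
```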

Summary of the splitting algorithm

This is currently WIP and many choices are ad hoc, but it shows the principle. The following is a summary of: renepickhardt/pickhardtpayments#36

We use Dijkstra to compute candidate paths as shortest paths with respect to our cost function:

  1. Use a per-unit cost function that combines the linearized routing cost and the linearized uncertainty cost.
  2. Smooth the routing cost (currently Laplace smoothing with +100; the exact value needs to be found through tests and feature engineering).
  3. Thus: cost = (ppm + 100) * 1/capacity
  4. After a path is planned, increase the cost of each of its edges by a multiplier (see the sketch below).
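
A minimal sketch of this candidate-path loop in Python. Only the cost formula and the penalize-after-planning step come from the list above; the graph representation, the multiplier value of 2.0, and all helper names are my own assumptions:

```python
import heapq

SMOOTHING = 100    # Laplace smoothing from step 2; the exact value is still WIP
MULTIPLIER = 2.0   # ad hoc edge penalty for step 4; assumed value, not from the issue

def edge_cost(ppm, capacity):
    # Step 3: cost = (ppm + 100) * 1/capacity.
    return (ppm + SMOOTHING) / capacity

def cheapest_path(adj, cost, src, dst):
    """Plain Dijkstra over adj = {node: [(peer, edge_id), ...]} using the
    current per-edge costs; returns the path as a list of edge_ids, or None."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, eid in adj.get(u, ()):
            nd = d + cost[eid]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, (u, eid)
                heapq.heappush(heap, (nd, v))
    if dst not in prev:
        return None
    path, node = [], dst
    while node != src:
        node, eid = prev[node]
        path.append(eid)
    return path[::-1]

def candidate_paths(adj, edges, src, dst, n_paths):
    """edges: {edge_id: (ppm, capacity)}. Plans up to n_paths candidates,
    inflating the cost of every used edge after each round (step 4)."""
    cost = {eid: edge_cost(ppm, cap) for eid, (ppm, cap) in edges.items()}
    paths = []
    for _ in range(n_paths):
        path = cheapest_path(adj, cost, src, dst)
        if path is None:
            break
        paths.append(path)
        for eid in path:
            cost[eid] *= MULTIPLIER
    return paths
```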

With respect to allocation of funds to paths

The cost function favors paths with low routing costs, so the only remaining question is how many sats to allocate to each path.
The following idea shall be used for motivation:

  • We want each path to have a success probability of at least x (e.g. x = 0.9).
  • Thus let l be the length of the planned path and c be the capacity of its smallest channel.
  • Then s = (c + 1 - a) / (c + 1) is the single-hop success probability for the allocated amount a. (Of course one does not have to use a uniform distribution to estimate the success probabilities; a different prior could be used, but the following formulas would then change.)
  • We require s >= x ** (1/l), which means the path, consisting of l hops that each succeed with probability s, succeeds with probability s ** l >= x.
  • For now we ignore prior knowledge and beliefs (thus one reason for this to be WIP).
  • Taking s at equality we solve for a --> a = (c+1) - s*(c+1) = (c+1)*(1-s)
  • We allocate a sats to the candidate path (see the sketch after this list).
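
Spelled out as code, the allocation rule looks like the following sketch, using the uniform prior stated above; the function name and the example values are mine:

```python
def allocate(capacity: int, path_len: int, target: float) -> int:
    """Sats to put on a path so its estimated success probability is ~target.

    A single channel of capacity c forwards a sats with probability
    s = (c + 1 - a) / (c + 1) under the uniform prior, and a path of l such
    channels succeeds with probability s**l, so we need s = target**(1/l)
    and solve for a = (c + 1) * (1 - s)."""
    s = target ** (1.0 / path_len)
    return int((capacity + 1) * (1.0 - s))

# Example: 3-hop path, bottleneck capacity 1_000_000 sats, 90% target.
# Per-hop requirement s ~ 0.9655, so we put ~34_500 sats on this onion.
print(allocate(capacity=1_000_000, path_len=3, target=0.9))
```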

Comparison to existing splitters

Existing splitters operate on the following ideas:

  • Divide by 2 if the amount seems too large after several attempts (I think this is the LND rationale).
  • Define a fixed number of parts one wishes to have and divide into that many parts (I think this is LDK).
  • Define a payment size for the parts and split down to that size (this is one of the rationales of Core Lightning).

None of the above-mentioned ideas starts from reliability, so none of them can be used to derive service level objectives. With the method I proposed one can start with a clear service level objective and obtain parts that all have the same success probability. Obviously the single parts can be derived with cost functions that focus more on fees, more on reliability, or on any other feature that one may find desirable.

While this is not directly an issue, I thought I'd drop this here as there was previously a discussion about splitting heuristics and, as mentioned, I have to withdraw some of my too-strong claims.

Mmm, interesting, yea, that's nifty, thanks! We could probably do this by pushing the decision of how much to send over a channel into the scorer, and then the router just does whatever the scorer says. From there we could configure the scorer to do the above calculation, or something else based on whatever its probability knob says.
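
To illustrate the division of labor being suggested, a hypothetical interface sketch; this is not LDK's actual scorer API, just the shape of the idea:

```python
from typing import Callable

class AmountScorer:
    """Hypothetical: the router asks the scorer how many sats to send over a
    channel instead of only asking for a cost, so the splitting decision
    lives entirely behind the scorer's probability knob."""

    def __init__(self, amount_rule: Callable[[int, int], int]):
        # amount_rule(capacity, path_len) -> sats, e.g. the allocation
        # function sketched earlier, or any other calculation.
        self.amount_rule = amount_rule

    def amount_for_channel(self, capacity: int, path_len: int) -> int:
        return self.amount_rule(capacity, path_len)

# Example wiring with the earlier allocate() rule at a 90% target:
# scorer = AmountScorer(lambda c, l: allocate(c, l, target=0.9))
```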

One related thing is I want to move to a nonlinear probability distribution soon. I've been playing with it a bit and, absent a better idea, will probably do a distribution of something like (a/c - 0.5)**4 (or 6 or 8, dunno which exponent yet), mostly because it's super easy to calculate the integral of in nice fast fp math (the ^4 one implies 3 fp multiplies to get the ^5 for the integral; ^6 or ^8 are 4 fp multiplies). I'm not quite sure how to fit that into the above in a way that's fast, but I also haven't thought deeply about it, and we could change to a different function as long as we can calculate it fast.
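
For reference, a sketch of how such a distribution evaluates with the handful of multiplies mentioned; the density shape is the one from the paragraph above, while the normalization and the names are my assumptions:

```python
def pow5(t):
    # t**5 with 3 fp multiplies, as noted above: t2 = t*t, t4 = t2*t2, t4*t.
    t2 = t * t
    return t2 * t2 * t

def success_probability(a, c):
    """P(channel of capacity c can forward a sats) under the assumed
    unnormalized liquidity density f(x) = (x/c - 0.5)**4 on [0, c].
    Its antiderivative is proportional to (x/c - 0.5)**5, so
    P = (pow5(0.5) - pow5(a/c - 0.5)) / (pow5(0.5) - pow5(-0.5))."""
    return (pow5(0.5) - pow5(a / c - 0.5)) / (pow5(0.5) - pow5(-0.5))

# Sanity checks: a = 0 gives 1.0, a = c gives 0.0, a = c/2 gives 0.5.
print(success_probability(0, 1_000_000), success_probability(500_000, 1_000_000))
```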