freedomlayer/offset

Index servers are too optimistic (fees not accounted for)

realcr opened this issue · 22 comments

Index servers

Index servers do mainly two things:

  1. Collect information about the network: For every node, index servers collect his current available credit capacity with his direct friends.
  2. Answer queries for routes of certain capacity.

Problem with current implementation

The current implementation of Index servers is too optimistic, because index servers do not take into account the fees for sending credits along routes.

For example, with the current implementation, if a node B wants to send 10 credits to a remote node E, the following route may be proposed:

B -- C -- D -- E

Where the available capacity is as follows:

  • B -> C: 10
  • C -> D: 10
  • D -> E: 10.

A route with exactly those capacities for sending credits is not sufficient for passing 10 credits, because fees needs to be paid to the intermediate nodes. Therefore, at least the following available capacities are required:

  • B -> C: 12
  • C -> D: 11
  • D -> E: 10

The implementation of the Index servers needs to be updated to take into account the payment fees to mediators.

Relevant code

The relevant code is at: components/index_server/src/graph/simple_capacity_graph.rs, which relies on the implementation at bfs.rs.

The current implementation relies on running a BFS algorithm to find a path of certain capacity from a source node to a destination node. We need to take into account the fees to the transaction mediators in this search algorithm.

Other relevant code is somewhat far away, at components/funder/src/credit_calc.rs. This is the piece of code that contains the actual calculation for fees to mediators during a transaction. The crate offst-index-server should not have the crate offst-funder as a dependency, so maybe it could be a good idea to move credit_calc.rs to offst-proto, possibly now or in the future.

Proposal for solution algorithm

It might be possible to start a graph search algorithm that goes backwards (From the destination back to the source). The search algorithm can be somewhat like BFS, but we need to make sure that doing something like this is actually sound, and we can't possibly miss valid routes or return wrong routes.

How to work on this

This is a task that requires some prior research. Please write a short proposal of what you intend to do before you start coding, hopefully I might be able to say something useful that could help you out.

I would happily work on this (or at least try). It seems to me that there are two main approaches, both requiring a reverse-BFS, but with quite different APIs and capabilities:

  1. Having the fees be passed to the graph when looking for a route. This can be either:
    1. a uint128, which we then repeatedly add, moving the code from credit_calc over; or
    2. something richer such as an iterator (of fees or of total amounts), which lets us specify all the fees along the path (an iterator should be enough because it's BFS), where we leave the code with the caller.
  2. Having the fees be (optionally) node specific. This means the index server, along with edges and capacities, maintains a fee-calculation function for each node, and uses it to calculate fees in the reverse-BFS. This can be at first done with a static fee calculator, so nothing changes in the protocol, but can be later extended so that nodes can actually somehow specify what their fees are. This might complicate the BFS (in that the cost of a path is not solely determined by its length), but I don't believe this should be difficult to solve.

@amosonn: Hi, thanks for helping out!

I would go with 1.i.
I'm still not sure if we ever add the fee selection features (By the way, we used to have them about 4 months ago I believe?). The real complication with allowing nodes to change their fees is at the protocol side, I wouldn't try to be too generic at the BFS code at this point.

I tried to think about 1.ii for a while, I can't understand it. how do you pass all the fees in the graph using an iterator? If you only pass an iterator for fees along a route, how can you select that route ahead of time (Before finding that route?)?

This might complicate the BFS

It's an interesting problem indeed. BFS with weighted distances turns into Dijkstra. I wonder if reverse dijkstra is a correct solution, or something is still missing. I haven't thought about it yet.

@pzmarzly : I hand this over to Amos. I promise to arrange you something at least as interesting as this one (:

I don't want to step on any toes, if @pzmarzly already started working on this, I'll find something different :)

In general, for all of these solutions, it seems to me that a reverse-BFS or reverse-Dijkstra (we actually need something more complicated with a heap, since we want to return all routes, but it should still work) should work just as well as the forward ones. If you look at this as searching for a path to transfer debt along, this is exactly symmetric to looking at a path to transfer credits along. Like transferring holes instead of electrons.

Regarding 1.ii., I meant something along the lines of requested_capacities: 5, 6, 7, 8, ... So you look in reverse for a path where the first edge has capacity of at least 5, the second of at least 6, etc. This is a more cumbersome API, but it leaves the control over calculating the required capacities in the hands of the credit system, I assumed there was a reason this calculation landed there originally. This does enable for instance fees which depend on the amount transferred: requested capacities: 1000, 1100, 1210, 1331, .. .

I haven't, I'll gladly leave this issue to you

Digging deeper, I arrived at another question: do we want these extra arguments to be passed over from the client as part of the request (updating proto::index_server::messages::RequestRoutes)? Or should this be implicit in the index server?

Yet another thought: who says the sender should be paying the fees? It's probably the easiest choice, other options have other considerations: if the receiver pays them, for instance, the sender has no interest in choosing a short path; if they split them half-half, finding a good path becomes even more complicated.

Still, I would say that supporting such other schemes might be something to future-proof against. Under the current protocol, can the receiver easily check what was the original amount transferred by the sender?

be passed over from the client

I forgot something very useful: There is a full and up to date design diagram of Offst here:
https://github.com/freedomlayer/offst_docs
Take a look at the diagrams directory. I plan to add this to the main documentation on this repository in the future.

Digging deeper, I arrived at another question: do we want these extra arguments to be passed over from the client as part of the request

I assume that by extra arguments you mean the fees that will be paid to each node along the way.
My suggestion is that we take this in smaller steps. We can assume for now that all the fees are always a constant 1 (Which might not be a good decision, this is indeed something to think about).

If in the future we decide for more flexible fees we can modify more things. Doing the change in RequestRoutes will take you through a very long journey all the way to the external command line interface. You probably don't want to do it on your first PR (:

Yet another thought: who says the sender should be paying the fees?

I dedicated a long time to think about who should pay the fees, my short answer is that I believe we should never allow this option, and only the sender pays the fees, mostly due to safety considerations.

My main objection follows:
Let's say that we chose some arbitrary number, like 10 percents, such that the receiver always pays 10 percent of the fees. Let's also suppose that we found a way technically to do it (I think there are also some difficulties here, but let's forget about it for a moment).

Consider this diagram:

B -- C -- D -- E

B wants to get credits for free. How to do this?
B imagines that it is actually a long path of nodes, like this:

                    B
/----------------------------------------\
B1 -- B2 -- B3 -- B4 -- ... -- B99 -- B100 -- C -- D -- E

There aren't real node there, but B can make it appear very real.
B now sends a credit from the simulated node B1 all the way to E. The transaction will cost to B about 2 credits (One for C and one for D).

From the point of view of E, the amount of fees paid in this transaction is 99 + 1 + 1 = 101 credits. So E will give about 10 credits.

Under the current protocol, can the receiver easily check what was the original amount transferred by the sender?

Yes, I'm quoting from the new proposal.
By the way, if you haven't yet checked it out, your feedback is highly appreciated.

struct RequestSendFundsOp {
        requestId @0: Uid;
        # Randomly generated reqeustId [128 bits]
        srcHashedLock @1: Hash;
        # sha512/256(srcPlainLock), where srcPlainLock is of size 256 bits.
        route @2: FriendsRoute;
        # A route of friends that leads to the destination
        destPayment @3: CustomUInt128;
        # Amount of credits to pay the destination
        invoiceId @4: InvoiceId;
        # A 256 bit value representing the invoice this request
        # intends to pay.
}

This is the first message arriving from the sender to the receiver (along the route).
destPayment is the amount of credits to be paid, route is the list of public keys of the intermediate nodes (In the past this is where I put the transaction fees too). Basically everything about the payment is known from the first request.

Still, I would say that supporting such other schemes might be something to future-proof against

I never managed to find a safe way to make the receiver pay the fees, therefore I propose to go with the simple implementation.

As a side note, you probably saw that the index server has a super naive implementation at this point:

  • It is probably not very performant (I'm sure we could do some lockless stuff there, instead currently all requests are queued and have to execute one by one).
  • If I remember correctly, only one route is returned every time (Instead of multiple routes).
  • Everything is saved in memory. Probably if the graph is large enough things will break. Maybe in the future we can have some postgresql integration.

Until we see real adoption I wouldn't bother dealing with those things at all (unless you really enjoy it)

Thanks for the detailed response!

I definitely felt there were funny things going on when the receiver pays fees, but I didn't think about making up nodes. This indeed seems broken enough that we shouldn't consider it an option.

This also means that it makes less sense for the client to pass the required fees to the server; seems more likely that the server will know better what fees are needed (e.g. in a case where nodes have custom fees). So for now, I will not pass any new argument, not even the required fixed fee, and leave that all to the server. This is definitely easier to implement ;)

Regarding the server implementation, I'll look around, if I see something low-hanging I'll change it, maybe in a different PR.

Regarding the new proposal, I read it, but didn't feel I had something to say. Maybe this means I haven't looked at it enough :)

I think I have now enough info to start writing.

@amosonn : As part of the redesign I plan to support more flexible transaction fees (According to ideas of yours and @SonOfLilit). I realize that you might have already done some work, and highly apologize for changing my mind, possibly rendering some of your work irrelevant.

If you think that the scope of this issue becomes too broad, don't worry, I can start writing a very basic prototype for the Index Server and if you want you could do the cool graph stuff when the API around stabilizes.
My plan is to start from the core (the Funder) and extend changes outside all the way to the communication protocols and API, so it could take a while.

I write here my general ideas for changes. If you ever have time to check it out please tell me what you think.

Fees parameters

My general idea was to let any node specify two values: a multiplier 0 <= m < 1 and an adder n for every outgoing edge. The fees paid for pushing credits along that edge will be mx + n.

Example:

B -- C -- D

C can configure m_{CD} = 2^-9 and n_{CD} = 5. If B wants to send a payment of 512 credits to C, then 512 * (2^-9) + 5 = 6 of fee credits will be paid to C.

Protocol changes

The part of the protocol that I expect to change is the communication between Node (Index Client) and Index Server. The rest of the protocol message will only have the results of the fees computation, and not the values m and n themselves.

Some important changes:

  • During a route request, the Index server should also get the amount of credits to be sent to find a cheap route (Because of the added multiplier value, the fees depend on the amount of transferred credit).
  • Index server could allow fragmented multi route payments (See here for more details). This means that a node could ask for a 100 credits route, and he will get back 5 routes that together allow to send 100 credits.
  • The returned route should contain the fees parameters (m,n) for every intermediate edge.

As an example, consider the route:

B -- C -- D -- E

If B requests a route to E, he could get a route of the following form from the index server:
When B asks for a route from B to E, B will get the following:

B, (C, m_{CD}, n_{CD}), (D, m{DE}, n_{DE}), E

One thought I had is to not supply the (m,n) values and instead only show the calculated sum of expected transaction fees, but I tend to decide against it because it allows less flexibility for the application.

Multiplier implementation details

I plan to have m of type u64, which always represents a fractional value between 0 (inclusive) and 1 (non inclusive). Computing the multiplication mx can be done using fixed point arithmetic, rounded to the lower value:

floor(m * x / (2^64))

I'm still not sure about the floor decision. What do you think?

Gradual fees increase

If a node increases his fees along some edge, it is likely that the next few transactions along that route will fail because the payers use old information about the fees along that edge. Therefore whenever a node wants to increase his fees (An increase in m or n) along an edge, it would be better if it first sends the increase to the Index Server, wait for a while and only then send the update to the Funder.

The reason for this is that among big players, they'd want to calculate their counterparty risk and ask different fees from different nodes at different times

Hey, thanks for jumping by! The proposed design above should allow anyone to set fees in a secure way (Setting different fees for different friends), and also dynamically change those fees in-band. This is not a very difficult modification for us to do. I really want to give the abilities to do those things to everyone, not only big players. Do you see any problem that could occur by doing that?

(if you have a lot of debt, your interest rises)

Do you actually mean interest here, or just commission for a transaction? I see those two as different things. Commission is taken only when a transaction happens, while interest has a relation to the passage of time.

About adding interest, I still have no idea how to implement it in a sound way, so having it in-band is out of the question for me at least for the next development iteration.

@realcr The changes I've started doing were mostly around, and not directly related to the implementation of the issue, so no worries there :). The entire change is probably too large for one PR, and it'd be good to split it up into several sub-tasks. The question is, whether implementing the original plan for this issue is a good first step.

My current plan for implementation was to strip down the interface of credit_calc from the current one, which requires a route length and index from the beginning, to one which only accepts an index from the end - so in current use-sites it becomes their responsibility to calculate this index-from-end, and in the index server it can be used directly in the BFS. Of course, this all won't do if the fees are specified per-node. In this case, credit_calc does need to know the actual (partial) FriendsRoute currently being used, where FriendsRoute will need to become richer and also include the fee parameters.

I can see two possible interfaces here, one where FriendsRoute is sliceable, and credit_calc calculates the fees for the entire route; the other where we keep the current indexing approach, and the BFS (actually Dijkstra) always passes index 0 and the route it's currently checking. In the simplified world of constant fees, the first interface is parallel to the index-from-end, because we just need the length of the FriendsRoute to calculate, and the second is what we currently have.

Just one more consideration, which makes the entire thing more complicated - especially in view of multi-route payments: once you switch over to fees of form m*sum + k, you lose total ordering over the "cheapness" of routes - the optimal route becomes sum-dependent (as opposed to simpler fee schemes of either m*sum or sum + k, where this doesn't happen). This makes the problem of choosing the cheapest multi-route much more difficult, since we don't know (yet) how many credits we would like to transfer along a single route. One approach would be to have credit_calc not calculate the actual fees, but rather, calculate the "equivalent parameters" of the route, giving back an (m_e, k_e) tuple. The client can then use it to directly calculate the required fees (given the sum; credit_calc doesn't need it anymore!) or do something more complicated, like finding the optimal multi-route. Technically this problem very similar to knapsack; perhaps the granularity we have is enough that a continuous-knapsack approximation will suffice.

Either way, one could argue, that in this case, it would be best to delegate the problem of optimization to the client, which it may solve in different ways. This alleviates the computational stress from the index server, but in turn, requires much more bandwidth (the server needs to notify the client of "all", or several, possible routes). I'm not sure what's the best way to make progress here, this probably depends on how complicated is this problem actually.

The changes I've started doing were mostly around, and not directly related to the implementation of the issue, so no worries there :)

I forgot to tell you, I bumped rust nightly version in use. Make sure to rebase everything and update your local rust nightly version. You can always find the current version at .travis.yml.

you lose total ordering over the "cheapness" of routes

I agree. What I had in mind is (roughly) the following communication between the node and index server:

  1. Node asks for a multi-route to send x credits to some other node.
  2. Index server returns a few cheap multi-routes.

As the index server is given the amount of credits the node wants to send, it can have total ordering ordering of all the routes.

I wrote multi-route above as a general name for both routes and multi-routes. I am still not sure if it's a good idea I am not sure if it's a good idea that the index server will deal with multi routes, but I think that it could be cool if no single route is available, and I could instead get a multi route from the index server. Thinking about it, the index server is in the best position to compose a multi route, the node knows too little to do this efficiently.

Another possibility which I think you have mentioned is to let the index server only deal with single routes, and let the node compose his own multi routes. I currently vote about letting the index server calculate multi routes, but I'm not fully sure about this.

I haven't thought at all about what would be a good algorithm to put at the index server, but I'm sure we can figure it out. (It sounds like the fun part to me at least).

My current plan for implementation was to strip down the interface of credit_calc

I think that I managed to get rid of credit_calc. Much of it is related to things that were "long time ago and incorrect". In the new atomic payments design a request contains a fees field (u128), and every node takes something from that field, subtracts it and forwards the updated request message to the next node. I only touched the Funder part, so I don't know yet what it would be like at the Index server.

I expect the index server to take periodic updates of capacities and rates ((multiplier, adder) pairs), wait for requests from nodes and respond with good routes (or multi-routes).

I'm still thinking about it, it might be reasonable that the index server will return all the (multiplier,adder) pairs for the route as well, this might allow the client reuse the route in some other way if he ever wants to.

It doesn't necessarily need to return all the (m, a) pairs for the route, it can return just the equivalent one for the route (composition of affine functions is affine). This is also the information the client needs if it wants to solve the multi-route problem locally. But I agree that we can let the server can solve it for now.

Regarding algorithms: I think you need to run a strange Dijkstra, that stores a few layers of Pareto front at each node along the way; and then once you have the Pareto front for the destination node, you can solve some kind of (continuous?) knapsack to determine the best mixture. Why a few layers? Because you need to have the maximum capacity (up to the requested amount) available, even if this means selecting sub-optimal (i.e., not on the first Pareto front) route. I think that maintaining a Pareto front at each vertex should be easy, i.e. a simple modification to the Dijkstra, because adding them is simple. I'm not sure about the multi-layered Pareto front.

I saw the nightly thing, I will rebase, and also look at what you changed in the Funder :).

It doesn't necessarily need to return all the (m, a) pairs for the route, it can return just the equivalent one for the route (composition of affine functions is affine).

You are right! What a helpful fact this is (:

I think you need to run a strange Dijkstra, that stores a few layers of Pareto front at each node along the way

I trust you that you are doing the correct thing here. Changing into the new atomic payments design takes all my time right now. I should look up Pareto front later and recall how it works. In any case, you should know that even having a much less than computationally optimal solution will be great. Everything above it is extra.

and also look at what you changed in the Funder :).

If you are referring to PR #199, please note that it doesn't fully compile. I did manage to make the core payment algorithms (mutual_credit) compile and pass the tests with the new atomic payments design, but porting everything around it will take me some time for sure, maybe even a few weeks.

@SonOfLilit :

so if making more transactions doesn't cost the company more but making the same amount of transactions with higher amounts costs linearly more, you'd want prices to depend linearly on transaction size, etc'

Hopefully we have got this covered with the new proposal for affine fees: m*x + n fees for a transaction of size x credits, where 0 <= m < 1 and n are fully configurable for every directed edge.

Banks also make you pay $25 from your pocket to any middleman node, and won't tell you in advance how many middle nodes there are (!!!)

With the proposed design the buying experience for an end user should be as follows:

  • You pick an item you want to buy.
  • You click on buying and then a cheapest route is chosen for you. The route determines the fees.
  • You get to choose if you agree to the fees or not.

This somehow breaks the rule of "simple and predictable" fees for the end user, but at least the user always knows before the payment happens exactly what the fees are going to be.

I'm sorry, apart from thinking more about the design, I haven't had time to make much progress on this, and the next two weeks I'll be away. I'll be happy to continue after that, but if this is deemed too long, and someone else wants to pick this up in the meantime, that's fair.

@amosonn : No worries, have fun on the two weeks!
Even if I type twice as fast I think that I won't be able to finish the atomic payments iteration so soon.

The new protocol supports payments using multi routes with mx + n style fees. Finding a good multi route is now a non trivial problem, so you can have some fun later (:
I included a super naive implementation of the index server just to make the compiler and the tests happy, hopefully we will soon get it to be on the master branch.

Since most end-users are going to have a "bank" node always be their first node, I think we should allow that bank node to offer the user static fees and take up the difference.

@SonOfLilit : Your last piece of advice helped me make some decisions on how the fees work on the protocol, it was very useful. Thanks! I made sure that with the correct configuration you can get the effect of user static fees.

Closed in favor of issue #218.