coniks-sys/coniks-go

Keeping the client in sync

arlolra opened this issue · 10 comments

This proposal addresses #144, #119, and various other things we've been calling the time skew issue. It's basically a summary of what was discussed in the hangout today.

The main idea is that the client's current epoch advances in lockstep with monitoring, and all subsequent requests happen in epoch, which avoids the edge cases where a request lands in an epoch that has yet to be verified. Consequently, only monitoring updates the saved STR, and the API for looking up in the latest epoch should be eliminated.

The first thing the client should do at startup is monitor, sending the epoch it knows about so that the server only needs to return STRs for the intervening epochs. The server can also send the time until the next epoch, for the client to set a timer. The client should check that this time is less than or equal to the interval attested to in the latest STR.
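To make that concrete, here's a rough sketch of what the startup monitoring exchange could carry (these shapes and names are made up for illustration, not the actual coniks-go protocol types):

```go
package sketch

import "time"

// Hypothetical wire shapes for the startup monitoring exchange described
// above; the real coniks-go message types are defined elsewhere and differ.
type monitoringRequest struct {
	Username   string
	StartEpoch uint64 // the latest epoch the client has already verified
}

type monitoringResponse struct {
	STRs            [][]byte      // serialized STRs covering (StartEpoch, latest]
	TimeToNextEpoch time.Duration // hint for when the next STR is due
}
```

The client would verify the returned STR chain against its saved STR, then use the time-to-next-epoch hint to schedule the next monitoring run.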

There's a bit of trickiness around the registration request (since it also needs to happen in epoch, as noted above). If the client has yet to register, the monitoring request at startup should check the proof of absence. Since registration has to happen in the latest epoch, if the server receives a request to register in an outdated epoch, it should return an error and give the client a chance to defer / re-request with exponential backoff. Alternatively, the client can set a monitoring flag to true and queue other requests until that's done.
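For the defer / re-request option, a minimal sketch of the client-side loop (ErrOutdatedEpoch and the register/monitor callbacks are placeholders, not coniks-go APIs):

```go
package sketch

import (
	"errors"
	"time"
)

// ErrOutdatedEpoch stands in for whatever error the server returns when a
// registration request references an epoch that is no longer the latest.
var ErrOutdatedEpoch = errors.New("registration epoch is outdated")

// registerWithBackoff re-runs monitoring to catch up to the latest epoch,
// then retries registration with exponential backoff on an outdated epoch.
func registerWithBackoff(register func(epoch uint64) error, monitor func() (uint64, error)) error {
	backoff := time.Second
	for attempt := 0; attempt < 5; attempt++ {
		epoch, err := monitor() // catch up to the latest verified epoch
		if err != nil {
			return err
		}
		if err := register(epoch); !errors.Is(err, ErrOutdatedEpoch) {
			return err // success, or a failure that retrying won't help
		}
		time.Sleep(backoff)
		backoff *= 2
	}
	return ErrOutdatedEpoch
}
```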

Thoughts?

vqhuy commented

the API for looking up in the latest epoch should be eliminated.

What if the looked-up key is in the TB queue? This happens when the registration and the key lookup fall in the same epoch, or when we support the key change operation.

The server can also send the time until the next epoch, for the client to set a timer.

You mean the expected time the next epoch will be published, right?

Alternatively, the client can set a monitoring flag to true and queue other requests until that's done.

Queuing seems better to me.

My main consideration is how we can fit this flow into the library. I think most of this has to be done at the application level (but we can set that aside for now).

What if the looked-up key is in the TB queue?

Very good point; this needs some thought. Initial ideas could be:

  1. Return the TB if the epoch being looked up in happens to be the latest;
  2. Keep a store of TBs so that they can always be returned for a given epoch (rough sketch below).
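For option 2, something along these lines on the server side could work (just an illustration; the []byte is a stand-in for the real temporary binding type, and none of this is the coniks-go implementation):

```go
package sketch

import "sync"

// tbStore keeps temporary bindings keyed by the epoch they were issued in,
// so a lookup in that epoch can still return the TB even after the binding
// has been committed to the tree in a later epoch.
type tbStore struct {
	mu  sync.Mutex
	tbs map[uint64]map[string][]byte // epoch -> username -> serialized TB
}

func newTBStore() *tbStore {
	return &tbStore{tbs: make(map[uint64]map[string][]byte)}
}

func (s *tbStore) put(epoch uint64, name string, tb []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.tbs[epoch] == nil {
		s.tbs[epoch] = make(map[string][]byte)
	}
	s.tbs[epoch][name] = tb
}

func (s *tbStore) get(epoch uint64, name string) ([]byte, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	tb, ok := s.tbs[epoch][name]
	return tb, ok
}
```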

You mean the expected time the next epoch will be published, right?

Yes; well, it would be time_to_next_epoch = time_of_next_epoch - current_time so that what's returned can be put in a timer for when to perform the next monitoring operation. time_to_next_epoch should be less than or equal to the EpochDeadline (as it's currently defined, an interval). Somewhat related is #81.
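As a sketch of how the client could use that value (scheduleNextMonitoring is a made-up helper, not part of coniks-go):

```go
package sketch

import "time"

// scheduleNextMonitoring takes the server's time_to_next_epoch hint and the
// epoch interval attested in the latest verified STR, clamps the hint to
// that interval, and schedules the next monitoring run.
func scheduleNextMonitoring(timeToNextEpoch, epochDeadline time.Duration, runMonitoring func()) *time.Timer {
	if timeToNextEpoch < 0 || timeToNextEpoch > epochDeadline {
		// An implausible hint shouldn't stall the client; fall back to
		// the attested interval.
		timeToNextEpoch = epochDeadline
	}
	return time.AfterFunc(timeToNextEpoch, runMonitoring)
}
```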

vqhuy commented

The downside is that every request other than monitoring (mostly key lookup) has to be queued until the client catches up with the latest epoch, which may cause some (noticeable) latency for the users. This probably needs to be measured by testing and benchmarks.

it would be time_to_next_epoch = time_of_next_epoch - current_time so that what's returned can be put in a timer for when to perform the next monitoring operation.

I'm not sure if the client can trust the time_to_next_epoch alone to set up the timer, since a malicious server can always return the same valid time_to_next_epoch, which would cause the timer to run forever. This makes me think that the root cause of the issue, namely verifying whether the STR is published at the right time, is still unsolved.

Then, do we really need to make monitoring happen before every other operation?

The downside is that every request other than monitoring (mostly key lookup) has to be queued until the client catches up with the latest epoch

Not necessarily. You can continue doing lookups in the most recently verified epoch.

I'm not sure if the client can trust the time_to_next_epoch alone to set up the timer

It's just a heuristic to help clients stay in sync. Trust will come with auditing.

vqhuy commented

You can continue doing lookups in the most recently verified epoch.

The problem arises when users change their keys frequently.

It's just a heuristic to help clients stay in sync. Trust will come with auditing.

But then how can the client report the attack when it happens, since there is no proof of the issued time_to_next_epoch? Besides, suppose clientA detects the attack and blows the whistle. How can clientB, who sees the whistle-blow, be sure that the server truly sent an invalid time_to_next_epoch, rather than clientA simply being a bad actor?

(Actually, I proposed using time_to_next_epoch once in our email discussion; the ideas above come from Marcela.)

It's just a heuristic to help clients stay in sync. Trust will come with auditing.

But then how can the client report the attack when it happens, since there is no proof of the issued time_to_next_epoch? Besides, suppose clientA detects the attack and blows the whistle. How can clientB, who sees the whistle-blow, be sure that the server truly sent an invalid time_to_next_epoch, rather than clientA simply being a bad actor?

EDIT: To clarify, @arlolra, are you suggesting that the server include a timestamp in the STR, or not? If not, I agree with Huy here.

An abstract time_to_next_epoch alone doesn't bind the server to a particular point in time, and auditing wouldn't provide sufficient evidence of an attack. Put differently, even if the most recent epoch for a client is t but auditors return t+5, to an outsider, a whistle-blow from the client only indicates that the client is behind on its epoch, not that the provider equivocated by intentionally delaying the epochs for the client.

OTOH, I think that an epoch interval together with an actual timestamp is a better heuristic for clients. Then, if a client sees epoch = 5, time_to_next_epoch = 1hr, expected_timestamp = Apr 11 2017 17:00 UTC and it receives an STR from an auditor with epoch = 7, time_to_next_epoch = 1hr, expected_timestamp = Apr 11 2017 17:00 UTC, there is less ambiguity as to what happened, since the server set the timestamp. (Of course, the server could set a later expected_timestamp for the client, but an attack should still become evident during auditing because the timestamps would differ.) The client can still use the time_to_next_epoch as a heuristic for when to run the monitoring protocol next, but it's not enough to raise suspicions about misbehavior.
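As an illustration of that comparison (field and function names here are invented, not the coniks-go STR layout):

```go
package sketch

import "time"

// strView is an illustrative subset of the STR fields discussed above.
type strView struct {
	Epoch             uint64
	ExpectedTimestamp time.Time     // when the next STR is expected to be issued
	EpochInterval     time.Duration // the attested epoch interval
}

// evidenceOfDelay captures the epoch = 5 vs epoch = 7 example: the server
// signed both STRs, so if the auditor's copy is epochs ahead of the client's
// while carrying the same (or an earlier) expected timestamp, the timestamps
// themselves show the client was being fed stale epochs.
func evidenceOfDelay(mine, audited strView) bool {
	return audited.Epoch > mine.Epoch &&
		!audited.ExpectedTimestamp.After(mine.ExpectedTimestamp)
}
```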

are you suggesting that the server include a timestamp in the STR, or not?

Yes, the timestamp would still be in the STR.

We could add a third condition that the returned interval be strictly monotonically decreasing until the next epoch. But let's not get too bogged down in this heuristic; I mentioned it only as a helpful way to keep the client closer to the actual deadline.

Ok, to recap, so far we've identified that:

  1. If you want the freshest possible keys, you're blocked on monitoring at the end of each epoch, but if blocking is undesirable you can continue looking up in the outdated epoch;
  2. And, that we probably want some sort of storage for TBs, so they can always be returned at lookup, even after they've been added to the tree in the next epoch.

Keep it coming :)

vqhuy commented

I'm not sure if the client can trust the time_to_next_epoch alone to set up the timer, since a malicious server can always return the same valid time_to_next_epoch, which would cause the timer to run forever.

This doesn't make sense, since the client should trust the time_to_next_epoch only once per expected STR. If the server keeps returning a timer instead of the STR, I think that is a denial of service.

vqhuy commented

Here are some things I have in mind for this issue:

  • Make registration happen in epoch (if the registration request's epoch is out of date, return an error)
  • Make lookup happen in epoch (i.e., remove the KeyLookupRequest part)
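To illustrate what "in epoch" could mean on the wire (hypothetical shapes only; the actual coniks-go message types differ):

```go
package sketch

// Hypothetical request shapes for the two bullets above; the point is only
// that every request carries the epoch the client believes is current.
type registerInEpochReq struct {
	Username string
	Key      []byte
	Epoch    uint64 // must match the directory's latest epoch or the server errors
}

type lookupInEpochReq struct {
	Username string
	Epoch    uint64 // the most recently verified epoch on the client
}
```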