opengdpr/OpenDSR

Shouldn't requests to processors be signed?

pdehaye opened this issue · 9 comments

bhox commented

Hey @pdehaye,
When making a new request the controller will have to use predefined authentication so the processor will already know who is submitting the requests.

The question is not to authenticate the author, but to certify who the author is, that the author made the request, etc. This seems relevant for accountability on the processor side, especially if other processors might host the openGDPR process on behalf of the controller (i.e. presumably that processor's authentication would be relevant to transmit the controller's signature).

bhox commented

When you say author, do you mean the data subject or the controller? If one processor is passing a request to another processor, it would have to use the controller's auth and would appear as the controller to the final processor.

I meant data controller, but indeed my statement was very unclear.

Your response raises a lot more questions in my mind.

From my understanding of the system you are building, you want to be able to chain requests

controller --A--> processor 1 --B--> processor 2

(where sometimes the openGDPR communication over A goes through an intermediary processor 0, but let's ignore that for now)

So, what information gets passed over A (and how is it secured), and what information passes over B?

I would expect A to be secured with some authentication key for processor 1 to be sure that it is the controller making the request. But how is processor 1 capable of proving that the controller made the request? I would therefore additionally expect some signature on the request.

Maybe it is clearer when looking at B: what gets transmitted from processor 1 to processor 2? Shouldn't it be processor 1 getting authenticated? Shouldn't it be processor 1 signing that request, so processor 2 can demonstrate that it received such a request?

Much of this is described in the raw long form spec, but processor 1 would use the controller's credentials to make the call on their behalf to processor 2. This is the core of what a platform such as mParticle does. This use case likely does not come up for most companies, but the spec must be able to accommodate a variety of cases.

From a security perspective, in the flow you mention above, processor 1 is transparent to processor 2.

Ah, now I can see better where the communication problem is.

I suppose in situations where processor 1 is mParticle, processor 2 is processor to the controller directly.

But there are also situations where processor 2 is processor to processor 1, and processor 2 is not even known to the controller (except maybe that information is slipped into a contract). Imagine for instance the whole of the adtech ecosystem. This is why I think there is a need to decouple the authentication for the channel communication, and the signature to authorize the request itself. Certainly if I were a controller I wouldn't want my authentication credentials to be passed around to a big array of processors I have no knowledge of.
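A minimal sketch of this decoupling, under assumptions that are not part of the thread or the spec: each hop (A and B) has its own channel credential, and the controller signs the request body once so the signature travels down the chain unchanged. The field names, the `processor_accept` helper, and the envelope layout are hypothetical. Because the Python standard library has no asymmetric signature primitive, a Lamport one-time signature over SHA-256 stands in here purely for illustration.

```python
import hashlib
import hmac
import json
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def lamport_keygen():
    """Return (private, public) keys for a Lamport one-time signature."""
    priv = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pub = [(_h(a), _h(b)) for a, b in priv]
    return priv, pub


def _bits(message: bytes):
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def lamport_sign(priv, message: bytes):
    # Reveal one secret per digest bit; a key pair must sign only one message.
    return [priv[i][bit] for i, bit in enumerate(_bits(message))]


def lamport_verify(pub, message: bytes, signature) -> bool:
    if len(signature) != 256:
        return False
    return all(
        hmac.compare_digest(_h(part), pub[i][bit])
        for i, (bit, part) in enumerate(zip(_bits(message), signature))
    )


def canonical(request: dict) -> bytes:
    """Deterministic serialization so signer and verifier hash the same bytes."""
    return json.dumps(request, sort_keys=True, separators=(",", ":")).encode()


def processor_accept(presented_key, expected_key, controller_pub, envelope) -> bool:
    """Hypothetical receiver check: the channel credential identifies the
    immediate caller, the signature shows the controller authorized the request."""
    channel_ok = hmac.compare_digest(presented_key, expected_key)
    signature_ok = lamport_verify(
        controller_pub, canonical(envelope["request"]), envelope["signature"]
    )
    return channel_ok and signature_ok


# The controller signs once; its private key never leaves the controller.
controller_priv, controller_pub = lamport_keygen()
request = {"request_id": "example-id", "request_type": "erasure"}  # hypothetical fields
envelope = {
    "request": request,
    "signature": lamport_sign(controller_priv, canonical(request)),
}

# Independent per-hop credentials (assumed): A is controller -> processor 1,
# B is processor 1 -> processor 2.
hop_a_key = secrets.token_bytes(16)
hop_b_key = secrets.token_bytes(16)

accepted_by_p1 = processor_accept(hop_a_key, hop_a_key, controller_pub, envelope)
accepted_by_p2 = processor_accept(hop_b_key, hop_b_key, controller_pub, envelope)

# A request altered in transit no longer matches the controller's signature.
tampered = {
    "request": {**request, "request_type": "access"},
    "signature": envelope["signature"],
}
tampered_accepted = processor_accept(hop_b_key, hop_b_key, controller_pub, tampered)
```

In this sketch processor 2 needs only the controller's public key, which is not a secret, so no controller credential is handed down the chain. A real deployment would more likely use a reusable scheme such as Ed25519 from a vetted library, since a Lamport key pair can safely sign only one message.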

I can’t say we have contemplated a situation where processor 2 is not known to the controller. Furthermore, the GDPR does not permit this. This is why processor 1 has to act on behalf and with full trust of the controller.

If you would like to submit a PR, the broader working group would be able to discuss and consider it.

I really appreciate the questions and push to shape this spec!

The GDPR permits this: very few processors will tell their clients how their infrastructure is hosted (and will not notify the controllers if they change hosting providers). These cloud providers are themselves processors. This situation is not really relevant to openGDPR, but it shows the possibility is there to chain processors without disclosing to the controller the whole chain.

And there are lots of processors who would use this possibility in contexts more relevant to data subjects exercising their rights (I can give some examples, but it should be clear from the complexity of the adtech ecosystem that it will be desirable to keep some processor chains longer than one).

You say:

processor 1 has to act on behalf and with full trust of the controller

I agree, but in the absence of technical measures going in the direction of data minimisation, there should still be contractual measures outlining the trust assumed by the controller from the processor ("I will pass you too much information for data subjects to exercise their rights, but you can only use it for this purpose" -- this would for instance exclude further matching of ids by downstream processors if the openGDPR request is based on an identifier list too broad).

I think all this means openGDPR could go on as is, but it has shortcomings that would need to be addressed by contract rather than by technological means. One way to address this would be to decouple the legal definition (controller/processor) from the roles played in openGDPR (master/slave or something like this). A starting point could be for me to do a partial PR, introducing these two additional definitions and changing the core of the spec. This would be incomplete but would pin down what I am suggesting, for the working group to discuss. Ok?

bhox commented

@pdehaye Agreed, part of contractual agreements between processors and controllers should obligate the controllers to correctly honor received GDPR data subject requests, including through any subprocessors in use.