open-policy-agent/opa

Flag to fail OPA runtime if any bundle is not found

Opened this issue · 4 comments

ahd3r commented

What is the underlying problem you're trying to solve?

In our use case, it is crucial that all bundles load on OPA startup. Whenever an error happens, we need to handle it the way we want, not just see it in the logs.

Describe the ideal solution

When we start a server with the OPA runtime Go library, I want it to panic if any of my bundles fails to load.

Right now I only see an error log like this:

{"level":"error","msg":"Bundle load failed: server replied with Not Found","name":"unsigned_policy_bundle","plugin":"bundle","time":"2024-04-23T18:56:41Z"}

But I want the library to panic, so that I can catch the panic and write my own logic for handling this error.

Describe a "Good Enough" solution

Panic if bundle load fails for whatever reason.

> In our use case, it's very crucial to load all bundles on OPA startup. Whenever error happens, we need to handle it the way we want, and not just see it in logs.

How are you deploying OPA? For example, if you use Kubernetes you can configure a readinessProbe that calls /health?bundles. This ensures that OPA serves requests only after all bundles are activated. In addition, whenever bundle activation fails, OPA logs the error and sends a status message containing the error via the Status API. This provides visibility into OPA's status.

> Panic if bundle load fails for whatever reason

OPA can be loaded with one or more bundles, so panicking when one of them fails activation does not seem appropriate. The client can use the Health API or the Status API, for example, to get more information and act accordingly.
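As a rough illustration of the Status API route, a service that receives OPA's status updates could pick out the bundles that failed. This sketch assumes the status payload carries a `bundles` object keyed by bundle name, with `code` and `message` fields set when activation failed. The type and function names are hypothetical, and only the fields needed here are modelled.

```go
package main

import (
	"encoding/json"
)

// bundleStatus models only the error-related fields of one bundle entry
// (assumed field names: "code" and "message").
type bundleStatus struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

// statusUpdate models the part of a status update that lists bundles by name.
type statusUpdate struct {
	Bundles map[string]bundleStatus `json:"bundles"`
}

// failedBundles (hypothetical helper) returns a map from bundle name to
// error message for every bundle whose "code" field is non-empty.
func failedBundles(payload []byte) (map[string]string, error) {
	var update statusUpdate
	if err := json.Unmarshal(payload, &update); err != nil {
		return nil, err
	}
	failed := make(map[string]string)
	for name, b := range update.Bundles {
		if b.Code != "" {
			failed[name] = b.Message
		}
	}
	return failed, nil
}
```

Because the result is keyed by bundle name, the receiving service can tell which bundle failed and apply its own handling for each one.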

ahd3r commented

It might work, but I also noticed that the /health?bundles endpoint returns a generic error message when one of the bundles fails to load. It would be nice to have the name of the failed bundle in the error message.

That's probably something that can be improved.

stale commented

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. Although currently inactive, the issue could still be considered and actively worked on in the future. More details about the use-case this issue attempts to address, the value provided by completing it or possible solutions to resolve it would help to prioritize the issue.