IBM/network-config-analyzer

Lazy evaluation of policies' allowed connections

zivnevo opened this issue · 2 comments

In the new optimized evaluation of allowed connections, a lot of work is done in preprocessing, that is, when building network configs.
This may increase runtime for queries that do not rely on allowed connections (e.g., allCaptured, disjointness).
Moreover, if a network-config is defined but not used, we pay the price for preprocessing its policies.

Suggestions:

  • Calculate only necessary connections out of { all allowed connections, captured allowed connections, denied connections, pass connections } according to query requirements.
  • At the NetworkPolicy level, calculate allowed connections only when needed. That is, initialize the variables that store allowed connections to None, and calculate the required HCS only when the allowed_connections() function is first called.
  • Possibly, delay parsing a NetworkPolicy until its allowed connections are actually required.
  • Store computed allowed connections at the NetworkConfig level. Use two variables, one for captured connections and one for all connections. Populate these variables lazily.
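
A minimal sketch of the second suggestion, to make the intended pattern concrete. It is not the project's code: `LazyPolicy`, its `compute` callback, and the `compute_calls` counter are hypothetical names, and plain frozensets of (source, destination) pairs stand in for the real connection objects.

```python
from typing import Callable, FrozenSet, Optional, Tuple

Pair = Tuple[str, str]  # hypothetical (source, destination) stand-in for a connection


class LazyPolicy:
    """Hypothetical stand-in for a policy whose connections are computed on demand."""

    def __init__(self, name: str,
                 compute: Callable[[], Tuple[FrozenSet[Pair], FrozenSet[Pair]]]):
        self.name = name
        self._compute = compute  # the expensive step, deferred until first use
        self._allowed: Optional[FrozenSet[Pair]] = None
        self._denied: Optional[FrozenSet[Pair]] = None
        self.compute_calls = 0   # only here to show that the work runs once

    def _ensure_computed(self) -> None:
        # None means "not computed yet"; an empty frozenset is a valid result.
        if self._allowed is None:
            self._allowed, self._denied = self._compute()
            self.compute_calls += 1

    def allowed_connections(self) -> FrozenSet[Pair]:
        self._ensure_computed()
        return self._allowed

    def denied_connections(self) -> FrozenSet[Pair]:
        self._ensure_computed()
        return self._denied


policy = LazyPolicy("example", lambda: (frozenset({("web", "db")}), frozenset()))
# Nothing has been computed at this point; a query that never asks for
# connections never pays for them.
first = policy.allowed_connections()
second = policy.allowed_connections()  # served from the stored value
```

Using None as the "not yet computed" marker keeps an empty result distinguishable from a missing one, which matters when a policy legitimately allows no connections.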

Regarding bullets 2 and 3 above (#496), their implementation must be consistent with #485:
Initially, during the parsing of network policies, the full domain of peers is set to its default (assuming a maximum of 10000 pods). Later, when queries run, the full domain of peers is updated to all peers in the current config (whose number is necessarily smaller than 10000).
Therefore, if we want lazy policy parsing (including the calculation of optimized_allow_ingress_props, optimized_deny_ingress_props, etc.) and lazy evaluation of policy allowed connections, we should perform these lazy calculations while the peer domains are set to their default (up to 10000 pods). The results can then be stored and reused by all queries. Before a query uses the stored connections, we should first reduce them to that query's context, i.e., set the peer domains in DimensionsManager to the union of all peers from the query's configs and run _reduce_active_dimensions on the connections that the query needs.
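
The compute-once, reduce-per-query flow could look roughly like the sketch below. It does not use the real DimensionsManager or _reduce_active_dimensions; `ConnectionCache`, `reduce_to_query`, and the use of None to mean "the full default peer domain" are assumptions made purely for illustration.

```python
from typing import Callable, Dict, FrozenSet, Optional, Tuple

PeerSet = Optional[FrozenSet[str]]    # None stands for the full default peer domain
Connection = Tuple[PeerSet, PeerSet]  # hypothetical (source peers, destination peers)


class ConnectionCache:
    """Stores connections computed against the default domain, once per policy."""

    def __init__(self) -> None:
        self._stored: Dict[str, Connection] = {}
        self.computations = 0  # only here to show that each policy is computed once

    def get_unreduced(self, policy_name: str,
                      compute: Callable[[], Connection]) -> Connection:
        if policy_name not in self._stored:
            self._stored[policy_name] = compute()
            self.computations += 1
        return self._stored[policy_name]


def reduce_to_query(conn: Connection, query_peers: FrozenSet[str]) -> Connection:
    """Return a copy of conn narrowed to the peers of one query; conn is not modified."""
    def narrow(peers: PeerSet) -> FrozenSet[str]:
        return query_peers if peers is None else peers & query_peers

    src, dst = conn
    return narrow(src), narrow(dst)


cache = ConnectionCache()


def compute() -> Connection:
    # "Any peer may reach db", expressed against the default domain.
    return (None, frozenset({"db"}))


# Two queries with different peer sets reuse the same stored result.
for_query_1 = reduce_to_query(cache.get_unreduced("p", compute), frozenset({"web", "db"}))
for_query_2 = reduce_to_query(cache.get_unreduced("p", compute), frozenset({"web", "cache"}))
```

The point of the design is that the stored value is never narrowed in place: each query derives its own reduced copy, so a later query with a different set of peers still starts from the result computed under the default domain.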