Allow CloudPubSubSource to use existing subscription instead of creating one
mattwelke opened this issue · 2 comments
Problem
When I read the docs example on CloudPubSubSource, I saw that it uses a topic as input (https://github.com/google/knative-gcp/blob/master/docs/examples/cloudpubsubsource/cloudpubsubsource.yaml). This means the source must create its own subscription to that topic (see the sketch after the use cases below), which prevents the Knative user from controlling the lifecycle of the subscription. For some use cases, this control is necessary:
Preventing Data Loss:
Subscriptions can exist without subscribers (applications consuming messages); in that case the subscription holds on to the data (for up to 7 days) until subscribers are healthy and can pull the messages to process them. If the cluster creates the subscription, it might also end up deleting the subscription (for example, if the source is deleted). This would result in data loss unless the publishing application stops publishing messages until a new subscription is created.
Pub/Sub is meant to decouple applications, and one of Knative's design goals (https://knative.dev/docs/eventing/) is that "Event producers and event consumers are independent. Any producer (or source), can generate events before there are active event consumers that are listening.". Having to stop the Pub/Sub publishing application before making changes to the subscribers (for example, re-creating a Kubernetes cluster with Knative services) breaks this design goal.
Load Balancing:
Pub/Sub supports load balancing by having multiple subscribers pull from one subscription. The existing pattern with Knative removes the need for this within a cluster, because Knative can act as a load balancer, distributing messages across multiple containers for a Knative service sink. However, the existing pattern does not support geographic load balancing. For high availability, people might create two Kubernetes clusters in separate data centers and have subscribers for one subscription deployed to each cluster. Since this Knative source would create a separate subscription in each cluster, the data would be duplicated instead of load balanced.
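For reference, the linked docs example is shaped roughly like this (a minimal sketch; the exact apiVersion and field values depend on the release, so check the docs):

```yaml
# Current behavior: only a topic can be specified, so the source
# creates (and owns) a subscription to it behind the scenes.
apiVersion: events.cloud.google.com/v1
kind: CloudPubSubSource
metadata:
  name: cloudpubsubsource-test
spec:
  topic: testing
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: event-display
```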
Event consumer (developer)
The developer would be building an app that consumes Pub/Sub messages, where each message is important and shouldn't be discarded when the app isn't running, or during operations like replacing the cluster with a new one. The developer might work in a large organization where it's important for each team to work independently, so it wouldn't be feasible to ask those in charge of the publishing application to shut it down.
Exit Criteria
- When the source is created and a subscription name is provided as input, the source uses that subscription when connecting to GCP instead of creating a new subscription to a topic (see the sketch below).
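A minimal sketch of what that could look like; the `subscription` field is hypothetical and not part of the current API:

```yaml
# Hypothetical: `subscription` replaces `topic` as the input, pointing
# at a pre-created, user-managed subscription whose lifecycle the
# source does not own.
apiVersion: events.cloud.google.com/v1
kind: CloudPubSubSource
metadata:
  name: cloudpubsubsource-existing-sub
spec:
  subscription: my-existing-subscription
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: event-display
```

With this shape, deleting the source (or the whole cluster) would leave the subscription intact, so unacked messages keep accumulating until a new consumer attaches.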
Time Estimate (optional):
unknown right now
Additional context (optional)
This would allow Knative to be used in cases where a Kubernetes deployment runs an app that connects to Pub/Sub using a subscription name instead of a topic name, along the lines of the sketch below.
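For comparison, this is the kind of plain-Kubernetes setup I mean (a sketch; the image, names, and environment variable are placeholders):

```yaml
# A consumer app that pulls from an existing, user-managed subscription.
# The subscription name is passed in via an environment variable.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pubsub-consumer
spec:
  replicas: 2
  selector:
    matchLabels:
      app: pubsub-consumer
  template:
    metadata:
      labels:
        app: pubsub-consumer
    spec:
      containers:
        - name: consumer
          image: gcr.io/my-project/consumer:latest
          env:
            - name: PUBSUB_SUBSCRIPTION
              value: my-existing-subscription
```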
I confirmed that this is how CloudPubSubSource behaves right now by running its example from the docs. I ran into an issue with my first cluster because its nodes were too small to schedule the receive-adapter pod that CloudPubSubSource tells the cluster to start. After I recreated the cluster, I ended up with two subscriptions, since the new cluster had no idea about the old one.
Without a manual program to forward them to the topic again, the unacked messages in the first subscription stayed stuck (the second stuck message appeared because I ran the publish command a second time after creating the subscription).
In the real world, I don't think manually forwarding the messages would work well, because the topic may feed multiple downstream systems, and publishing a message to the topic again could cause problems if another downstream system kept access to its subscription and already processed the message.
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.