Azure/AppConfiguration-DotnetProvider

Invalid Feature Flag Value (JSON) causes the application to crash

Closed this issue ยท 7 comments

For context, I am working on understanding Feature Flags in Azure App Config, and using the Azure SDK, I created feature flags using code as follows:

FeatureFlagConfigurationSetting featureFlagConfigurationSetting = new FeatureFlagConfigurationSetting(featureFlagName, isEnabled: false);

using (StreamReader value = File.OpenText($@"FeatureFlagDefinitions\{featureFlagName}.json"))
{
  featureFlagConfigurationSetting.Value = value.ReadToEnd();
}

ConfigurationSetting response = await ConfigurationClient.SetConfigurationSettingAsync(featureFlagConfigurationSetting);

While this code works for valid json files, I was testing what happens when the json file contains invalid value, e.g. consider Invalid.json as follows

"not valid json"

which is clearly not a valid json file.

This results in an invalid feature flag being created in the system, which, when consumer client code tries to fetch any of this, it will crash.

Here is the relevant client code snippet using AddAzureAppConfiguration

IConfigurationRoot configuration = new ConfigurationBuilder()
                    .AddAzureAppConfiguration(options =>
                    {
                        options.Connect(AzureConnection.GetAppConfigurationReadOnlyConnectionString()) 
                            .UseFeatureFlags(featureFlagOptions =>
                            {
                                featureFlagOptions.CacheExpirationInterval = TimeSpan.FromDays(1); 
                            });

                       _featureFlagConsumer._refresher = options.GetRefresher();
                    }, true).Build();

And here is the exception thrown (important to note that the optional flag is set to true so exceptions should be suppressed), See this thread: #136

JsonException: The JSON value could not be converted to Microsoft.Extensions.Configuration.AzureAppConfiguration.FeatureManagement.FeatureFlag. Path: $ | LineNumber: 0 | BytePositionInLine: 19.

It would be ideal if the flag is not created in the first place, and rejected at the SDK level, as the above makes it possible to crash an app at runtime as soon as you add a new invalid flag in place.

Additional Information/Questions looking for clarification on:
1. I tried to use FeatureFlagOptions.Select as follows, to only fetch specific flags (so that but it doesn't seem to work?

options.Connect(AzureConnection.GetAppConfigurationReadOnlyConnectionString()) // consumer only needs read access
                            .UseFeatureFlags(featureFlagOptions =>
                            {
                                featureFlagOptions.CacheExpirationInterval = TimeSpan.FromDays(1); 
                                featureFlagOptions.Select("Beta", LabelFilter.Null);
                            });

The code above should only fetch one flag (if at all): "Beta" and without labels (there is no wildcard in Beta, so I'm assuming that means it matches exactly), but when calling IFeatureManager.GetFeatureNamesAsync() it lists all the feature flags in the system ("Beta", "Gamma", etc..), effectively ignoring the Select() clause. I tried "Beta*" and adding a label filter (also adding that label to the FF itself in Azure Portal) but to the same effect of IFeatureManager.GetFeatureNamesAsync() listing all the feature flags in the system.

See this thread for reference: #234

This is purely for if same keys have multiple labels (another dimension). Question can be ignored.

  1. How does Feature Flag caching work?
    In particular, if I use a push mechanism, with service bus code as follows:
serviceBusClient.RegisterMessageHandler(
                handler: (message, cancellationToken) =>
                {
                    messagesRecieved += 1;

                    
                    // Build EventGridEvent from notification message
                    EventGridEvent eventGridEvent = EventGridEvent.Parse(BinaryData.FromBytes(message.Body));

                    bool hasSucceededNotification = eventGridEvent.TryCreatePushNotification(out PushNotification pushNotification);

                    if (hasSucceededNotification)
                    {
                        refresher.ProcessPushNotification(pushNotification, maxDelay: TimeSpan.FromSeconds(10));
                            
                        PrintSuccessfulEventHandlingDetails(eventGridEvent, pushNotification);
                    }
                    else
                    {
                        messagedFailed += 1;
                        PrintUnsuccessfulEventHandlingDetails(eventGridEvent);
                    }

                    Console.WriteLine($"{messagesRecieved} messages have been received. {messagedFailed} messages have failed.\n");

                    return Task.CompletedTask;
                },
                exceptionReceivedHandler: (exceptionargs) =>
                {
                    Console.WriteLine($"Exception: {exceptionargs.Exception}\n");
                    return Task.CompletedTask;
                });

If I update 1 Feature flag from false to true, it will generate one KeyValueModified event, and let's say there are 10 elements in the cache. Does this event (refresher.ProcessPushNotification(...)); only invalidate the one specific entry in the cache, or does it refresh all the values?
I know for app config itself, there is example code as follows:

.ConfigureRefresh(refresh =>
                                {
                                    refresh.Register("SentinelKey", refreshAll: true)
                                });

but this does not apply to feature flags? (or if it is, how, since the client code above does not have any keys registered? wondering what the default behavior is, and if there is a way to override it). My current "guess" is that a call to refresher.ProcessPushNotification(...) will smartly handle cache invalidation based on the details of the push notification itself, and so basically invalidate one specific entry in the cache if its an update/delete, or if its a brand new entry, it will create space for it in the cache.

  1. In the line bool hasSucceededNotification = eventGridEvent.TryCreatePushNotification(out PushNotification pushNotification); this will fail (i.e. return False) if you delete an already deleted KeyValue pair, is this by design? To clarify:
    Step 1: Create() <-- generates an event to invalidate cache, that will be processed with hasSucceededNotification = true
    Step 2: Delete() <-- generates an event to invalidate cache, that will be processed with hasSucceededNotification = true
    Step 3: Delete() <-- generates an event to invalidate cache, that will be processed with hasSucceededNotification = false
    Question: is the result of step 3 by design? Edit: I am assuming it is

3a: Follow-up question to (Q3), if Select() worked (See Q1), would creating new feature flags (if they don't belong in the filters), ever touch the cache? I am aware that the event will be generated and the message will be handled, but wondering if refresher.ProcessPushNotification(pushNotification, maxDelay: TimeSpan.FromSeconds(10)); is essentially a no-op in that case.
Ignore 3a, Q1 has been crossed out.

4. Perhaps this not in the domain of AppConfig, but in the message handler (See Q2), why is the exceptionRecievedHandler not invoked when the handler code throws an exception? My initial thought was that for any exception thrown, (for example, if I call refresher.ProcessPushNotification(..) when the notification creation has not succeeded, it will throw and crash the app, instead of calling the exceptionRecievedHandler. Is this by design? If so, when would exceptionRecievedHandler ever be called?
Ignore 4. Updated service bus package and this is no longer a problem. ProcessErrorAsync event handler is called as expected.

Thanks Azure App Config Team!

Happy to clarify any of the above information

Rohan

Hi @RohanRao27, thanks for reporting this issue! If optional is set to true, there shouldn't be any exceptions thrown for invalid feature flags. We are working on fixing this bug.
About the feature flag caching mechanism - we will refresh all feature flags loaded by your application even if only one KeyValueModified event is received.
I didn't understand the third question. What do you mean by "deleting an already deleted KeyValuePair"?

Hi @avanigupta thanks for getting back to me with the updates, this is good to know!

The third question was minor, but it was basically a question about calling the App Config SDK and deleting a feature flag that was previously deleted (this scenario is not possible through the console).

Where I can use the SDK, and delete a feature flag, but what happens if I call that delete function on an "already deleted" (i.e. non-existent) flag. This scenario can occur if using a script to clean flags with hard coded values (complete edge case I know).

But since the eventGridEvent.TryCreatePushNotification(..) is returning false, that is good for me since it won't generate a needless network call to update the configuration when a script runs elsewhere in this case - can consider this question resolved!

Great! For future reference, deleting a key-value that doesn't exist in the AppConfig store will not trigger any Event Grid event - since the state of your AppConfig store did not change at all.

Interesting you suggest that - If I try deleting a non-existent flag (whether it was created then deleted or never created but deleted), it does generate the event grid event, but the notification returns false - perhaps this behavior is not intended then? Just retested it today.

Here is an example event grid event that is generated on a flag that was never in the store, to begin with, but using the following code to delete it:

 FeatureFlagConfigurationSetting featureFlagConfigurationSetting
                = new FeatureFlagConfigurationSetting(featureFlagName, isEnabled: false, label: null);
            await ConfigurationClient.DeleteConfigurationSettingAsync(featureFlagConfigurationSetting);

Here is an example event grid event received on the consumption side:

EventType: Microsoft.AppConfiguration.KeyValueDeleted
EventId: 4c82b4ed-7566-45cd-8807-7195227fbe63
Subject: https://poc348814.azconfig.io/kv/.appconfig.featureflag%2FBeta902q3yt9pqj23t;IKBEGT?api-version=1.0
Topic: /subscriptions/088f469f-ce57-45d1-ba6a-9a57b0ed1c37/resourceGroups/pcl-dev-ind-randd-rg/providers/microsoft.appconfiguration/configurationstores/poc348814

I had 2 threads running, one to repeatedly delete 1 flag that was never there, and another to monitor the event bus (each delete keeps generating event grid events, and returns false when it tries to create a push notification)

This is not the intentional behavior. Thanks for catching this! We'll make sure that deleting a key-value that doesn't exist in AppConfig will not trigger Event Grid events.

Thanks Avani!

Hi @RohanRao27 I just wanted to see if you could help with identifying the exact JsonException you got here. Was this under a higher level FormatException?