Verdict is not Thread-Safe ๐ โโ๏ธ
Closed this issue ยท 4 comments
This issue proposes to make Verdict thread-safe.
Error in Shopify/help repo
๐
โNoMethodError in contact#en
undefined method `switch' for nil:NilClass
๐View on Bugsnag
The issue ๐ค
For multi-threaded applications, it's possible to receive several concurrent requests to Verdict as the app boots for the first time.
The Shopify Help Center receives hundreds of requests per minute, especially to our "Contact Support" page: https://help.shopify.com/questions
Each time the app is deployed, there is a small window of time during the deploy when Verdict has not set up its experiments. The first request to Verdict may come at the same time as several others, since the Contact Support page is always receiving requests. These requests will be processed concurrently in multiple threads.
On one thread ๐งต
Verdict['experiment-handle']
is called- this calls the method
#repository
- this checks if the instance of
@repository
exists, which it does not - so Verdict calls the
#discovery
method and sets the@repository
instance variable to an empty hash
Verdict then goes through the process of adding all the experiments to the @repository
hash.
On a second thread ๐งต
Verdict[experiment-handle]
is called- this calls the method
#repository
- this checks if the instance of
@repository
exists which it does now exist - so its skips the
#discovery
method and returns the@repository
which is an empty hash currently
Which it then does not find the existing handle so returns nil
Temporary solution ๐
We added a safe navigation operator to #switch
that allows the code to continue if the call to Verdict['experiment-handle']
returns nil.
Proposed solutions ๐
We have three possible solutions:
1) Eager load experiments
Eagerly call the discovery method as soon as Verdict is initialized that way the repository is ready to go.
2) Use Mutexes
Protect the state of the object with Mutexes so the object's state is shared between all threads but the access is limited to a single thread at once
3) Compartmentalize the State of the Instance
Use a thread-safe structure for local state that compartmentalizes the state per thread.
Would appreciate your input and insights @wvanbergen and @eljojo ๐
- Eager load experiments
That is the best solution. Lazy loading the experiments in production can be really expensive for some requests. Recently we did this in shopify core to avoid a few seconds on checkout.
https://github.com/Shopify/shopify/pull/192561
Are you willing to work on that solution? If you are, feel free to ping me in the review.
Hi! great write up!
I think 1) eager loading experiments is the fastest way of going about it.
@ptolts recently shipped a change in core so we do that, which turned in 20% of checkouts being 200 ms faster IIRC. https://github.com/Shopify/shopify/pull/192561
https://github.com/Shopify/shopify/blob/46d110a85a1543014a83f881c07912ad60f1dae1/config/application.rb#L338-L339
Ideally we'd do 3), at some point I wrote a document proposing how we could do this, but I think it would be a tremendous amount of work.
There's some good bits of context on https://github.com/Shopify/shopify/pull/110923 about why verdict's loading behaviour is so painful.
I've seen the issue before and also ended up doing a nil check ๐ญ.
Core isn't threaded and had the same issue -- there must be something else besides threading that triggers this. I haven't seen the issue in a while (Bugsnag search) โ maybe eager loading from Shopify/shopify#192561 did indeed solve the problem on core.