getsentry/sentry-php

Scope::setUser merges differently than before

ruudk opened this issue · 6 comments

ruudk commented

On 2.5.1 I was using this code:

    private function trackUserId() : void
    {
        if (null === $token = $this->tokenStorage->getToken()) {
            return;
        }

        $user = $token->getUser();
        if (is_object($user) === false) {
            // e.g. anonymous authentication

            return;
        }

        $this->sentry->configureScope(
            fn (Scope $scope) => $scope->setUser(['id' => (string) $user->getId()], true)
        );
    }

    private function trackClientIp(Request $request) : void
    {
        $ip = $request->getClientIp();

        $this->sentry->configureScope(
            fn (Scope $scope) => $scope->setUser(['ip' => $ip], true)
        );
    }

and it worked as expected. The User in the scope had the id and ip set.

When upgrading to 3.1.2 I noticed the BC Break:

[BC BREAK] The Scope::setUser() method now always merges the given data with the existing one instead of replacing it as a whole (#1047)

Fine, because I was already using the $merge = true argument anyway.

But surprise, it doesn't work as expected.

This is the new setUser method:

public function setUser($user): self
{
if (!\is_array($user) && !$user instanceof UserDataBag) {
throw new \TypeError(sprintf('The $user argument must be either an array or an instance of the "%s" class. Got: "%s".', UserDataBag::class, get_debug_type($user)));
}
if (\is_array($user)) {
$user = UserDataBag::createFromArray($user);
}
if (null === $this->user) {
$this->user = $user;
} else {
$this->user = $this->user->merge($user);
}
return $this;
}

it converts the input array to an UserDataBag and then calls the merge method on the UserDataBag.

So on the first call to setUser, it would create UserDataBag::createFromArray(['id' => 123]); and store it.

On the second call to setUser it creates UserDataBag::createFromArray(['ip' => '12.23.23.23']); and then notices it already has a UserDataBag and effectively calls UserDataBag::createFromArray(['id' => 123])->merge(UserDataBag::createFromArray(['ip' => '12.23.23.23']);

But the merge method will overwrite everything, even when it's null:

public function merge(self $other): self
{
$this->id = $other->id;
$this->email = $other->email;
$this->ipAddress = $other->ipAddress;
$this->username = $other->username;
$this->metadata = array_merge($this->metadata, $other->metadata);
return $this;
}

This was the old merge method in 2.5.1:

public function merge(array $data, bool $recursive = false): void
{
$this->data = $recursive ? array_merge_recursive($this->data, $data) : array_merge($this->data, $data);
}

I don't know how to proceed now. As there is no way to retrieve the current user from the scope.

EDIT: For now I will refactor my code to only call setUser once, with both id and ip. But I think this is something that could also bite others.

ruudk commented

I guess I found another BC break: ip is no longer working, it's should now be ip_address.

I don't know why the deprecation message said that the new data from 3.0 onwards would have been merged, but looking at the rest of the SDK all of them just replace the existing data as a whole (no merge of any kind, just a plain setter). It's late for us to change this, but maybe @bruno-garcia can tell us what would be the correct behaviour so that, if needed, we can align our codebase with the other SDKs in 4.0

I'd say merging is a fair approach. This isn't a unified behavior though, I guess it can depend on the context. Sometimes makes sense to override the whole thing, others to just augment.

Here the SDK tries to add the data it's aware, but doesn't override the whole user object (potentially dropping other data set to the event before it got there):

https://github.com/getsentry/sentry-dotnet/blob/ed48af8003ad9193ed074d92e01f5681d47cc118/src/Sentry.AspNet/Internal/SystemWebRequestEventProcessor.cs#L72-L94

Here it returns a complete User object: https://github.com/getsentry/sentry-dotnet/blob/04214d80af1c2be70450cc0c833828bc24ba164e/src/Sentry.AspNetCore/DefaultUserFactory.cs
https://github.com/getsentry/sentry-dotnet/blob/8bc640f1c5873d9acc10d8a1b05c4ee58e2d4fc2/src/Sentry.AspNetCore/ScopeExtensions.cs#L51-L54

As part of the contract (user factory, returns a whole user).

This isn't a unified behavior though

According to the documentation, it is, even though all SDKs I checked implements this method wrongly (they all simply overwrite the existing value with the new one):

scope.set_user(user): Shallow merges user configuration (email, username, …). Removing user data is SDK-defined, either with a remove_user function or by passing nothing as data.

And it makes sense to me that no merge is done, because the method is called setUser and thus it should be a plain setter. However, I tracked back where the original deprecation was added (#931) and I understand the needs for merging. Anyway, while before the user data was stored in an array and it was simple to do the merge because only existing new keys overwrote the old ones, now that we use an object it's more difficult to know if a property must be overwritten or not: all properties always exist and null may be the new value that you want to set or it may mean that you don't want to consider that property during the merge. How do we discern that?

Yeah setUser merging sounds like the wrong guideline. Docs need to be amended.

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀