segmentio/analytics-python

AssertionError when user_id=0 and anonymous_id=None

twhyte opened this issue · 4 comments

Occasionally we need to send an event where user_id=0 and anonymous_id=None. Attempting to do so produces the following error:

AssertionError: user_id or anonymous_id must have (<class 'numbers.Number'>, (<class 'str'>,)), got: None

This error appears to originate in the type check made here: https://github.com/segmentio/analytics-python/blob/master/segment/analytics/client.py#L101
The expression user_id or anonymous_id evaluates to None when user_id is 0 and anonymous_id is None, because Python's or operator returns its last operand when every operand is falsy.
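
To make the behaviour described above concrete, here is a minimal sketch that does not touch the library at all. The two helper functions are hypothetical names introduced only for illustration: the first mirrors the or expression, the second shows one possible alternative that falls back only when user_id is actually None. It is not a claim about how the library should be patched.

```python
from numbers import Number


def pick_id_with_or(user_id, anonymous_id):
    # Mirrors the expression discussed above: any falsy user_id
    # (0, "", None) falls through to anonymous_id.
    return user_id or anonymous_id


def pick_id_explicit(user_id, anonymous_id):
    # Hypothetical alternative: fall back only when user_id is None,
    # so a legitimate value of 0 is preserved.
    return user_id if user_id is not None else anonymous_id


# With user_id=0 and anonymous_id=None the `or` form yields None,
# which is what the type check then rejects.
print(pick_id_with_or(0, None))

# The explicit form keeps 0, which is an instance of numbers.Number.
chosen = pick_id_explicit(0, None)
print(chosen, isinstance(chosen, (Number, str)))
```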

@twhyte Hi, do you mind sharing with me your track and/or identify object so I can test?

Thank you.

As per the documentation, the user_id field is required, and anonymous_id is required if there is no user_id. I am unable to replicate your results unless there is no entry in either the user_id or the anonymous_id field. Even "0" and "None" register a valid response in the debugger. It is not possible to pass in an identify object without a reference in the user_id field. "0" and "None" are valid, but they must be present and cannot be omitted from the identify object. I will be happy to work on this issue with you further if you can verify that the identify object you are sending to Segment's servers has all of the required fields present.

Thank you.

@nd4p90x Thanks for checking into this for us! We are able to reproduce this issue in the context of a unit test. For context, we're using analytics-python==1.4.0 together with the Segment Mock API container for testing purposes.

Here is the test payload we are working with for a track event:

{'properties': {'cookie': 'oreo'}, 'event': 'test.event', 'context': {'traits': 'sample'}, 'timestamp': datetime.datetime(2021, 4, 1, 14, 46, 16, tzinfo=datetime.timezone(datetime.timedelta(0), '+0000')), 'user_id': 0}

We're using the following code to submit the event to Segment:

def send_track(client, *args, **kwargs):
    was_enqueued, _ = client.track(*args, **kwargs)
    if not was_enqueued:
        raise SegmentClientException("Client queue is full.")

ipdb gives this output for the arguments within the client's track() call:

ipdb> s
> /usr/local/lib/python3.8/site-packages/analytics/client.py(125)track()
    124               message_id=None):
--> 125         properties = properties or {}
    126         context = context or {}

ipdb> a
self = <analytics.client.Client object at 0x7fe344e24820>
user_id = 0
event = 'test.event'
properties = {'cookie': 'oreo'}
context = {'traits': 'sample'}
timestamp = datetime.datetime(2021, 4, 1, 14, 46, 16, tzinfo=datetime.timezone(datetime.timedelta(0), '+0000'))
anonymous_id = None
integrations = None
message_id = None

The test fails with the following exception:

FAIL: test_sends_track_to_segment_with_zero_user_id (tests.app.worker.test_processors.SegmentTrackProcessorTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/tests/app/worker/test_processors.py", line 124, in test_sends_track_to_segment_with_zero_user_id
    processor.process_message(json.dumps(event))
  File "/opt/app/worker/processors.py", line 37, in process_message
    self.send(validated)
  File "/opt/app/worker/decorators.py", line 10, in catch_segment_client_errors
    return func(*args, **kwargs)
  File "/opt/app/worker/processors.py", line 71, in send
    self.client.track(**message)
  File "/opt/app/utils/segment.py", line 65, in track
    send_track(self.client, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/backoff/_sync.py", line 94, in retry
    ret = target(*args, **kwargs)
  File "/opt/app/utils/segment.py", line 61, in send_track
    was_enqueued, _ = client.track(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/analytics/client.py", line 125, in track
    properties = properties or {}
  File "/usr/local/lib/python3.8/site-packages/analytics/client.py", line 325, in require
    raise AssertionError(msg)
AssertionError: user_id or anonymous_id must have (<class 'numbers.Number'>, (<class 'str'>,)), got: None

We see the same error with a corresponding unit test for identify events using the following payload:

{'user_id': 0, 'traits': {'name': 'Cookie Monster'}, 'timestamp': datetime.datetime(2021, 4, 1, 14, 46, 16, tzinfo=datetime.timezone(datetime.timedelta(0), '+0000'))}

@twhyte Thank you for that.

I was able to re-create your issue. However, due to the Spec requirements, at least one of the fields must be present and must have a valid number or string assigned to it. The only way around this would be to pass your IDs as strings when zero or None is a valid value for you. At this time we are not planning on modifying the existing requirements.
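
For anyone landing here, the string workaround mentioned above could look roughly like the sketch below. The helper name stringify_ids is hypothetical and not part of analytics-python. It only rewrites the payload dict before it is handed to the client, and it assumes that a string ID such as "0" is acceptable downstream.

```python
def stringify_ids(message):
    """Return a copy of `message` with non-None IDs converted to str.

    Hypothetical helper: a non-empty string such as "0" is truthy, so it
    survives an `or` fallback where the integer 0 would not.
    """
    normalized = dict(message)  # shallow copy, leave the caller's dict alone
    for key in ("user_id", "anonymous_id"):
        if normalized.get(key) is not None:
            normalized[key] = str(normalized[key])
    return normalized


# Trimmed version of the track payload from this thread.
payload = {
    "event": "test.event",
    "properties": {"cookie": "oreo"},
    "user_id": 0,
}

safe_payload = stringify_ids(payload)
print(repr(safe_payload["user_id"]))
# The converted payload would then be passed on as before,
# e.g. client.track(**safe_payload).
```

Note that this changes the ID's type from number to string, so any downstream tooling that keys on the numeric value would need to be checked.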