protocolbuffers/protobuf

AttributeError: 'NoneType' object has no attribute 'message_types_by_name'

oldnapalm opened this issue · 35 comments

Version: 4.21.1
Language: Python
Windows 10
protoc-21.1-win64

Using version 3.20.1 it works, but with 4.21.1 I get AttributeError: 'NoneType' object has no attribute 'message_types_by_name'
https://github.com/oldnapalm/zwift-offline/blob/1b3e9d16e903b452d37a77675a38b402abb1e431/protobuf/per_session_info_pb2.py#L22

One difference between this message and others that work is that it has only a repeated field
https://github.com/oldnapalm/zwift-offline/blob/1b3e9d16e903b452d37a77675a38b402abb1e431/protobuf/per-session-info.proto#L9

I saw this on one of my projects too. In my case, the common factor seems to be that the code ended up importing this particular module (using importlib) twice. The first time it worked, but the second time it died with that error. I was able to resolve the issue by having it cache the module after importing it the first time, and that fixed it.
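As a sketch, that caching workaround can look like the following (`load_pb2_once` and its arguments are my own names, not from the project; the key point is registering the module in `sys.modules` before executing it, the way a regular import does):

```python
import importlib.util
import sys

def load_pb2_once(module_name, path):
    """Load a generated _pb2 module at most once per process.

    Re-executing a _pb2 module's top-level code registers its file with the
    process-wide descriptor pool a second time, which is what triggers the
    AttributeError on protobuf 4.21.x.
    """
    if module_name in sys.modules:       # already loaded: reuse the cache
        return sys.modules[module_name]
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module    # cache before exec, like a normal import
    spec.loader.exec_module(module)
    return module
```

With this helper, repeated loads of the same file return the identical module object instead of executing its top-level code again.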

A Linux user reported another AttributeError with version 4.21.1, while with 3.20.1 it works.

https://github.com/oldnapalm/zwift-offline/blob/1b3e9d16e903b452d37a77675a38b402abb1e431/zwift_offline.py#L2530

AttributeError: 'EnumTypeWrapper' object has no attribute 'RealmID'

https://github.com/oldnapalm/zwift-offline/blob/1b3e9d16e903b452d37a77675a38b402abb1e431/protobuf/udp-node-msgs.proto#L5

Turns out this issue is also happening in a different project here at Relativity, one which does not use importlib and instead just uses a certain arrangement of import statements in a pytest suite. This one cannot be worked around, and we're having to pin protobuf to 3.x because of this issue.

However, I was able to set up a minimal example of the issue using importlib. It reproduces reliably on Ubuntu 20 with protoc 3.21 and Python protobuf version 4.21.1.

First create foo.proto:

syntax = "proto2";

message FooMessage
{
	repeated int32 someInt = 1;
}

Then create test.py:

import importlib.util

for counter in range(2):
    imported_process = importlib.util.spec_from_file_location("foo_pb2", "foo_pb2.py")
    module = importlib.util.module_from_spec(imported_process)
    imported_process.loader.exec_module(module)

Then compile and run it:

$ protoc --python_out=. foo.proto
$ python3 test.py

This gets:

Traceback (most recent call last):
  File "test.py", line 6, in <module>
    imported_process.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/jamie/dev/protobuf-4-bug-repro/foo_pb2.py", line 18, in <module>
    _builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, globals())
  File "/home/jamie/.local/lib/python3.8/site-packages/google/protobuf/internal/builder.py", line 64, in BuildMessageAndEnumDescriptors
    for (name, msg_des) in file_des.message_types_by_name.items():
AttributeError: 'NoneType' object has no attribute 'message_types_by_name'

We are running into a similar issue on Windows. Basically, we generated the Python files using protobuf 3.19.4, and using the Python protobuf package 4.21.1 then results in the following errors:

....
  File "c:\install\ray\python\ray\core\generated\runtime_env_common_pb2.py", line 21, in <module>
    _PIPRUNTIMEENV = DESCRIPTOR.message_types_by_name['PipRuntimeEnv']
AttributeError: 'NoneType' object has no attribute 'message_types_by_name'


In my case the problem was caused by adding a module directory to the sys path and using `import directory.module`. Using `import module` instead, it works with 4.21.1 without issue.
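The double execution behind this can be shown with a plain module, nothing protobuf-specific (the `pkg`/`mod` layout below is made up for illustration): when the same file is reachable under two different module names, Python executes it once per name and produces two distinct module objects. For a _pb2 file, that second execution re-registers the file with the process-wide descriptor pool, which is where 4.21.x falls over.

```python
import os
import sys
import tempfile

# Build a tiny package on disk: root/pkg/mod.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "pkg")
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "mod.py"), "w") as f:
    f.write("VALUE = 1  # module-level code runs once per distinct import name\n")

sys.path.insert(0, root)     # makes "pkg.mod" importable
sys.path.insert(0, pkg_dir)  # ALSO makes bare "mod" importable -- the mistake

import pkg.mod
import mod

# Two different sys.modules entries, two module objects, two executions.
print(pkg.mod is mod)  # False
```

Keeping only one of the two `sys.path` entries (so the module has exactly one canonical name) avoids the second execution entirely.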

Upgrading protobuf fixed this issue for me
pip install --upgrade protobuf

The example I posted above is still broken with protobuf 4.21.2

HiceS commented

I'm experiencing the same issue

using protoc version 3.21.2 from homebrew

Tried using pip protobuf packages 4.21.2 and 4.21.1

I have not had any success using different package import resolution methods.

File "/Users/hices/Library/Python/3.9/lib/python/site-packages/google/protobuf/internal/builder.py", line 64, in BuildMessageAndEnumDescriptors
    AttributeError: 'NoneType' object has no attribute 'message_types_by_name'

Could be because I'm still running Python 3.9; there seems to be a soft suggestion that Python 3.9 has a specific protoc version linked to it. If anyone knows, let me know. There is a brew protobuf target for Python 3.9.

An update on this since I ran into this issue while integrating the new Protobuf version with gRPC. The issue seems to be that AddSerializedFile will return None on subsequent invocations for the same file and same symbol DB. When None gets passed to BuildMessageAndEnumDescriptors, it has this error.

Since the symbol DB is a process-level singleton, this means that if the module-level code of a _pb2.py file is executed multiple times within a single process, you'll encounter this error. Normally this should be fine, since modules are cached and subsequent imports of the same _pb2 will not cause its module-level code to be executed again.

The issue in our repo was that there was a latent bug in our test runner that imported each _pb2 multiple times under different names. By ensuring that each _pb2 was only imported once under a single module name, this problem was avoided entirely.
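To make that concrete, here is a toy model of the failure mode (my own illustration, not protobuf's actual code): a process-level pool whose AddSerializedFile returns the descriptor on the first registration but None on a repeat, so generated code that immediately dereferences the result dies exactly like the tracebacks above.

```python
class ToyDescriptorPool:
    """Stand-in for the process-wide descriptor pool (illustration only)."""

    def __init__(self):
        self._files = {}

    def AddSerializedFile(self, name, serialized_pb):
        if name in self._files:
            # Models the 4.21.x bug: re-registering the same file returns
            # None instead of the already-registered descriptor.
            return None
        desc = {"name": name, "message_types_by_name": {}}
        self._files[name] = desc
        return desc

_default_pool = ToyDescriptorPool()  # singleton, like protobuf's default pool

first = _default_pool.AddSerializedFile("foo.proto", b"...")
second = _default_pool.AddSerializedFile("foo.proto", b"...")
print(first is not None)  # True: first registration returns a descriptor
print(second)             # None: this is what the _pb2 code then dereferences
```

Generated _pb2 code does the equivalent of `second["message_types_by_name"]` on that return value, hence the AttributeError on NoneType.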

How do you do this?

Can you explain or provide the changed files?

Hey, we have run into a similar issue on Ubuntu 16.04 running Python 3.7, and it has been a problem for some time. I had no success using different packages or any of the other suggested solutions.
All we want to do is access google.cloud.pubsub_v1.types, and we get the error message when we just run the tests for all the import rules.
More details are noted in this ticket: bazelbuild/rules_python#783.

from google.cloud import pubsub_v1
  File "/home/me/.cache/bazel/_bazel/xx/sandbox/processwrapper-sandbox/3/execroot/bazel-out/k8-fastbuild/bin/binary_smoke_test.runfiles/deps/pypi__google_cloud_pubsub/google/cloud/pubsub_v1/__init__.py", line 17, in <module>
    from google.cloud.pubsub_v1 import types
  File "/home/me/.cache/bazel/_bazel/xx/sandbox/processwrapper-sandbox/3/execroot/bazel-out/k8-fastbuild/bin/binary_smoke_test.runfiles/deps/pypi__google_cloud_pubsub/google/cloud/pubsub_v1/types.py", line 26, in <module>
    from google.api import http_pb2  # type: ignore
  File "/home/me/.cache/bazel/_bazel/xx/sandbox/processwrapper-sandbox/3/execroot/bazel-out/k8-fastbuild/bin/binary_smoke_test.runfiles/deps/pypi__googleapis_common_protos/google/api/http_pb2.py", line 36, in <module>
    _HTTP = DESCRIPTOR.message_types_by_name["Http"]
AttributeError: 'NoneType' object has no attribute 'message_types_by_name'

Any suggestion would be appreciated.

I have the same error, as seen below.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [12], in <cell line: 1>()
----> 1 import gmsh

File ~\AppData\Roaming\Python\Python39\site-packages\gmsh.py:53
     50 else:
     51     libpath = find_library("gmsh")
---> 53 lib = CDLL(libpath)
     55 try_numpy = True  # set this to False to never use numpy
     57 use_numpy = False

File C:\ProgramData\Anaconda3\lib\ctypes\__init__.py:364, in CDLL.__init__(self, name, mode, handle, use_errno, use_last_error, winmode)
    362     import nt
    363     mode = nt._LOAD_LIBRARY_SEARCH_DEFAULT_DIRS
--> 364 if '/' in name or '\\' in name:
    365     self._name = nt._getfullpathname(self._name)
    366     mode |= nt._LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR

TypeError: argument of type 'NoneType' is not iterable

Help me to overcome this problem.
The first time it works.

I was able to fix our issues by upgrading grpcio and adding grpcio-tools to our dependencies. Previously, we were only on grpcio==1.37.1:

grpcio==1.47.0
grpcio-tools==1.47.0

Installing just grpcio==1.47.0 caused a separate free() on a null pointer, which pointed me to this solution: googleapis/python-pubsub#414

I wanted to understand what had happened, so I dug into the code. In my case, I was running on an M1 Mac. All of our builds were running perfectly fine under our x86-based infra which was the first clue. Looking into the code on version 1.37.1:

if _USE_C_DESCRIPTORS:
  # This metaclass allows to override the behavior of code like
  #   isinstance(my_descriptor, FieldDescriptor)
  # and make it return True when the descriptor is an instance of the extension
  # type written in C++.
  class DescriptorMetaclass(type):

    def __instancecheck__(cls, obj):
      if super(DescriptorMetaclass, cls).__instancecheck__(obj):
        return True
      if isinstance(obj, cls._C_DESCRIPTOR_CLASS):
        return True
      return False
else:
  # The standard metaclass; nothing changes.
  DescriptorMetaclass = type

if _USE_C_DESCRIPTORS:
  _C_DESCRIPTOR_CLASS = _message.FileDescriptor

  def __new__(cls, name, package, options=None,
              serialized_options=None, serialized_pb=None,
              dependencies=None, public_dependencies=None,
              syntax=None, pool=None, create_key=None):
    # FileDescriptor() is called from various places, not only from generated
    # files, to register dynamic proto files and messages.
    # pylint: disable=g-explicit-bool-comparison
    if serialized_pb == b'':
      # Cpp generated code must be linked in if serialized_pb is ''
      try:
        return _message.default_pool.FindFileByName(name)
      except KeyError:
        raise RuntimeError('Please link in cpp generated lib for %s' % (name))
    elif serialized_pb:
      return _message.default_pool.AddSerializedFile(serialized_pb)
    else:
      return super(FileDescriptor, cls).__new__(cls)
Note that it attempts to load the C++ extensions. If they aren't present (which they aren't by default under ARM64), you end up with a completely different metaclass for the descriptor.

In our case, we were calling AddSerializedFile. In the pure Python implementation, there's no return statement, which led to our NoneType error.

def AddSerializedFile(self, serialized_file_desc_proto):
  """Adds the FileDescriptorProto and its types to this pool.

  Args:
    serialized_file_desc_proto (bytes): A bytes string, serialization of the
      :class:`FileDescriptorProto` to add.
  """
  # pylint: disable=g-import-not-at-top
  from google.protobuf import descriptor_pb2
  file_desc_proto = descriptor_pb2.FileDescriptorProto.FromString(
      serialized_file_desc_proto)
  self.Add(file_desc_proto)

As MattDietz mentioned, you need the AddSerializedFile function to return the descriptor. All you have to do is add

 file_desc = self._ConvertFileProtoToFileDescriptor(file_desc_proto)
 return file_desc

at the end of the AddSerializedFile function in the descriptor_pool.py file.

Yeah, this is absolutely an issue when hit in a Jupyter notebook. We use GRPC to communicate with services in our notebooks and the pb2 file is loaded twice and there's not much we can do about it.

A temporary work-around is to downgrade your grpcio-tools version to 1.48.1:

pip install -Iv grpcio-tools==1.48.1

I've confirmed that this fixes the issue.

@haberman Can someone on the protobuf team please look into fixing this? While it might not be orthodox to load a single proto into the same process multiple times, it's clearly something that has seeped its way into many people's workflows.

Fix is pending in: protocolbuffers/upb#804

The fix has just been submitted so imma close this.

Which release on pip will have this solved?

@haberman @ericsalo Ditto. What is the planned timeline for releasing this patch to PyPI? Reports of affected users continue to roll into the gRPC repo. Is this in the recent 4.21.7 release?

Doesn't seem to be fixed in 4.21.7. Have just tripped over this setting up a Windows 10 VM with

We are cutting a release shortly that will have the fix. This should land in the next week or so.

@haberman I just upgraded to protobuf (4.21.8) with grpcio & tools (1.50.0) and still getting same error :-(

I'm also still seeing it in 4.21.8

For people who are still seeing the error: are you using pure-Python or the C acceleration (upb)?

What is output if you execute this?

$  python -c 'from google.protobuf.internal import api_implementation; print(api_implementation.Type())'

My theory is that we fixed this for api implementation type upb but it is still broken for api implementation type python.

Both of the Python environments that I'm using the protobufs in report upb.

Same here - both of my Python envs report upb with the latest version, and both continue to show the same error.

It looks like the fix was not included in the release due to a hiccup in the release process.

Sorry about this -- we will be releasing again within the next week and will ensure that the fix goes in.

since this is not yet released, can you please keep this issue open?

Sure, re-opening until the fix is released.

Please don't forget this fix :)

Just wanted to check back in here and see if we're still planning on releasing this fix soon - we're still waiting on this to be fixed so we can update our version and take advantage of the newer features.

I'm hitting it too. Seems to be specific to newer Python 3 versions?

This was fixed in 4.21.9.

Here is a minimal repro, you can verify it no longer occurs in 4.21.9:

# test.py
from google.protobuf import descriptor_pool as _descriptor_pool

desc1 = _descriptor_pool.Default().AddSerializedFile(b'\n\ntest.proto\"\x03\n\x01M')
desc2 = _descriptor_pool.Default().AddSerializedFile(b'\n\ntest.proto\"\x03\n\x01M')

print(desc1)
print(desc2)

assert desc1 is not None
assert desc2 is not None
assert desc1 is desc2

Demo:

$ python -m venv /tmp/venv
$ /tmp/venv/bin/pip install protobuf==4.21.8
[...]
Successfully installed protobuf-4.21.8
$ /tmp/venv/bin/python test.py 
<google._upb._message.FileDescriptor object at 0x7f887c0fbc00>
None
Traceback (most recent call last):
  File "/usr/local/google/home/haberman/test.py", line 10, in <module>
    assert desc2 is not None
AssertionError
$ /tmp/venv/bin/pip install protobuf==4.21.9
[...]
Successfully installed protobuf-4.21.9
$ /tmp/venv/bin/python test.py 
<google._upb._message.FileDescriptor object at 0x7f73b6907c00>
<google._upb._message.FileDescriptor object at 0x7f73b6907c00>