-#from google.protobuf import descriptor_pb2 as google_dot_protobuf_dot_descriptor__pb2 # works
-from google_test.protobuf import descriptor_pb2 as google_dot_protobuf_dot_descriptor__pb2 # broken
+from google.protobuf import descriptor_pb2 as google_dot_protobuf_dot_descriptor__pb2
It looks like you're trying to insert your own descriptor into generated code? I'm kind of surprised this ever worked. What are you trying to do?
@mkruskal-google yes, we're inserting our own descriptor into the generated code. My understanding is that this is something that's necessary for our confluent kafka schema registry lookups to work correctly.
But the base issue is that this works fine in the python implementation and the cpp implementation on 3.20.3, and it works fine on the python implementation on 4.21.12 as well. It's only the upb implementation that it fails on.
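For anyone reproducing this, here is a minimal sketch for checking which backend is active and for forcing the pure-Python one while diagnosing. It assumes the `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION` environment variable and the internal `api_implementation.Type()` helper behave as they do in 3.20.x and 4.21.x; the function name `active_protobuf_backend` is made up for illustration.

```python
import os

# Must be set before google.protobuf is imported for the first time.
# "python" selects the pure-Python backend; "upb" is the default in 4.21+.
os.environ.setdefault("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION", "python")


def active_protobuf_backend():
    """Return the active backend name, or None if protobuf is not installed."""
    try:
        # Internal module, so treat this as a diagnostic aid only.
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


print(active_protobuf_backend())
```

This is only a way to confirm that the failure is specific to upb. The pure-Python backend is slower, so it is a workaround rather than a fix.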
Yes, I'm aware.
The problem is that the implementations other than upb do not have an issue with this. I'm also trying to see if we can get rid of this internally, but for years this has worked fine with both the cpp and python implementations, and it only broke with the 4.21 release, when upb became the default.
Here's the internal user story:
"I want to ensure our users can pick up their code bindings and be able to use them without any modifications
So that our developer experience is as seamless as possible, and communication with Kafka and the schema registry works correctly.
Notes:
We currently need to do some adjustment to our python code bindings before these are pushed to local PyPi. This will ensure producers/consumers will use matching schemas with the ones produced in the schema registry.
The main issue is that the protobuf library for python includes the google common code bindings as well, and these take precedence over our own descriptor.proto. Unfortunately, the internal descriptor fields seem to be different, and schema registry doesn't like that.
By doing some tweaks to our produced code bindings, however, we can make this work. Steps to modify the code bindings:
AFTER generating the python code bindings, we should:
rename the google/protobuf folder to google_test/protobuf
any generated code binding that imports from our google.protobuf needs to have its import changed to point to google_test/protobuf - only the import of a file we own needs to be changed"
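The two post-generation steps in the user story could be sketched roughly as follows. This is not our actual build script: the function names, the `owned_modules` parameter, and the assumption that generated files end in `_pb2.py` and use the `from google.protobuf import <module>` import form are all illustrative.

```python
import re
from pathlib import Path

# Package names taken from the steps above.
OLD_PACKAGE = "google.protobuf"
NEW_PACKAGE = "google_test.protobuf"


def rewrite_imports(source: str, owned_modules: set) -> str:
    """Repoint `from google.protobuf import <module>` at google_test.protobuf,
    but only for modules we own (e.g. descriptor_pb2)."""
    pattern = re.compile(
        r"^from " + re.escape(OLD_PACKAGE) + r" import (\w+)",
        flags=re.MULTILINE,
    )

    def replace(match):
        module = match.group(1)
        if module in owned_modules:
            return f"from {NEW_PACKAGE} import {module}"
        return match.group(0)  # leave protobuf's own modules untouched

    return pattern.sub(replace, source)


def relocate_bindings(root: Path, owned_modules: set) -> None:
    """Rename google/protobuf to google_test/protobuf, then fix imports."""
    old_dir = root / "google" / "protobuf"
    new_dir = root / "google_test" / "protobuf"
    if old_dir.is_dir():
        new_dir.parent.mkdir(parents=True, exist_ok=True)
        old_dir.rename(new_dir)
    for path in root.rglob("*_pb2.py"):
        text = path.read_text(encoding="utf-8")
        updated = rewrite_imports(text, owned_modules)
        if updated != text:
            path.write_text(updated, encoding="utf-8")
```

A call such as `relocate_bindings(Path("build"), {"descriptor_pb2"})` would produce the import shown as "broken" in the diff at the top of this issue.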
Here are the descriptors in a zip file: ours is called our_descriptor_pb2.py, and the Google one is called your_descriptor_pb2.py.
descriptors.zip
Note that the Google descriptor is 107 KB and our descriptor is 119 KB.
Our descriptor was generated using buf 1.1.0 and protoc 3.21.12 as seen here: https://github.com/Atheuz/test-protobuf-schema-error/tree/master/build.
Note that if you generate it with protoc 3.21.12 in the build/external/google/protobuf directory using the command protoc -I=. --python_out=. descriptor.proto you end up with a small descriptor_pb2.py file of like 16 KB, compared to when you generate it using buf in the build directory using the command buf generate --config=buf.yaml --template=buf.gen.yaml where you end up with the 119 KB descriptor_pb2.py file.
I'm not aware of any feature that supports swapping out descriptor.proto (other than a post-generation find/replace). What are you trying to change in it?
Hi, I have the same issue here. I have a project that needs to be kept up to date with a proto schema, so I have a job that updates and replaces the schema. When the proto changes, I get the following error, but if it does not change, it works fine!
TypeError: Couldn't build proto file into descriptor pool: duplicate symbol
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.
This issue is labeled inactive because the last activity was over 90 days ago.
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please reopen it.
This issue was closed and archived because there has been no new activity in the 14 days since the inactive label was added.