Static linking with generated protobufs causes abort

I have a project that compiles C++ generated protobuf serializers into a static library. An executable links against this library, and so does an .so (.dll). The executable later loads the .so file. When that happens, I get:

[libprotobuf ERROR /mf-toolchain/src/protobuf-3.0.0-beta-1/src/google/protobuf/descriptor_database.cc:57] File already exists in database: mri.proto
[libprotobuf FATAL /mf-toolchain/src/protobuf-3.0.0-beta-1/src/google/protobuf/descriptor.cc:1128] CHECK failed: generated_database_->Add(encoded_file_descriptor, size): 
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: generated_database_->Add(encoded_file_descriptor, size): 
Aborted (core dumped)

Just to be quite clear, I have one static library A, which is linked to program P and shared library S. Later, P loads S, and I get the error above.

I looked at similar errors on Stack Overflow and on Google in general, but I'm quite sure that I'm only linking against the library, not recompiling the source. As far as I can tell, this should mean that the compiled-in data is the same.

Also note: this problem only happens on Linux. This works fine on Windows and OS X.

Nagpur answered 8/10, 2015 at 14:0 Comment(0)
B
10

The problem is that your static library contains a file mri.pb.cc which, in its global initializers, is registering type descriptors into the global descriptor database maintained by libprotobuf. Because your static library is loaded into your program twice, this initializer is running twice, but because you have only one copy of libprotobuf in your process, both initializers are registering into the same global database, and it is detecting a conflict.

To fix this problem, you need to change your static library into a shared library, which both the main program and the dynamically-loaded library depend on.

I am not sure why you see different behavior on Windows or OS X. My best guess is that on these platforms, you are actually linking two separate copies of libprotobuf into your program -- one in the main executable and one in the dynamically-loaded library. Thus there are two copies of the descriptor database and no conflict. However, you are likely to see much more subtle problems here. If you ever transfer protobuf object pointers between the main program and the dynamically-loaded module (without serializing and then parsing again), you could end up with a protobuf object created by one copy of the library but used with another copy (and therefore a different copy of the descriptor database), which will confuse the library and cause weird things to happen.

Alternatively, if you don't ever pass protobuf objects across the boundary, you might be able to "fix" the problem on Linux by linking libprotobuf statically, in order to get two copies as described above. But this is pretty dangerous; I wouldn't recommend it.

Baggott answered 9/10, 2015 at 6:11 Comment(6)
I was linking libprotobuf statically, and that didn't fix it. Is there any problem with static linking as long as I use the lite runtime and never pass around pointers? - Nagpur
@Nagpur Generally, Linux's approach to dynamic linking does not play well with loading multiple copies of the same symbols into the same process. It's possible that the copy of libprotobuf in the main executable is overriding the one in the dynamic library, so they're both sharing one copy, hence the problem. You will probably have to link libprotobuf and your library dynamically to completely avoid multiple copies. - Baggott
Turns out that wasn't the problem. This bug was recently fixed, and had to do with a poorly designed static initializer. - Nagpur
Hi @Christopher, I ran into the same problem when I added two separate static libs containing pb on iOS. Could you tell me in which version Google protobuf fixed this error? A link would be very helpful. Thank you. - Humic
@johnma: I think it is protobuf3-beta3. They marked my bug closed a while back, although I can't seem to find it. - Nagpur
@Christopher: I finally fixed this by changing one of the protobuf namespaces from google::xxx to mypb::xxx, and that solved my problem. - Humic
I
2

This error appeared for me in the context of an executable linking to two libraries (LibA, LibB) that both happened to compile the same proto message, where LibB depends on LibA and links to it.

I ended up in this situation because LibA previously had no dependency on protobuf at all, while LibB built the full set of proto messages that this in-house tooling application uses to communicate with another application. A new release of LibA introduced dependencies on two other libraries that compile various proto messages (LibC, LibD). The problem manifested equally from both LibC and LibD; I'll discuss LibC, since the solution was identical.

At load time the application loaded LibC, and eventually it came to load the uppermost module, LibB, at which point an abort was triggered down in LogMessage::Finish() in common.cc. I found out who was doing the double loading by setting a breakpoint a few levels up from the abort context. Here's the relevant source for the doubly loaded proto message, which I'm calling SomeMessage:

void LibC_AddDesc_SomeMessage_2eproto() {
  static bool already_here = false; // <--- Breakpoint Set Here
  if (already_here) return;
  already_here = true;
  // ... rest of the generated registration code omitted
}

Breakpoint Hit 1: Loading LibC

LibC.dll!MyNamespace::LibC_AddDesc_SomeMessage_2eproto()  Line 415  C++
LibC.dll!MyNamespace::LibC_AddDesc_ParentMessage_2eproto()  Line 388    C++
LibC.dll!MyNamespace::StaticDescriptorInitializer_ParentMessage_2eproto::StaticDescriptorInitializer_ParentMessage_2eproto()  Line 494  C++
LibC.dll!MyNamespace::`dynamic initializer for 'static_descriptor_initializer_ParentMessage_2eproto_''()  Line 495 + 0x21 bytes C++
msvcr100d.dll!_initterm()  + 0x2c bytes 
LibC.dll!_CRT_INIT(void * hDllHandle, unsigned long dwReason, void * lpreserved)  Line 289  C
LibC.dll!__DllMainCRTStartup(void * hDllHandle, unsigned long dwReason, void * lpreserved)  Line 506 + 0x13 bytes   C
LibC.dll!_DllMainCRTStartup(void * hDllHandle, unsigned long dwReason, void * lpreserved)  Line 477 C
ntdll.dll!LdrpRunInitializeRoutines()  + 0x1e8 bytes    
ntdll.dll!LdrpInitializeProcess()  - 0x14c9 bytes   
ntdll.dll!string "Enabling heap debug options\n"()  + 0x29a99 bytes 
ntdll.dll!LdrInitializeThunk()  + 0xe bytes 

While LibC was loading, I could see the breakpoint being hit twice: on the first hit the static variable already_here went from false to true, and on the second hit it was still true, so registration of this message was skipped.

Breakpoint Hit 2: Loading LibB

When LibB attempted its load, its copy of already_here started out as false again, so we proceeded to register this message a second time, which triggered the abort.

LibB.dll!MyNamespace::LibC_AddDesc_SomeMessage_2eproto()  Line 415  C++
LibB.dll!MyNamespace::LibC_AddDesc_ParentMessage_2eproto()  Line 388    C++
LibB.dll!MyNamespace::LibC_AddDesc_FullMessage_2eproto()  Line 219  C++
LibB.dll!MyNamespace::StaticDescriptorInitializer_FullMessage_2eproto::StaticDescriptorInitializer_FullMessage_2eproto()  Line 358  C++
LibB.dll!MyNamespace::`dynamic initializer for 'static_descriptor_initializer_FullMessage_2eproto_''()  Line 359 + 0x21 bytes   C++
msvcr100d.dll!_initterm()  + 0x2c bytes 
LibB.dll!_CRT_INIT(void * hDllHandle, unsigned long dwReason, void * lpreserved)  Line 289  C
LibB.dll!__DllMainCRTStartup(void * hDllHandle, unsigned long dwReason, void * lpreserved)  Line 506 + 0x13 bytes   C
LibB.dll!_DllMainCRTStartup(void * hDllHandle, unsigned long dwReason, void * lpreserved)  Line 477 C
ntdll.dll!LdrpRunInitializeRoutines()  + 0x1e8 bytes    
ntdll.dll!LdrpInitializeProcess()  - 0x14c9 bytes   
ntdll.dll!string "Enabling heap debug options\n"()  + 0x29a99 bytes 
ntdll.dll!LdrInitializeThunk()  + 0xe bytes 

... and we'd wind up in stubs/common.cc at the abort line

void LogMessage::Finish() {
  bool suppress = false;

  if (level_ != LOGLEVEL_FATAL) {
    InitLogSilencerCountOnce();
    MutexLock lock(log_silencer_count_mutex_);
    suppress = internal::log_silencer_count_ > 0;
  }

  if (!suppress) {
    internal::log_handler_(level_, filename_, line_, message_);
  }

  if (level_ == LOGLEVEL_FATAL) {
    abort(); // <----- runtime crash!
  }
}

And on stderr you'd find the following text:

libprotobuf ERROR descriptor_database.cc:57] File already exists in database: SomeMessage.proto
libprotobuf FATAL descriptor.cc:860] CHECK failed: generated_database_->Add(encoded_file_descriptor, size):

The solution was simple: I opened the LibC project, searched for pb to see which proto messages it compiles, and removed those proto messages from LibB. The same was done for LibD.

Idealist answered 10/8, 2016 at 22:34 Comment(0)
C
1

I also faced this issue and resolved it just now: I had forgotten to link pthread. After linking pthread, the problem went away. Posting here in case someone else missed it.

Camacho answered 22/6, 2022 at 11:29 Comment(0)
