How to convert 32-bit compiled binary to 64-bit [closed]

Background: We have acquired a software product that builds to a 32-bit Windows application in Visual Studio. We wish to port this application to 64-bit.

A mission-critical component of this code is a black-box static library (.a file) originally built with gfortran by a third party. The original developer has since passed away, and the Fortran source we were able to obtain is incomplete and is not the version the library was built from (it also contains critical bugs that are not present in the compiled library). They did not use a VCS.

Problem: I would like to create a 64-bit static library whose code is functionally equivalent to the 32-bit static library we have.

What I've Tried:

  • Using the Snowman decompiler to get C++ source code to recompile in 64-bit. This proved impossible because the code that was generated uses low-level intrinsic functions that appear to be gcc-specific. It likely wouldn't work anyway because such intrinsics would compile to code that isn't functionally equivalent in 64-bit. I'd likely need a better decompiler.
  • Apparently x86 assembly is valid x86_64 assembly, so I looked briefly into why I couldn't just run the assembly through a 64-bit assembler. Turns out the ABI is different in 64-bit, so the calling convention won't match. MAYBE I could manually convert the function calls to the proper convention and leave the rest the same, but that might be an intractable problem. Opinions?
Anishaaniso answered 19/6, 2017 at 16:15 Comment(15)
"because the code that was generated uses low-level intrinsic functions that appear to be gcc-specific" Such as? There is very likely a translation of these to the compiler of your choice (MSVC?). I mean, everyone needs a better decompiler, but those aren't exactly available. The whole point of intrinsics is that they decouple you from the platform, so if you used them in a library targeting x86-64, then you'd get x86-64 code. Plus, like you said, x86-32 is very similar to x86-64, so it would be a simple matter of translating the code. Too broad for SO; hire a programmer.Crary
@CodyGray The ones I saw were __return_address() and __zero_stack_offset(). Rather than the generated code using its own function arguments, it was manually replicating the assembly code for reading them off the stack. This won't fly in 64-bit because the ABI is different. You are correct that __return_address() might have the MSVC equivalent _ReturnAddress(), but I can't find an equivalent for __zero_stack_offset(). They are very poorly documented. Thanks for your input.Anishaaniso
"How to convert 32-bit compiled binary to 64-bit" - by recompiling the source code to 64bit.Synonym
@CodyGray "hire a programmer" <- this!!! The company you work for decided to put a mission critical component in a single hand that obviously wasn't qualified enough to make it maintainable. At any time someone can get hit by a bus, so it's just stupidity when a company does this. Generally it's a first hire or some shit, and everything they say is gold. But the company sexually penetrated itself when it let this happen. Tell them they need to hire someone that can re-write the lost software and DOCUMENT IT! And put it in a damn repository! Jeesh!!!Gipon
@CrazyEddie I wish someone had told that to the other company years ago, before we acquired them.Anishaaniso
Well, then the one you work for made the purchase without really reviewing the assets they were buying. So bad on them for that too. They just have to bite the bullet. Any sort of finagling with the binary on your part is not going to be trustworthy. They just need to fix this glaring problem with their new product and acquired company.Gipon
Note that 32 bit machine code is generally not valid 64 bit machine code as many instructions operate differently in 64 bit mode.Latinist
You acquired that other company, now you are them. No one else to blame.Yingling
Since this is on hold as too broad, does anyone have a suggestion for how I can make it more specific and still find an answer to the problem, if any exist?Anishaaniso
The biggest issue here is that you don't have a question—other than maybe "What can I do?" The answer to that is far too broad, for reasons that should already be clear to you, having previously tried to investigate this on your own. The only real answer to this question would be a comprehensive discussion of how to decompile/reverse-engineer and convert code—which is more than could be covered in a book. If you have specific questions about narrowly-defined problems, then you could ask about those. If you were just looking for confirmation this will be very hard, then you've got it.Crary
I mean, if you want to ask about what the MSVC equivalent is for __zero_stack_offset(), then that would be a reasonable question. You'd need to post example disassembly output that showed the use of this intrinsic in context, but then someone could either tell you what the precise equivalent is, or explain what this intrinsic means so that a workaround could be found. That would be sufficiently scoped for SO. But that probably won't help you very much in the way of actually solving this problem. You can't do it one-question-at-a-time. "Hire a programmer" to convert it was popular advice.Crary
The second part of "What I've tried" can be solved. If it is just about the ABI and calling conventions, it is okay. Internal functions can still use the 32 bit ABI. You just have to identify the external calls that go in/out of the library and write wrappers for them to translate the ABI. About your statement that 32 bit assembly is valid 64 bit assembly, umm, mostly! There are some instructions that differ at the byte representation level. You could try though.Hindbrain
Also, of course, you have to make sure that this library is loaded into the address space below 4 GB, but that shouldn't be a problem. You can configure your loader/static linker.Hindbrain
Thanks @CodyGray. I don't think I'm going to be able to narrow this one down, so it's good to close. Unless you think "Is there a known/accepted strategy to accomplish this?" is a valid question. But there's a lot of value in the comment responses here.Anishaaniso
@Ajay That's an interesting idea. I assume that as long as the assembly is the same, the byte representation is functionally irrelevant? Could be an interesting project. My assembly is rusty, and was all on MIPS, but it's something to try.Anishaaniso

You could keep the 32-bit binary library, load it into a 32-bit host process, and use some kind of IPC (shared memory, named pipes, a local-loopback network connection, etc.) to transfer data between it and your 64-bit process.

Another advantage of this approach is that if the Fortran code crashes, it only brings down the child host process, not your main application, and your program can immediately start it again. And if it's a single-threaded Fortran program, you could spin up several instances for multi-core parallelism.
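
As a rough illustration of the host-process idea, here is a minimal, portable sketch in Python. Everything in it is an assumption made for demonstration: the `LegacyHost` class, the length-prefixed message format, and the worker itself, which is a stand-in Python child that sums doubles rather than a real 32-bit executable linked against the Fortran library. In a real Windows application the parent would be the 64-bit program, the child a small 32-bit host built around the `.a` library, and the transport whichever IPC mechanism suits the data sizes involved.

```python
import struct
import subprocess
import sys

# Stand-in for the 32-bit host executable (hypothetical). A real host would
# call into the legacy library; this one sums the doubles it receives and
# exits abruptly on a negative input to simulate a crash inside the library.
WORKER_SOURCE = r"""
import struct, sys
inp, out = sys.stdin.buffer, sys.stdout.buffer
while True:
    header = inp.read(4)
    if len(header) < 4:
        break                      # parent closed the pipe
    (count,) = struct.unpack("<I", header)
    values = struct.unpack(f"<{count}d", inp.read(8 * count))
    if any(v < 0 for v in values):
        sys.exit(3)                # simulated crash
    out.write(struct.pack("<d", sum(values)))
    out.flush()
"""


class LegacyHost:
    """Owns the child host process and restarts it if it dies."""

    def __init__(self):
        self.restarts = 0
        self._start()

    def _start(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
        )

    def call(self, values):
        """Send one request; return the result, or None if the child crashed."""
        # Message format: little-endian uint32 count, then that many doubles.
        payload = struct.pack(f"<I{len(values)}d", len(values), *values)
        try:
            self.proc.stdin.write(payload)
            self.proc.stdin.flush()
            reply = self.proc.stdout.read(8)
        except OSError:
            reply = b""
        if len(reply) < 8:
            # The child went away mid-request: reap it and start a fresh one.
            self.proc.wait()
            self.restarts += 1
            self._start()
            return None
        return struct.unpack("<d", reply)[0]

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()
```

A caller would create one `LegacyHost`, use `call()` wherever it previously called the library directly, and treat a `None` result as "the library crashed, retry or report". Creating several `LegacyHost` instances gives the multi-instance parallelism mentioned above, since each one owns an independent child process.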

Palsy answered 19/6, 2017 at 16:49 Comment(10)
I was afraid you might suggest this...Anishaaniso
That's a hack (at best). Have fun maintaining that going forward.Synonym
@JesperJuhl: That's not a hack. It's essentially, how (and why) COM surrogate processes work. Launching surrogate processes is a solved problem. IPC is a solved problem.Doody
@Doody It's a hack when you don't have the source code but essentially just a black box. You have no way to know that the communication you set up will work in all cases, you have no way to debug problems, you have no way to upgrade dependencies with any real faith that things will keep working. Etc etc. It's a hack.Synonym
@JesperJuhl: This comment may apply to the question, not this answer. You know the interface, and that's all you need to use a black box in the way explained in this answer. This isn't any more hacky than what the OP is doing already.Doody
@Doody I get your point but respectfully disagree (except on the "not more hacky than what OP is already doing" part). Can we leave it at that and just agree to disagree?Synonym
@JesperJuhl - it's the only real option available other than recreating the library. This option would at least allow them to get a product out the door--if they trust the library, which of course they should not, but they may have to. Then they can do the huge amount of work of recreating a central component that--being in FORTRAN--is probably full of all kinds of math in order to solve "something". It is a hack, and of course there's great danger the company would just choose to keep using it...but that's their decision to make.Gipon
@Crazy Eddie You really consider it responsible to "get a product out the door" that's based on a binary blob you can't debug and don't know exactly how it works? I don't.Synonym
Remember, they have that already, @Jesper. It's just in 32 bits.Crary
@JesperJuhl - No, I do not. But I live in the real world where companies make stupid decisions like that and it's my job to tell them all the options I can think of. Edit: And the costs of each.Gipon
