iPhone Mach-O binaries, string storage, __TEXT/__DATA

I am attempting to read constant (or initialization) strings from an iPhone Mach-O binary file. I understand that the three relevant segment.section pairs are __TEXT.__cstring, __TEXT.__ustring, and __DATA.__cfstring. However, even though I know the string information is stored in these three blocks of data, which I have extracted, I cannot make any sense of it. It all looks like garbage, and I do not see any recognizable character strings. Can anyone shed some light on this and give me an idea of what steps need to be taken to read the string data?

I have looked at some code (GetAddrOfConstantCFString() from http://llvm.org/svn/llvm-project/cfe/trunk/lib/CodeGen/CodeGenModule.cpp), but again, couldn't quite relate it to what I see in the binaries.

In my case the sizes of the sections in question are:

__TEXT.__cstring (99 K-bytes)
__TEXT.__ustring (<200 bytes)
__DATA.__cfstring (29 K-bytes)

To give you an idea, the first 32 bytes of the __cfstring section, which I thought would contain the actual strings, look like this:

Dump of __DATA.__cfstring

00  00  00  00  c8  07  00  00  74  02  0d  00  15  00  00  00
00  00  00  00  c8  07  00  00  8c  02  0d  00  01  00  00  00
...

Thanks a lot for your help!

Rhinencephalon answered 10/10, 2011 at 3:29 Comment(3)
Since it's targeted at the iPhone, it might be compressed somehow. - Pisci
I think the __cfstring section would contain object data only, which means it would have pointers to the __cstring section, which then contains the raw string characters. - Diviner
ughoavgfhw, that sounds very plausible, and would explain why __cstring is larger than the other sections. The issue is that the __cstring data looks like garbage. I was trying to decode it as if it were UTF-16, but it didn't produce a single normal character string. - Rhinencephalon

Well, I've found the answer.

1) The files are generally encrypted (this can be tested with otool -l prog_file | grep -i crypt). Not all sections are encrypted, but usually the first block, including __TEXT.__text (program code) and __TEXT.__cstring, is. The __DATA.__cfstring section was not encrypted in my case.

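2) As expected, __cfstring consists of 16-byte structures (NSConstantString), each made of four 32-bit little-endian words. The 3rd word is a pointer into the memory where __TEXT.__cstring is loaded, and the 4th word is the length. In the dump from my question, the first record points at 0x000d0274 and has a length of 0x15 (21).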
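As an illustration of point 2, here is a minimal Python sketch that splits a 32-bit __cfstring section into its four-word records, using the 32 bytes from the question as input. The function names and the cstring_vmaddr parameter are my own, hypothetical choices. I am assuming cstring_vmaddr is the load address that otool -l reports for __TEXT.__cstring, and that the __cstring bytes passed in are already decrypted.

```python
import struct

def parse_cfstring_section(section_bytes):
    """Split a 32-bit __DATA.__cfstring blob into (isa, flags, data_ptr, length) tuples."""
    record = struct.Struct("<IIII")  # four little-endian 32-bit words = 16 bytes
    count = len(section_bytes) // record.size
    return [record.unpack_from(section_bytes, i * record.size) for i in range(count)]

def read_string(cstring_bytes, cstring_vmaddr, data_ptr, length):
    """Fetch the characters a record points at from a decrypted __TEXT.__cstring blob.

    cstring_vmaddr is an assumption: the address at which the section is loaded.
    """
    start = data_ptr - cstring_vmaddr
    if start < 0 or start + length > len(cstring_bytes):
        raise ValueError("pointer falls outside the __cstring section")
    return cstring_bytes[start:start + length].decode("utf-8", errors="replace")

# The first 32 bytes of __cfstring quoted in the question.
dump = bytes.fromhex(
    "00000000c807000074020d0015000000"
    "00000000c80700008c020d0001000000"
)
for isa, flags, data_ptr, length in parse_cfstring_section(dump):
    print(hex(isa), hex(flags), hex(data_ptr), length)
# 0x0 0x7c8 0xd0274 21
# 0x0 0x7c8 0xd028c 1
```

read_string() only gives readable text once the __cstring bytes are decrypted. On the raw encrypted file it returns the same garbage I was seeing.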

So in real life the trick is to decrypt the file first, and then everything is visible and accessible. I still haven't gotten around to doing it properly, but I dumped the relevant piece of memory in gdb and used it to replace the corresponding section in the file.
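To find out which byte range has to be replaced, the otool check from point 1 can also be done in code. This is only a rough sketch: it assumes a thin (non-fat), little-endian Mach-O image, and find_encryption_info is a hypothetical helper name. It walks the load commands looking for LC_ENCRYPTION_INFO and returns its cryptoff, cryptsize and cryptid fields. A non-zero cryptid means that range of the file is still encrypted.

```python
import struct

MH_MAGIC = 0xFEEDFACE         # 32-bit thin Mach-O
MH_MAGIC_64 = 0xFEEDFACF      # 64-bit thin Mach-O
LC_ENCRYPTION_INFO = 0x21
LC_ENCRYPTION_INFO_64 = 0x2C

def find_encryption_info(data):
    """Return (cryptoff, cryptsize, cryptid), or None if there is no such load command."""
    magic = struct.unpack_from("<I", data, 0)[0]
    if magic == MH_MAGIC:
        header_size = 28      # seven 32-bit fields
    elif magic == MH_MAGIC_64:
        header_size = 32      # same fields plus a reserved word
    else:
        raise ValueError("not a thin little-endian Mach-O image")
    ncmds = struct.unpack_from("<I", data, 16)[0]
    offset = header_size
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmd in (LC_ENCRYPTION_INFO, LC_ENCRYPTION_INFO_64):
            # cryptoff, cryptsize and cryptid follow cmd and cmdsize
            return struct.unpack_from("<III", data, offset + 8)
        offset += cmdsize
    return None
```

Calling it on the contents of prog_file (read in binary mode) should give the same numbers that otool -l prints. If the file is a fat binary, the slice for the target architecture would have to be extracted first, which this sketch does not handle.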

Rhinencephalon answered 24/10, 2011 at 3:54 Comment(0)
