How does a PE file get mapped into memory?

4

5

So I have been researching the PE format for the last couple of days, and I still have a couple of questions:

  1. Does the data section get mapped into the process' memory, or does the program read it from the disk?

  2. If it does get mapped into its memory, how can the process acquire the offset of the section (and of the other sections)?

  3. Is there any way to get the entry point of a process that has already been mapped into memory, without touching the file on disk?

Duong answered 28/1, 2014 at 11:58 Comment(0)
W
6

Does the data section get mapped into the process' memory

Yes. The mapping is unlikely to stay backed by the PE file for very long, though, since the program is apt to write to that section. A write triggers a copy-on-write page copy, after which the page is backed by the paging file instead of the PE file.

how can the process acquire the offset of the section?

The linker already calculated the offsets of variables in the section. The image might be relocated, which is common for DLLs whose preferred base address is already in use when the DLL gets loaded. In that case the loader uses the relocation table in the PE file to patch the addresses in the code. The pages that contain such patched code get the same treatment as the data section: they are no longer backed by the PE file and cannot be shared between processes.
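As a rough illustration of what "patching the addresses" means, here is a minimal sketch in Python. It assumes a 32-bit image held in a `bytearray` and a hypothetical, already-decoded list of fixup RVAs (`fixup_rvas`); the function name and the base addresses are made up for the example, and the real loader derives the fixup list from the relocation table itself.

```python
import struct

def apply_fixups(image: bytearray, fixup_rvas, preferred_base: int, actual_base: int) -> None:
    """Add the load delta to every 32-bit absolute address listed in fixup_rvas."""
    delta = actual_base - preferred_base
    for rva in fixup_rvas:
        (value,) = struct.unpack_from("<I", image, rva)
        # Wrap to 32 bits, as the patched field is a 32-bit pointer.
        struct.pack_into("<I", image, rva, (value + delta) & 0xFFFFFFFF)

# Toy image: one absolute pointer stored at RVA 0x10 that refers to RVA 0x2000,
# linked for a (hypothetical) preferred base of 0x10000000.
image = bytearray(0x20)
struct.pack_into("<I", image, 0x10, 0x10000000 + 0x2000)

# Pretend the module actually got loaded at 0x6A000000.
apply_fixups(image, [0x10], preferred_base=0x10000000, actual_base=0x6A000000)
patched = struct.unpack_from("<I", image, 0x10)[0]
print(hex(patched))  # 0x6a002000
```

If the module loads at its preferred base, the delta is zero and nothing needs patching, which is why those pages can stay shared.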

Is there any way to get the entry point of a process

The entire PE file gets mapped to memory, including its headers. So you can certainly read IMAGE_OPTIONAL_HEADER.AddressOfEntryPoint from memory without reading the file. Do keep in mind that this is painful to do for another process, since you don't have direct access to its virtual address space. You'd have to use ReadProcessMemory(), which is little joy and unlikely to be faster than reading the file, since the file is pretty likely to be present in the file system cache. The Address Space Layout Randomization feature is apt to give you a headache; it is designed to make this kind of thing hard.
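For illustration, a minimal sketch of that header walk in Python. It assumes you already hold the bytes that start at the module's base address (for another process, they would have come from something like ReadProcessMemory()). The helper name `entry_point_rva`, the synthetic header buffer and the `module_base` value are hypothetical and exist only for this example.

```python
import struct

def entry_point_rva(mapped: bytes) -> int:
    """Return AddressOfEntryPoint (an RVA) from headers mapped at a module base."""
    if mapped[:2] != b"MZ":
        raise ValueError("not a module base: missing MZ signature")
    # IMAGE_DOS_HEADER.e_lfanew lives at offset 0x3C and points at "PE\0\0".
    (e_lfanew,) = struct.unpack_from("<I", mapped, 0x3C)
    if mapped[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte IMAGE_FILE_HEADER.
    optional_header = e_lfanew + 4 + 20
    # AddressOfEntryPoint sits 16 bytes into the optional header (PE32 and PE32+).
    (rva,) = struct.unpack_from("<I", mapped, optional_header + 16)
    return rva

# Synthetic headers standing in for memory read at the module base.
headers = bytearray(0x200)
headers[0:2] = b"MZ"
struct.pack_into("<I", headers, 0x3C, 0x80)
headers[0x80:0x84] = b"PE\0\0"
struct.pack_into("<I", headers, 0x80 + 4 + 20 + 16, 0x1234)

module_base = 0x6A000000  # hypothetical address the module was loaded at
rva = entry_point_rva(bytes(headers))
entry_va = module_base + rva
print(hex(entry_va))  # 0x6a001234
```

The value in the header is relative to the base, so the address where the module actually landed has to be added to it. With ASLR that is the real load address, not the ImageBase the linker chose.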

Wampum answered 28/1, 2014 at 13:21 Comment(3)
I've tried reading the process's memory and then casting the first bytes to a PIMAGE_DOS_HEADER, but it didn't seem to work. When the file gets mapped into memory, the first bytes aren't the MZ header anymore. Any ideas on how to do it? Duong
I did warn you about ASLR. Otherwise I have nothing to look at; start another question. Wampum
Maybe you will find this question interesting. Subzero
K
1

Does the data section get mapped into the process' memory, or does the program read it from the disk?

It's mapped into the process' memory.

If it does get mapped into its memory, how can the process acquire the offset of the section (and of the other sections)?

By means of a relocation table: every reference from the executable code to a global object (data or function) that uses direct addressing has an entry in this table, so that the loader can patch the code and fix up the original offset. Note that you can make a PE file without a relocation section, in which case all data and code sections have a fixed offset and the executable has a fixed entry point.
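To make the table less abstract, here is a hedged sketch in Python that decodes the raw bytes of a base relocation directory. The directory is a sequence of blocks, each with a page RVA, a block size and a run of 16-bit entries (top 4 bits: type, low 12 bits: offset within the page). The function name `iter_fixups` and the sample block are invented for the example; the constants mirror the names used in the Windows headers.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, no fixup needed
IMAGE_REL_BASED_HIGHLOW = 3    # patch a 32-bit address
IMAGE_REL_BASED_DIR64 = 10     # patch a 64-bit address

def iter_fixups(reloc: bytes):
    """Yield (type, rva) for every non-padding entry in a relocation directory."""
    pos = 0
    while pos + 8 <= len(reloc):
        page_rva, block_size = struct.unpack_from("<II", reloc, pos)
        if block_size < 8:
            break  # malformed or terminating block
        end = min(pos + block_size, len(reloc))
        for entry_pos in range(pos + 8, end - 1, 2):
            (entry,) = struct.unpack_from("<H", reloc, entry_pos)
            kind, offset = entry >> 12, entry & 0x0FFF
            if kind != IMAGE_REL_BASED_ABSOLUTE:
                yield kind, page_rva + offset
        pos += block_size

# One block for the page at RVA 0x1000: three HIGHLOW entries plus one padding entry.
reloc = struct.pack("<IIHHHH", 0x1000, 16, 0x3010, 0x3024, 0x3FF8, 0x0000)
fixups = list(iter_fixups(reloc))
print([(kind, hex(rva)) for kind, rva in fixups])
# [(3, '0x1010'), (3, '0x1024'), (3, '0x1ff8')]
```

Each yielded RVA is a location whose stored address the loader adjusts by the difference between the actual and the preferred base.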

Is there any way to get the entry point of a process that has already been mapped into memory, without touching the file on disk?

Not sure, but if by "not touching" you mean not even reading the file, then you may figure it out by walking up the stack.

Kunming answered 28/1, 2014 at 12:19 Comment(0)
F
1
  1. Yes, all sections that are described in the PE header get mapped into memory. The IMAGE_SECTION_HEADER struct tells the loader how to map it (the section can for example be much bigger in memory than on disk).

  2. I'm not quite sure if I understand what you are asking. Do you mean how does code from the code section know where to access data in the data section? If the module loads at the preferred load address then the addresses that are generated statically by the linker are correct, otherwise the loader fixes the addresses with relocation information.

  3. Yes, the Windows loader also loads the PE header into memory at the base address of the module. There you can find all the info that was in the file's PE header, including the entry point.
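As a simplified model of point 1, the sketch below lays a file out in memory the way the section headers describe it. It is not the real loader: it ignores alignment, page protections, imports and relocations. The `Section` class, the `map_image` function and the toy file are hypothetical names and data made up for this example, with fields named after their IMAGE_SECTION_HEADER counterparts.

```python
from dataclasses import dataclass

@dataclass
class Section:
    # Subset of IMAGE_SECTION_HEADER, renamed to Python style.
    name: str
    virtual_address: int      # RVA of the section in memory
    virtual_size: int         # size in memory
    pointer_to_raw_data: int  # offset of the section in the file
    size_of_raw_data: int     # size in the file

def map_image(file_bytes: bytes, size_of_headers: int, size_of_image: int, sections) -> bytearray:
    """Copy headers and section data to their RVAs; everything else stays zero."""
    image = bytearray(size_of_image)
    hdr = file_bytes[:size_of_headers]
    image[:len(hdr)] = hdr
    for s in sections:
        # Copy at most virtual_size bytes; a larger virtual_size is zero-filled.
        n = min(s.size_of_raw_data, s.virtual_size)
        raw = file_bytes[s.pointer_to_raw_data:s.pointer_to_raw_data + n]
        image[s.virtual_address:s.virtual_address + len(raw)] = raw
    return image

# Toy file: 0x200 bytes of headers, then .text and .data with 0x200 raw bytes each.
file_bytes = b"H" * 0x200 + b"T" * 0x200 + b"D" * 0x200
sections = [
    Section(".text", 0x1000, 0x150, 0x200, 0x200),
    Section(".data", 0x2000, 0x1000, 0x400, 0x200),  # much bigger in memory than on disk
]
image = map_image(file_bytes, size_of_headers=0x200, size_of_image=0x3000, sections=sections)
print(len(image), chr(image[0x2000]), image[0x2200])  # 12288 D 0
```

The `.data` section here occupies 0x1000 bytes in memory but only 0x200 in the file; the remainder is zero-initialized, which is how uninitialized globals get their space.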

I can recommend this article for everything about the PE format, especially on relocations.

Fancie answered 28/1, 2014 at 12:33 Comment(0)
E
0

Does the data section get mapped into the process' memory, or does the program read it from the disk?

Yes. Before execution, the operating system's dynamic loader (on Windows or Linux) must map everything into memory.

If it does get mapped into its memory, how can the process acquire the offset of the section? ( And other sections )

A PE file has a well-defined structure, which the loader parses to obtain the relative virtual addresses of the sections with respect to ImageBase. Also, if ASLR (address space layout randomization) is active on the system, the loader has to use the relocation information to resolve those offsets.

Is there any way to get the entry point of a process that has already been mapped into memory, without touching the file on disk?

Nope. To calculate the OEP, the operating system's loader adds the ImageBase and AddressOfEntryPoint values from the optional header, and in some particular places, when address randomization is enabled, it uses the relocation table to resolve all addresses. So we can't do anything without parsing the PE file on disk.

Eachern answered 1/6, 2018 at 12:10 Comment(0)
