Is the address checked by the memory alignment check mechanism an effective address, a linear address or a physical address?

I am studying the alignment check mechanism, but I don't know whether the processor checks effective addresses, linear addresses, physical addresses, or all of them.

For example, suppose the effective address of a data item is aligned, but the linear address formed by adding the base address from the segment descriptor is no longer aligned. Does the processor raise an #AC exception in this case?

Narrows answered 16/6, 2021 at 10:38 Comment(8)
Good question. Pages are aligned, so there is no difference between linear and physical, but segment bases are byte granular, although they are recommended to be aligned. The manual doesn't say. Maybe do a test :) - Burks
@Burks Thank you for your answer. My current guess is that the processor does not check the alignment of the effective address, because the compiler can handle that alignment very well. And as you said, for today's OSes there is no difference between linear and physical addresses here: virtual memory is mapped to physical memory at page granularity, so if a linear (virtual) address is aligned, the physical address must be aligned too. In summary, I think the alignment check mechanism is used to enforce the alignment of linear addresses. - Narrows
@Burks Wasn't it possible to set up byte-sized pages with some flag? I kinda forgot about all these details. - Cockerel
@Cockerel I don't see a flag that can control the page size. - Narrows
@fuz: Page sizes no; probably you're thinking of segment limits, which can be scaled by 4k or by 1. wiki.osdev.org/Global_Descriptor_Table - Pathogenesis
It seems #AC was meant to trap unnecessarily slow memory accesses so that programmers can fix their software to be more bus friendly. It wouldn't make sense for anything but the physical address to be relevant for that check. - Bench
@MichaelKarcher That seems to be the only practical answer, but I can't find any wording in the CPU reference manual that actually says so. - Shroud
@RaymondChen On the contrary: the CPU reference manual suggests that #AC is meant to trap mis-tagged pointers. This makes the most sense if it is applied to the offset inside the segment. I'm likely going to test it on real hardware some time soon. - Bench

TL;DR

I think it's the linear address.

The test result is A B B A C B (by row)

Keep reading for the test methodology and the test code.


It's not the effective address (aka the offset)

To test this it suffices to use a segment with a base that is not aligned.
In my test, I've used a 32-bit data segment with a base of 1.

The test is a "simple" legacy (i.e. non-UEFI) bootloader that will create said descriptor and test accessing the offsets 0x7000 and 0x7003 with DWORD width.
The former will generate an #AC, the latter won't.

This demonstrates that it's not the offset alone that is checked, because 0x7000 is an aligned offset that still faults with a base of 1.

This is expected.

I have a tradition of using a minimal output for the tests, so an explanation is mandatory.

First, six blue As are written in six consecutive rows in the VGA buffer.
Then before executing a load, a pointer is set to each of these As.
The #AC handler will increment the pointed-to byte.
So, if a row contains a B, the access generated an #AC.
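
The marker scheme can be summarised in a few lines of C. This is only a sketch of the bookkeeping, not part of the test itself: the function names are made up for illustration, and the "+2" case belongs to the #PF handler that is added further down.

```c
#include <stdint.h>

/* VGA text mode: 80 columns, two bytes per cell (character, then attribute). */
enum { VGA_COLS = 80, CELL_BYTES = 2, FIRST_MARKER_ROW = 3 };

/* Byte offset, from the start of the text buffer, of marker n (0..5). */
uint32_t marker_offset(unsigned n) {
    return (FIRST_MARKER_ROW + n) * VGA_COLS * CELL_BYTES;
}

/* Character part of a 16-bit cell such as 0941h ('A' with attribute 09h). */
char cell_char(uint16_t cell) {
    return (char)(cell & 0xff);
}

/* Marker shown after the handlers ran: each #AC adds 1, each #PF adds 2. */
char marker_after(int ac_count, int pf_count) {
    return (char)('A' + ac_count + 2 * pf_count);
}
```

With one fault per load at most, a row can only end up as A (no fault), B (#AC) or C (#PF).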

The first four rows are used for:

  1. Access using a segment with base 0 and offset 0x7000. As expected, no #AC.
  2. Access using a segment with base 0 and offset 0x7001. As expected, #AC.
  3. Access using a segment with base 1 and offset 0x7000. This does generate an #AC, thereby demonstrating that it's either the linear or the physical address that's checked.
  4. Access using a segment with base 1 and offset 0x7003. This doesn't generate an #AC, confirming point 3.
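
To make the reasoning explicit, here is a small C model of the two competing hypotheses for the first four rows. It is a sketch, not what the hardware literally does; the function names are invented for illustration, and the base/offset pairs are the ones used by the test code.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if an access of `size` bytes (a power of two) at `addr` is misaligned. */
bool misaligned(uint32_t addr, uint32_t size) {
    return (addr & (size - 1)) != 0;
}

/* Hypothesis 1: only the offset inside the segment is checked. */
bool faults_if_offset_checked(uint32_t base, uint32_t offset, uint32_t size) {
    (void)base; /* the segment base plays no role in this hypothesis */
    return misaligned(offset, size);
}

/* Hypothesis 2: the linear address (segment base + offset) is checked. */
bool faults_if_linear_checked(uint32_t base, uint32_t offset, uint32_t size) {
    return misaligned(base + offset, size);
}
```

For the DWORD loads at (base 0, 7000h), (base 0, 7001h), (base 1, 7000h) and (base 1, 7003h), hypothesis 1 predicts A B A B while hypothesis 2 predicts A B B A. The observed rows are A B B A.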

The next two rows are used to check the linear address vs the physical address.

It's not the physical address: #AC instead of #PF

The #AC check only tests alignments up to 16 bytes, but a linear and a physical address share the same alignment up to at least 4KiB.
We would need a memory access that requires a data structure aligned on, at least, 8KiB to test if it's the physical or the linear address that's used for the check.

Unfortunately, there is no such access (yet).
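
The following C sketch illustrates why 4KiB paging cannot separate the two cases. It assumes plain 4KiB pages and uses hypothetical helper names; the example mapping (linear 9000h to physical 0b8000h) is the one the test code below sets up for the VGA buffer.

```c
#include <stdint.h>

#define PAGE_BYTES 4096u

/* Physical address of `linear`, given the 4KiB-aligned frame its page maps to. */
uint32_t translate(uint32_t frame_base, uint32_t linear) {
    return frame_base | (linear & (PAGE_BYTES - 1));
}

/* Largest power-of-two alignment of `addr`, capped at `cap` (a power of two). */
uint32_t alignment_up_to(uint32_t addr, uint32_t cap) {
    uint32_t a = 1;
    while (a < cap && (addr & a) == 0)
        a <<= 1;
    return a;
}
```

Translation copies the low 12 bits unchanged, so any alignment of 4KiB or less is the same before and after. Only above that can they differ: linear 9000h is 4KiB-aligned but not 8KiB-aligned, while the frame 0b8000h it maps to is 8KiB-aligned.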

I thought I could still gather some insight by checking which exception is generated when a misaligned load targets an unmapped page.
If a #PF is generated, the CPU first translates the linear address and then checks the alignment. Conversely, if an #AC is generated, the CPU checks before translating (remember that the page is not mapped).

I modified the test to enable paging, map the minimum number of pages, and handle a #PF by incrementing the byte under the pointer by two.

When a load is executed, the corresponding A will either become a B if an #AC is generated or a C if a #PF is generated.
Note that both are faults (eip on the stack points to the offending instruction) but both handlers resume from the next instruction (so each load is executed only once).
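
Resuming from the next instruction means the handlers must know the length of the faulting load; in the test code below they simply add 6 to the saved EIP. The C sketch below shows one encoding that gives 6 bytes: a segment-override prefix, the A1h opcode (MOV EAX, moffs32) and a 32-bit offset. That the assembler picks exactly this form, and emits the prefix even for an explicit DS override, is an assumption here; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    PREFIX_ES = 0x26,              /* ES segment-override prefix */
    PREFIX_DS = 0x3e,              /* DS segment-override prefix */
    OPCODE_MOV_EAX_MOFFS32 = 0xa1  /* MOV EAX, moffs32 (32-bit code) */
};

/* Writes prefix + opcode + little-endian offset into out; returns the length. */
size_t encode_mov_eax_moffs32(uint8_t seg_prefix, uint32_t offset, uint8_t out[6]) {
    size_t n = 0;
    out[n++] = seg_prefix;
    out[n++] = OPCODE_MOV_EAX_MOFFS32;
    for (int i = 0; i < 4; i++)
        out[n++] = (uint8_t)(offset >> (8 * i));
    return n;
}
```

If a test load were encoded differently (for example without the prefix), the constant 6 in the handlers would have to change accordingly.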

This is the meaning of the last two rows:

  5. Access to an unmapped page using a segment with base 1 and offset 0x8003. This generates a #PF as expected (the access is aligned so the only exception possible here is a #PF).
  6. Access to an unmapped page using a segment with base 1 and offset 0x8000. This generates an #AC, therefore the CPU checks the alignment before attempting to translate the address.

Point 6 suggests that the CPU performs the check on the linear address, since no access to the page table is done.
In point 6 both exceptions could be generated; the fact that #PF is not generated means that the CPU hasn't attempted to translate the address when the alignment check is performed. (Or that #AC logically takes precedence. But the hardware likely wouldn't do a page walk before taking the #AC exception, even if it did probe the TLB after doing the base+offset calculation.)
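
The two possible orderings can be written down as a toy C model. It is only a sketch with invented names; it cannot distinguish "check before translating" from "#AC simply has priority", which is the caveat mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Outcomes use the marker letters from the test output. */
typedef enum { NO_FAULT = 'A', FAULT_AC = 'B', FAULT_PF = 'C' } outcome_t;

/* Model 1: the linear address is checked for alignment before translation. */
outcome_t check_then_translate(uint32_t base, uint32_t offset,
                               uint32_t size, bool page_present) {
    uint32_t linear = base + offset;
    if (linear & (size - 1)) return FAULT_AC;
    if (!page_present)       return FAULT_PF;
    return NO_FAULT;
}

/* Model 2: the address is translated first, then checked for alignment. */
outcome_t translate_then_check(uint32_t base, uint32_t offset,
                               uint32_t size, bool page_present) {
    uint32_t linear = base + offset;
    if (!page_present)       return FAULT_PF;
    if (linear & (size - 1)) return FAULT_AC;
    return NO_FAULT;
}
```

For the last two rows (base 1, offsets 8003h and 8000h, page not present) model 1 predicts C B and model 2 predicts C C. The test shows C B.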

Test code

The code is messy and more cumbersome than one may expect.
The main hindrance is that #AC only works at CPL=3.
So we need to create the CPL=3 descriptors, plus a TSS segment and a TSS descriptor.
To handle the exception we need an IDT and we also need paging.
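
As a quick reminder of why the setup is needed, the conditions for alignment checking can be expressed as a predicate. This is a simplified sketch (it ignores everything except the two control bits and the privilege level), and the function name is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_AM    (1u << 18)  /* Alignment Mask bit in CR0 */
#define EFLAGS_AC (1u << 18)  /* Alignment Check flag in EFLAGS */

/* #AC is only raised when CR0.AM and EFLAGS.AC are set and the code runs at CPL 3. */
bool alignment_check_active(uint32_t cr0, uint32_t eflags, unsigned cpl) {
    return (cr0 & CR0_AM) != 0 && (eflags & EFLAGS_AC) != 0 && cpl == 3;
}
```

The listing below sets both bits while still at CPL=0 and then drops to CPL=3 before doing any of the test loads.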

BITS 16
ORG 7c00h

  ;Skip the BPB (my BIOS actively overwrites it)
  jmp SHORT __SKIP_BPB__

  ;I eyeballed the BPB size (at least the part that may be overwritten)
  TIMES 40h db 0

__SKIP_BPB__:
  ;Set up the segments (including CS); ES is needed later by rep stosd
  xor ax, ax
  mov ds, ax
  mov es, ax
  mov ss, ax
  xor sp, sp
  jmp 0:__START__

__START__:
  ;Clear and set the video mode (before we switch to PM)
  mov ax, 03h
  int 10h
  
  ;Disable the interrupts and load the GDT and IDT
  cli
  lgdt [GDT]
  lidt [IDT]
  
  ;Enable PM
  mov eax, cr0
  or al, 1
  mov cr0, eax
  

  ;Write a TSS segment: zero 104h DWORDs and then set only the SS0:ESP0 fields
  ;(EAX still holds CR0 here, so the whole register must be cleared for stosd)
  cld
  mov di, 7000h
  mov cx, 104h
  xor eax, eax
  rep stosd
  
  mov DWORD [7004h], 7c00h    ;ESP0
  mov WORD [7008h], 10h       ;SS0
  
  
  ;Set AC in EFLAGS
  pushfd
  or DWORD [esp], 1 << 18 
  popfd
  
  ;Set AM in CR0
  mov eax, cr0
  or eax, 1<<18
  mov cr0, eax

  ;OK, let's go in PM for real
  jmp 08h:__32__
  
__32__:
  BITS 32

  ;Set the stack and DS
  mov ax, 10h 
  mov ss, ax 
  mov esp, 7c00h
  mov ds, ax
  
  ;Set the #AC handler
  mov DWORD [IDT+8+17*8], ((AC_handler-$$+7c00h) & 0ffffh) | 00080000h
  mov DWORD [IDT+8+17*8+4], 8e00h | (((AC_handler-$$+7c00h) >> 16) << 16)
  ;Set the #PF handler
  mov DWORD [IDT+8+14*8], ((PF_handler-$$+7c00h) & 0ffffh) | 00080000h
  mov DWORD [IDT+8+14*8+4], 8e00h | (((PF_handler-$$+7c00h) >> 16) << 16)

  ;Set the TSS
  mov ax, 30h
  ltr ax

  ;Paging is:
  ;7xxx -> Identity mapped (contains code and all the stacks and system structures)
  ;8xxx -> Not present
  ;9xxx -> Mapped to the VGA text buffer (0b8xxxh)
  ;Note that the paging structures are at 6000h and 5000h, this is OK as these are physical addresses

  ;Set the Page Directory at 6000h
  mov eax, 6000h
  mov cr3, eax
  ;Set the Page Directory Entry 0 (for 00000000h-003fffffh) to point to a Page Table at 5000h 
  mov DWORD [eax], 5007h
  ;Set the Page Table Entry 7 (for 00007xxxh) to identity map and Page Table Entry 8 (for 00008xxxh) to be not present
  mov eax, 5000h + 7*4
  mov DWORD [eax], 7007h
  mov DWORD [eax+4], 8006h
  ;Map page 9000h to 0b8000h
  mov DWORD [eax+8],  0b801fh

  ;Enable paging
  mov eax, cr0 
  or eax, 80000000h
  mov cr0, eax

  ;Change privilege (goto CPL=3)
  push DWORD 23h            ;SS3
  push DWORD 07a00h         ;ESP3
  push DWORD 1bh            ;CS3
  push DWORD __32user__     ;EIP3
  retf 

__32user__:

  ; 
  ;Here we are at CPL=3
  ;

  ;Set DS to segment with base 0 and ES to one with base 1
  mov ax, 23h
  mov ds, ax
  mov ax, 2bh
  mov es, ax

  ;Write six As in six consecutive rows (starting from the 4th)
  mov ecx, 6
  mov ebx, 9000h + 80*2*3   ;Points to 4th row in the VGA text framebuffer
.init_markers:
  mov WORD [ebx], 0941h
  add bx, 80*2
  dec ecx 
  jnz .init_markers

  ;ebx points to the first A
  sub ebx, 80*2 * 6

  ;Base 0 + Offset 0 = 0, Should not fault (marker stays A)
  mov eax, DWORD [ds:7000h]

  ;Base 0 + Offset 1 = 1, Should fault (marker becomes B)
  add bx, 80*2
  mov eax, DWORD [ds:7001h]

  ;Base 1 + Offset 0 = 1, Should fault (marker becomes B)
  add bx, 80*2
  mov eax, DWORD [es:7000h]

  ;Base 1 + Offset 3 = 4, Should not fault (marker stays A)
  add bx, 80*2
  mov eax, DWORD [es:7003h]

  ;Base 1 + Offset 3 = 4 but page not mapped, Should #PF (marker becomes C)
  add bx, 80*2
  mov eax, DWORD [es:8003h]

  ;Base 1 + Offset 0 = 1 but page not mapped, if #PF the marker becomes C, if #AC the marker becomes B
  add bx, 80*2
  mov eax, DWORD [es:8000h]

  ;Loop forever (cannot use HLT at CPL=3)
  jmp $
  

;#PF handler
;Increment the byte pointed by ebx by two
PF_handler:
  add esp, 04h        ;Remove the error code
  add DWORD [esp], 6  ;Skip the current instruction
  add BYTE [ebx], 2   ;Increment

  iret 

;#AC handler
;Same as the #PF handler but increment by one
AC_handler:
  add esp, 04h
  add DWORD [esp], 6
  inc BYTE [ebx]

  iret
  

  ;The GDT (entry 0 is used as the content for GDTR)
  GDT dw GDT.end-GDT - 1
      dd GDT
      dw 0
      
      dd 0000ffffh, 00cf9a00h   ;08 Code, 32, DPL 0
      dd 0000ffffh, 00cf9200h       ;10 Data, 32, DPL 0
      
      dd 0000ffffh, 00cffa00h       ;18 Code, 32, DPL 3
      dd 0000ffffh, 00cff200h       ;20 Data, 32, DPL 3
      dd 0001ffffh, 00cff200h       ;28 Data, 32, DPL 3, Base = 1

      dd 7000ffffh, 00cf8900h       ;30 TSS (32-bit, available), DPL 0, Base = 7000h

      .end: 

  ;The IDT, to save space the entries are set dynamically      
  IDT dw 18*8-1
      dd IDT+8
      dw 0
      

  ;Signature
  TIMES 510-($-$$) db 0
  dw 0aa55h
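
The GDT entries in the listing above are written as raw DWORD pairs, which makes the base of 1 easy to miss. The following C sketch decodes the relevant fields using the standard 32-bit segment descriptor layout; the helper names are hypothetical and not part of the test.

```c
#include <stdint.h>

/* Segment base, scattered over bits 16-31 of the low DWORD and
   bits 0-7 and 24-31 of the high DWORD. */
uint32_t desc_base(uint32_t lo, uint32_t hi) {
    return (lo >> 16) | ((hi & 0xffu) << 16) | (hi & 0xff000000u);
}

/* Descriptor privilege level: bits 13-14 of the high DWORD. */
unsigned desc_dpl(uint32_t hi) {
    return (hi >> 13) & 3u;
}

/* Type field: bits 8-11 of the high DWORD (9 = available 32-bit TSS). */
unsigned desc_type(uint32_t hi) {
    return (hi >> 8) & 0xfu;
}

/* A selector holds the table index in bits 3-15 and the RPL in bits 0-1. */
unsigned selector_index(uint16_t sel) { return sel >> 3; }
unsigned selector_rpl(uint16_t sel)   { return sel & 3u; }
```

For the entry at 28h (`0001ffffh, 00cff200h`) this gives base 1 and DPL 3, and the selector 2bh loaded into ES refers to that entry (index 5) with RPL 3.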

Does it make sense to check the linear address?

I don't think it's particularly relevant. As noted above, a linear and a physical address share the same alignment up to 4KiB.
So, for now, it doesn't matter at all.
Right now, accesses wider than 64 bytes would still need to be performed in chunks and this limit is set deep in the microarchitectures of the x86 CPUs.

Reubenreuchlin answered 17/6, 2021 at 15:38 Comment(5)
"In this case, the CPU only translates the address of the first byte and then assumes the structure is continuous in physical memory." - That description would (wrongly) imply you could only use FXSAVE if your virtual pages map to contiguous physical pages, or you're using a large/hugepage. I think it's more likely that the MS-ROM does a sequence of normal 32-byte or 64-byte store uops, similar to what rep stos would do (but of course taking data from SIMD/FPU state, not AL). So each store uop would do its own TLB access. Or when you say "assume", did you just mean for #AC purposes? - Pathogenesis
@PeterCordes I think I am. I don't remember the exact structure (maybe it was not the FXSAVE area) but there was a case where the CPU would translate the starting address and then use it as a base. Later I'll see if I can find what exactly this case is (or if I remember wrong). Or maybe it was something about a structure spanning two pages? I think it was this latter case. - Reubenreuchlin
@PeterCordes I couldn't find what I was looking for. I removed that paragraph to avoid writing nonsense :) - Reubenreuchlin
Sounds plausible for something (especially something involved in paging or virtualization), but not for fxsave m512byte (or later xsave / xsaveopt). They're unprivileged instructions, and contiguous physical memory would potentially let user-space corrupt / overwrite memory that wasn't mapped to it, by starting one near the end of a page, e.g. a single page you got from mmap. So that pretty much rules that out. - Pathogenesis
Thank you very much for your answer, I fully understand it! - Narrows