Why is LLVM AliasAnalysis unable to precisely decide alias relationships between pointers that are dereferenced twice?


Asked 23/5, 2022 at 21:56 Answered 23/5, 2022 at 21:56

I'm very confused by the LLVM AliasAnalysis implementation. Say I have this program:

int* key = malloc(4);
*key = 10;
*key = 11;

It gets transformed to IR code like this:

  %3 = call noalias i8* @malloc(i64 4) #2
  %4 = bitcast i8* %3 to i32*
  store i32* %4, i32** %2, align 8
  %5 = load i32*, i32** %2, align 8
  store i32 10, i32* %5, align 4
  %6 = load i32*, i32** %2, align 8
  store i32 11, i32* %6, align 4

Then I ask LLVM to print the alias relationship between %5 and %6 by calling static_cast<uint16_t>(AA_->getModRefInfo(FirstStore, MemoryLocation(SecondStorePointer))). The result shows that they may alias each other (reported as ModRefInfo::Mod). Why is LLVM unable to detect that they must alias each other? Is there any way I can fix it?

Interplay answered 23/5, 2022 at 21:56 Comment(6)

Just some guesses ... What [ultimately] matters is optimization/code generation. Is mayalias sufficient in code generation to elide the *key = 10;? That is, during a later stage, mayalias gets "promoted" based on a more "global" analysis. What happens if you do: volatile int *key = malloc(sizeof(*key));? It probably should not alias. OR int foo; int *key = &foo; OR int foo; int * const key = &foo; OR int * const key = malloc(sizeof(*key)); ??? – Edema 24/5, 2022 at 2:6
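The variants suggested in this comment can be written out as small, separately compilable functions, which makes it easier to compare the IR each one produces (for example with clang -O0 -S -emit-llvm). This is only a sketch: the function names are hypothetical, and the NULL checks, the final read of *key, and the free calls are additions so that each function is complete and returns something observable.

```c
#include <stdlib.h>

/* Original shape from the question: heap pointer, two stores. */
int variant_malloc(void) {
    int *key = malloc(sizeof(*key));
    if (key == NULL) return -1;
    *key = 10;
    *key = 11;
    int result = *key;
    free(key);
    return result;
}

/* volatile-qualified pointee: both stores are volatile accesses. */
int variant_volatile(void) {
    volatile int *key = malloc(sizeof(*key));
    if (key == NULL) return -1;
    *key = 10;
    *key = 11;
    int result = *key;
    free((void *)key); /* cast drops the volatile qualifier for free() */
    return result;
}

/* Pointer to a local object instead of heap memory. */
int variant_local(void) {
    int foo;
    int *key = &foo;
    *key = 10;
    *key = 11;
    return *key;
}

/* const pointer to a local object: key itself can never be reassigned. */
int variant_const_local(void) {
    int foo;
    int *const key = &foo;
    *key = 10;
    *key = 11;
    return *key;
}

/* const pointer to heap memory. */
int variant_const_malloc(void) {
    int *const key = malloc(sizeof(*key));
    if (key == NULL) return -1;
    *key = 10;
    *key = 11;
    int result = *key;
    free(key);
    return result;
}
```

Every function returns 11 when the allocation succeeds; the interesting difference is in the emitted IR, not in the runtime behaviour.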

As a really wild guess, I wonder if the concern is that key might point to itself, in which case %5 and %6 really would be different. That should be excluded both by the strict aliasing rule and by the fact that malloc cannot return a pointer to any existing object, but maybe for some reason it isn't being picked up? – Languish 24/5, 2022 at 4:48
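To make the "points to itself" scenario concrete, here is a minimal sketch in which a pointer slot holds its own address, so a store through the loaded pointer changes what the next load of the slot returns. It uses void * rather than int * so that the accesses are well-defined C; the names slot, other and reload_differs are hypothetical. It only illustrates the general hazard the comment describes and is not a claim about what LLVM actually concludes for the IR in the question.

```c
#include <stddef.h>

/* Returns 1 if two reads of the same slot yield different pointers. */
int reload_differs(void) {
    void *other = NULL;
    void *slot;
    slot = &slot;                   /* the slot holds its own address */

    void **first = (void **)slot;   /* first read of the slot (like %5) */
    *first = &other;                /* store through it overwrites the slot */
    void **second = (void **)slot;  /* second read of the slot (like %6) */

    return first != second;
}
```

Here first equals &slot and second equals &other, so the function returns 1: the two reads of the same location are not interchangeable once a store through the first result can reach the slot.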

They may alias because another thread may change the memory location before the last load. – Tetroxide 24/5, 2022 at 7:34

@NateEldredge: LLVM rides a weird line between being a hosted and freestanding implementation, since it doesn't come with its own implementation of malloc(). Further, even if LLVM were to add code to recognize that in this particular scenario it would not be possible for key to alias itself, it would be hard to make such code be broadly applicable while ensuring that it was never applied in any circumstance where a pointer might alias its target. – Essive 24/5, 2022 at 23:20

@arnt: C compilers are not required to allow for such possibilities with objects that aren't qualified volatile. – Essive 24/5, 2022 at 23:22

That LLVM analysis isn't allowed to assume that the source language is C, anyway. It has to use IR's volatility rules and can't attach any particular meaning to the function name malloc. – Tetroxide 25/5, 2022 at 7:39
