branch-prediction Questions

1

Solved

For those who have already measured this or have deep knowledge of these kinds of considerations, assume that you have to implement the following floating-point operator (picked arbitrarily for the example): ...

1

Solved

I am working on a pipeline with 6 stages: F D I X0 X1 W. I am asked how many instructions need to be killed when a branch mispredict happens. I have come up with 4. I think this because the bran...
Preoccupy asked 20/3, 2020 at 21:7

2

I am trying to understand how a branch prediction unit works in a CPU. I have used PAPI and also Linux's perf-events, but neither of them gives accurate results (for my case). This is my co...
Mickeymicki asked 17/2, 2020 at 14:51

6

Solved

Is there any portable way of doing branch prediction hints? Consider the following example: if (unlikely_condition) { /* ..A.. */ } else { /* ..B.. */ } Is this any different than doing: ...
Ostentation asked 13/9, 2010 at 17:35

8

Solved

For the Intel architectures, is there a way to instruct the GCC compiler to generate code that always forces branch prediction a particular way in my code? Does the Intel hardware even support this...
Laflamme asked 8/5, 2015 at 18:54

1

Solved

I've written this very simple Rust function: fn iterate(nums: &Box<[i32]>) -> i32 { let mut total = 0; let len = nums.len(); for i in 0..len { if nums[i] > 0 { total += nums[i...

2

Solved

Alright, so I know that if a particular conditional branch has a condition that takes time to compute (memory access, for instance), the CPU assumes a condition result and speculatively executes al...

1

Before you down-vote or start saying that gotoing is evil and obsolete, please read the justification of why it is viable in this case. Before you mark it as duplicate, please read the full questio...
Putscher asked 8/11, 2019 at 21:47

1

Solved

Some references: This is a follow-up to "Why is processing a sorted array faster than processing an unsorted array?" The only post in the r tag that I found somewhat related to branch prediction was...

3

Solved

"Branch penalty in a pipeline results from the non-zero distance between ALU and IF." What does this statement mean?
Nazler asked 2/6, 2019 at 6:12

1

Solved

I have this memchr code that I'm trying to make non-branching: .globl memchr memchr: mov %rdx, %rcx mov %sil, %al cld repne scasb lea -1(%rdi), %rax test %rcx, %rcx cmove %rcx, %rax ret ...

1

I understand there is a branch predictor in modern CPU designs trying to guess which branch to go. Assuming there is a jump instruction that will transfer control flow to either basic block A or b...
Blas asked 6/8, 2019 at 4:21

2

Solved

I have seen code like this in many answers, and the authors say this is branchless: template <typename T> inline T imax (T a, T b) { return (a > b) * a + (a <= b) * b; } But is this ...
Gerardgerardo asked 8/12, 2015 at 15:1

1

Solved

Preliminary information: according to the recent ISO C++ Committee Trip Report, the [[ likely ]] and [[ unlikely ]] attributes for conditional branching will be added in C++20 and are available in t...
Chapin asked 18/7, 2019 at 12:22

2

Solved

When considering a conditional function call in a critical section of code I found that both gcc and clang will branch around the call. For example, for the following (admittedly trivial) code: in...
Magisterial asked 25/2, 2019 at 14:37

1

Solved

How do modern CPUs like Kaby Lake handle small branches? (In the code below it is the jump to label LBB1_67.) From what I know, the branch will not be harmful because the jump is less than the 16-byte...

3

Solved

I have done some reading about Spectre v2, and obviously you get the non-technical explanations. Peter Cordes has a more in-depth explanation, but it doesn't fully address a few details. Note: I have...
Probe asked 5/2, 2019 at 18:49

2

Solved

A hardware interrupt occurs to a particular vector (not masked), CPU checks IF flag and pushes RFLAGS, CS and RIP to the stack, meanwhile there are still instructions completing in the back end, on...

1

I implemented a physics simulation in Python (most of the heavy lifting is done in numerical libraries anyway, so performance is good enough). Now that the project has grown a bit, I added extra...
Girondist asked 26/10, 2018 at 9:14

1

Solved

It's my understanding that at the beginning of a processor's pipeline, the instruction pointer (which points to the address of the next instruction to execute) is updated by the branch predictor af...

3

From here I know Intel has implemented several static branch prediction mechanisms over the years: 80486 era: Always-not-taken; Pentium 4 era: Backwards Taken/Forwards Not-Taken; Newer CPUs like Ivy Bridge...

1

Solved

I read the famous "Why is it faster to process a sorted array than an unsorted array?" and decided to play around and experiment with other languages such as Swift. I was surprised by the run time ...
Baalbeer asked 29/6, 2018 at 21:15

1

Solved

I'm trying to understand in detail what happens to instructions in the various stages of the Skylake CPU pipeline when a branch is mispredicted, and how quickly instructions from the correct branc...

3

Solved

When talking about the performance of ifs, we usually talk about how mispredictions can stall the pipeline. The recommended solutions I see are: Trust the branch predictor for conditions that usu...

2

Volume 3 of the Intel manuals contains the description of a hardware event counter: BACLEAR_FORCE_IQ Counts number of times a BACLEAR was forced by the Instruction Queue. The IQ is als...
Selfcontained asked 26/7, 2015 at 23:12
