loop-unrolling Questions

2

Solved

I'm trying to assess the performance differences between OpenCL for AMD and Nvidia GPUs. I have a kernel which performs matrix-vector multiplication. I'm running the kernel on two different systems...

3

Solved

Consider this code: #include <iostream> typedef long xint; template<int N> struct foz { template<int i=0> static void foo(xint t) { for (int j=0; j<10; ++j) { foo<i+1>...

3

I see that Duff's device is just to do loop unrolling in C. https://en.wikipedia.org/wiki/Duff%27s_device I am not sure why it is still useful nowadays. Isn't that the compiler should be smart en...
Leucocratic asked 8/2, 2019 at 17:49
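
For readers unfamiliar with the construct this question refers to, here is a minimal sketch of Duff's device in C. The function name `duff_copy` and the byte-copy use case are assumptions chosen for illustration; they do not come from the question. The `switch` jumps into the middle of an 8x unrolled `do`/`while` body so that the leftover `count % 8` elements are handled without a separate cleanup loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy `count` bytes from src to dst,
 * unrolled 8x using Duff's device. */
void duff_copy(char *dst, const char *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    if (count == 0)
        return;                       /* the do-while below needs count > 0 */

    size_t passes = (count + 7) / 8;  /* trips through the do-while body */
    switch (count % 8) {              /* enter mid-body to absorb the remainder */
    case 0: do { *dst++ = *src++;
    case 7:      *dst++ = *src++;
    case 6:      *dst++ = *src++;
    case 5:      *dst++ = *src++;
    case 4:      *dst++ = *src++;
    case 3:      *dst++ = *src++;
    case 2:      *dst++ = *src++;
    case 1:      *dst++ = *src++;
            } while (--passes > 0);
    }
}
```

Whether this beats a plain loop on a current optimizing compiler depends on the target and flags, so it is worth measuring rather than assuming.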

4

Solved

Venturing out of my usual VC++ realm into the world of GCC (via MINGW32). Trying to create a Windows PE that consists largely of NOPs, ala: for(i = 0; i < 1000; i++) { asm("nop"); } ...
Moxa asked 31/12, 2010 at 0:5

2

Solved

On g++ 4.9.2 and 5.3.1, this code takes several seconds to compile and produces a 52,776 byte executable: #include <array> #include <iostream> int main() { constexpr std::size_t size...
Arawakan asked 16/5, 2016 at 17:49

2

Solved

I just read the Java Magazine article Loop Unrolling. There the authors demonstrate that simple for loops with an int counter are compiled with loop unrolling optimization: private long intStride1(...
Bulgarian asked 18/3, 2021 at 22:21

2

Solved

Doing a small check, it looks like neither V8 nor spidermonkey unroll loops, even if it is completely obvious, how long they are (literal as condition, declared locally): const f = () => { ...
Uhlan asked 6/2, 2021 at 20:59

5

Consider the following simple example: struct __attribute__ ((__packed__)) { int code[1]; int place_holder[100]; } s; void test(int n) { int i; for (i = 0; i < n; i++) { s.code[i] = 1; }...
Levins asked 2/7, 2020 at 8:54

2

I am trying to selectively unroll the second loop in the following program: #include <stdio.h> int main() { int in[1000], out[1000]; int i,j; #pragma nounroll for (i = 100; i < 100...
Editheditha asked 5/12, 2014 at 6:38

1

Solved

I want to use the std::array to store the data of N-dimensional vectors and implement arithmetic operations for such vectors. I figured, since the std::array now has a constexpr size() member funct...
Poirer asked 11/1, 2019 at 14:26

2

Solved

First: I know what loop optimization is and how it works yet I found a case where I cannot explain the results. I created a prime number checker that calls modulo on each number from 2 to n - 1, s...
Xyster asked 19/10, 2018 at 12:16

2

Solved

What is the loop unrolling policy for JIT? Or if there is no simple answer to that, then is there some way I can check where/when loop unrolling is being performed in a loop? GNode child = null; ...
Toliver asked 30/8, 2011 at 12:35

1

Solved

I need to force the Metal compiler to unroll a loop in my kernel compute function. So far I've tried to put #pragma unroll(num_times) before a for loop, but the compiler ignores that statement. It...
Antons asked 20/12, 2016 at 19:22

1

Solved

This question is in part a follow up question to GCC 5.1 Loop unrolling. According to the GCC documentation, and as stated in my answer to the above question, flags such as -funroll-loops turn on ...
Vachil asked 13/9, 2016 at 20:4

1

Solved

Given the following code #include <stdio.h> int main(int argc, char **argv) { int k = 0; for( k = 0; k < 20; ++k ) { printf( "%d\n", k ) ; } } Using GCC 5.1 or later with -x c ...
Tycoon asked 22/6, 2016 at 12:0

2

Solved

I have this kind of Duff's device in C and it works fine (format text as money): #include <stdio.h> #include <string.h> char *money(const char *src, char *dst) { const char *p = src;...
Vanden asked 3/5, 2016 at 22:44

2

Solved

Is there a way to instruct GCC (I'm using 4.8.4) to unroll the while loop in the bottom function completely, i.e., peel this loop? The number of iterations of the loop is known at compilation time:...
Frescobaldi asked 20/3, 2016 at 5:36

3

I wanted to benchmark the difference in execution speed between an unrolled loop and a for loop applied on a triangle object. The entire example is available here. Here is the complete code: #in...
Goldsworthy asked 28/11, 2014 at 16:23

1

For the following loop GCC will only vectorize the loop if I tell it to use associative math e.g. with -Ofast. float sumf(float *x) { x = (float*)__builtin_assume_aligned(x, 64); float sum = 0; ...
Essentiality asked 9/10, 2015 at 12:38

4

Solved

I was reading the optimization options for GCC when I found the option -funroll-all-loops. Its description reads: Unroll all loops, even if their number of iterations is uncertain when the loo...
Should asked 1/7, 2015 at 16:37

2

Solved

The introductory links I found while searching: 6.59.14 Loop-Specific Pragmas 2.100 Pragma Loop_Optimize How to give hint to gcc about loop count Tell gcc to specifically unroll a loop How to For...
Guideboard asked 27/3, 2015 at 3:29

5

Solved

I am currently working on a project, where every cycle counts. While profiling my application I discovered that the overhead of some inner loop is quite high, because they consist of just a few mac...
Benjaminbenji asked 30/1, 2015 at 8:14

4

Solved

Consider the following code: vector<double> v; // fill v const vector<double>::iterator end = v.end(); for(vector<double>::iterator i = v.begin(); i != end; ++i) { // do stuff ...
Sheugh asked 17/7, 2012 at 18:54

2

Solved

How do I convince GCC to unroll a loop where the number of iterations is known, but large? I'm compiling with -O3. The real code in question is more complex, of course, but here's a boiled-down e...
Bah asked 16/9, 2014 at 21:35
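
As a hedged illustration of one way to request unrolling for a loop with a known trip count, the sketch below uses `#pragma GCC unroll n`, which is available in GCC 8 and later. The function `sum64`, the array length 64, and the factor 8 are assumptions made for the example; the asker's actual code is truncated above and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum a fixed-size array of 64 ints. */
int sum64(const int *a)
{
    assert(a != NULL);
    int s = 0;
    /* Ask GCC (8 or later) to unroll the next loop by a factor of 8.
     * Compilers that do not know this pragma ignore it, possibly
     * with a warning. */
    #pragma GCC unroll 8
    for (int i = 0; i < 64; ++i)
        s += a[i];
    return s;
}
```

The pragma is a request scoped to a single loop, unlike `-funroll-loops`, which applies to the whole translation unit. Inspecting the generated assembly (for example with `-S`) is the only reliable way to confirm what the compiler did.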

1

Solved

I'm new to CUDA, and I can't understand loop unrolling. I've written a piece of code to understand the technique __global__ void kernel(float *b, int size) { int tid = blockDim.x * blockIdx.x + t...
Arbour asked 9/3, 2014 at 5:5
