x86-64 Questions
1
Solved
If we look at a few modern calling conventions, like x86-64 SysV style or AArch64 style (document aapcs64.pdf titled "Procedure Call Standard for the Arm® 64-bit Architecture"), we see ex...
Stentorian asked 23/10 at 10:57
1
Solved
I have populated a zmm register with an array of byte integers from 0-63. The numbers serve as indices into a matrix. Non-zero elements represent rows in the matrix that contain data. Not all rows ...
2
Solved
I'm studying x86-64 NASM, and here is the current situation:
This code is for educational purposes only; it is not meant to run on a client-facing system.
RCX holds loop count, between 1 and 1000.
At the beginnin...
Improbability asked 8/9 at 15:50
1
I was playing around with Compiler Explorer and ran into an anomaly (I think). If I want to make the compiler vectorize a sin calculation using libmvec, I would write:
#include <cmath>
#def...
Eugine asked 20/9, 2016 at 9:54
3
Solved
I want to get the address of _GLOBAL_OFFSET_TABLE_ in my program. One way is to use the nm command in Linux, maybe redirect the output to a file and parse that file to get the address of _GLOBAL_OFFSET...
2
Solved
I need to build OpenSSL on OS X for both 32-bit and 64-bit architectures. Which options do I need to pass to ./Configure so that both architectures are built into the same .a file?
4
Solved
At some point in my program I compute an integer divisor d. From that point onward d is going to be constant.
Later in the code I will divide by that d several times - performing an integer divisi...
Blaise asked 27/7, 2017 at 14:23
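The scenario in the excerpt above (a divisor fixed at run time and then reused many times) can be sketched as follows. This is a minimal illustration, not the asker's code or any accepted answer: it assumes unsigned 32-bit operands and a divisor greater than 1, which the excerpt does not specify, and it relies on the GCC/Clang `unsigned __int128` extension. The names `compute_magic_u32` and `fastdiv_u32` are hypothetical. The idea is to precompute ceil(2^64 / d) once, so that each later division becomes a multiplication plus a shift.

```c
#include <assert.h>
#include <stdint.h>

/* Precompute the "magic" multiplier M = ceil(2^64 / d).
 * Assumption: d > 1 (for d == 1 the value would not fit in 64 bits). */
static inline uint64_t compute_magic_u32(uint32_t d) {
    assert(d > 1);
    return UINT64_MAX / d + 1;
}

/* Compute n / d using only the precomputed multiplier:
 * the quotient is the high 64 bits of the 128-bit product M * n.
 * unsigned __int128 is a GCC/Clang extension, not standard C. */
static inline uint32_t fastdiv_u32(uint32_t n, uint64_t magic) {
    return (uint32_t)(((unsigned __int128)magic * n) >> 64);
}
```

The multiplier is computed once, right after d becomes known, and then passed to every later `fastdiv_u32` call in place of d. Whether this beats the hardware divide instruction depends on the CPU and on how many divisions reuse the same d.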
1
So someone on a forum asked why this C function (which I added const and restrict to, just in case):
void foo(int *const restrict dest, const int *const restrict source) {
*dest = (*source != -1) ...
Lange asked 14/6 at 13:3
7
Solved
While running a program I've written in assembly, I get an "Illegal instruction" error. Is there a way to find out which instruction is causing the error without using a debugger? I ask because the machine I'm ...
0
How can I use the LD_PRELOAD trick on Windows to circumvent MKL performance degradation on AMD CPUs?
The documentation linked here explains that the LD_PRELOAD trick can be used to force MKL to use...
Epicene asked 14/5 at 23:27
5
Solved
GCC and Clang both compile
bool pred();
void f();
void g();
void h() {
if (pred()) {
f();
} else {
g();
}
}
to some variation of
# Clang -Os output. -O3 is the same
h():
push rax
call pred...
Parhelion asked 24/4 at 18:46
2
I often forget which registers I need to use for each argument in a syscall, and every time I forget I just visit this question.
The right order for integer/pointer args to x86_64 user-space func...
Balloon asked 14/9, 2020 at 21:21
1
Solved
I've stumbled across an oddity in MSVC's codegen regarding structures that are used as return values. Consider the following code (live demo here):
struct Result
{
uint64_t value;
};
Result makeR...
Crapulous asked 28/3 at 14:56
4
I am trying to compile and link my first assembly program.
I tried to compile the following code:
; %include "stud_io.inc"
global _main
section .text
_main:
xor eax, eax
again:
; PRINT "He...
2
When compiling the code below:
global main
extern printf, scanf
section .data
msg: db "Enter a number: ",10,0
format:db "%d",0
section .bss
number resb 4
section .text
main:
mov rdi, msg
mov a...
Weidman asked 27/6, 2018 at 20:13
1
I am wondering why this code:
size_t hash_word(const char* c, size_t size) {
size_t hash = uchar(c[0]);
hash ^= uchar(c[size - 1]);
hash ^= uchar(c[size - 2]);
return hash;
}
When compiled:
m...
Enfeeble asked 5/2 at 2:21
2
I recently came across this post describing the smallest possible ELF executable for Linux; however, the post was written for 32-bit and I was unable to get the final version to compile on my machin...
Akeyla asked 19/11, 2018 at 21:3
2
Solved
From here, I learned that the support of AVX doesn't imply the support of BMI1. So how about AVX2: Do all CPUs that support AVX2 also support BMI2? Further, does the support of AVX2 imply the suppo...
1
Solved
I've tried to write a few functions to carry out matrix-vector multiplication using a single matrix together with an array of source vectors. I've written those functions once in C++ and once in x8...
Ermeena asked 21/1 at 0:13
0
I'm just starting to learn C, coming from C++, and I was trying out a ton of variable types. I am using the MinGW-w64 toolset with the GCC compiler. This version supposedly uses UCRT runtime i...
Allomerism asked 12/1 at 13:8
2
I have a constant (64-bit) address that I want to load into a register. This address is located in the code segment, so it could be addressed relative to RIP. What are the differences between
movabs...
Disinterest asked 5/1 at 15:0
1
I would like to implement the following function using SSE. It blends elements from a with packed elements from b, where elements are only present if they are used.
void packedBlend16(uint8_t mask...
0
Note: this question is about CPU instructions, not high-level languages (where you are at the mercy of the compiler)
From a popular answer:
The same floating-point operations, run on the same har...
Dunkirk asked 13/11, 2023 at 19:25
© 2022 - 2024 — McMap. All rights reserved.