NUMA Questions

5

I am working on a Java application for solving a class of numerical optimization problems - large-scale linear programming problems to be more precise. A single problem can be split up into smaller...
Retrospection asked 14/11, 2019 at 20:29

1

Solved

In the technical overview published by Intel, "Sub-NUMA Clustering" and "Hemisphere and Quadrant Modes" are described separately. But the main difference between them is not cle...
Sciamachy asked 28/4, 2023 at 8:51

1

Solved

This question is a spin-off of the one posted here: Measuring bandwidth on a ccNUMA system. I've written a micro-benchmark for the memory bandwidth on a ccNUMA system with 2x Intel(R) Xeon(R) Platin...

1

Solved

I'm attempting to benchmark the memory bandwidth on a ccNUMA system with 2x Intel(R) Xeon(R) Platinum 8168: 24 cores @ 2.70 GHz, L1 cache 32 kB, L2 cache 1 MB and L3 cache 33 MB. As a reference, ...
Seeder asked 10/5, 2022 at 7:55

3

Solved

I'm attempting to create a std::vector<std::set<int>> with one set for each NUMA-node, containing the thread-ids obtained using omp_get_thread_num(). Topo: Idea: Create data which is ...
Interested asked 3/3, 2022 at 16:50

4

Solved

I've set up my code to carefully load and process data locally on my NUMA system. I think. That is, for debugging purposes I'd really like to be able to use the pointer addresses being accessed ins...
Endothelioma asked 2/11, 2011 at 20:27

1

Solved

I'm building a topological tree of sockets, NUMA nodes, caches, cores, and threads for any Intel or AMD system in C. Building this hierarchy, I want to ensure hardware threads are grouped together ...
Glomerulonephritis asked 1/9, 2021 at 21:19

3

I have an Intel Xeon Phi 64-core CPU with 16GB on-chip memory set as NUMA node 1. I want to bind a process running inside a Docker container to this NUMA node, but it errors out: root@Docker$ sudo...
Perth asked 6/4, 2017 at 23:44

3

Linux can have both standard 4KiB page memory and 1GiB (huge) paged memory (and 2MiB pages, but I don't know if anyone uses that). Is there a standard call to get the page size from an arbitrary vi...
Pampa asked 5/4, 2021 at 6:52
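
One possible way to approach the page-size question above on Linux is to read `/proc/<pid>/smaps`, which reports a `KernelPageSize:` field for each mapping. The sketch below is only an illustration under that assumption; the helper names `kernel_page_size_kb` and `page_size_of` are hypothetical and do not come from the question or any library.

```python
import re

# Header line of a mapping in smaps: "start-end perms offset dev inode [path]"
_MAPPING_RE = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s")


def kernel_page_size_kb(smaps_text, address):
    """Return KernelPageSize (in kB) of the mapping containing address, or None."""
    in_target = False
    for line in smaps_text.splitlines():
        match = _MAPPING_RE.match(line)
        if match:
            # A new mapping starts here; check whether it covers the address.
            start, end = (int(group, 16) for group in match.groups())
            in_target = start <= address < end
        elif in_target and line.startswith("KernelPageSize:"):
            # Field format assumed to be "KernelPageSize:   <n> kB".
            return int(line.split()[1])
    return None


def page_size_of(address, pid="self"):
    """Look up address in the live smaps file of a process (Linux only)."""
    with open(f"/proc/{pid}/smaps") as smaps:
        return kernel_page_size_kb(smaps.read(), address)
```

Parsing is kept separate from file access so the lookup can be tried on captured smaps text from another machine.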

5

Solved

Coming from Java garbage collection, I came across JVM settings for NUMA. Out of curiosity, I wanted to check whether my CentOS server has NUMA capabilities. Is there a *ix command or utility that could...
Babi asked 20/6, 2012 at 18:42
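
Besides command-line tools such as `numactl --hardware` or `lscpu`, the question above can also be checked programmatically: Linux exposes one `nodeN` directory per NUMA node under `/sys/devices/system/node`. The following is a minimal sketch assuming that sysfs layout; the function name `numa_nodes` is hypothetical.

```python
import os
import re


def numa_nodes(sysfs_node_dir="/sys/devices/system/node"):
    """Return the sorted NUMA node ids the kernel exposes, or [] if none are visible."""
    try:
        entries = os.listdir(sysfs_node_dir)
    except OSError:
        # Directory missing or unreadable: no NUMA information available.
        return []
    ids = []
    for name in entries:
        match = re.fullmatch(r"node(\d+)", name)
        if match:
            ids.append(int(match.group(1)))
    return sorted(ids)


if __name__ == "__main__":
    nodes = numa_nodes()
    print(f"{len(nodes)} NUMA node(s) visible: {nodes}")
```

A result with more than one node suggests the kernel sees a multi-node NUMA topology; a single `node0` is also what a non-NUMA machine typically reports.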

2

The MPI-3 standard introduces shared memory that can be read and written by all processes sharing it, without using calls to the MPI library. While there are examples of one-sided communic...
Bullroarer asked 19/2, 2020 at 10:33

1

Solved

This question is for: kernel 3.10.0-1062.4.3.el7.x86_64; non-transparent hugepages allocated via boot parameters, which might or might not be mapped to a file (e.g. mounted hugepages); x86_64. Accord...
Unrepair asked 14/1, 2020 at 1:08

1

Solved

#include <cstdint> #include <iostream> #include <numaif.h> #include <sys/mman.h> #include <fcntl.h> #include <errno.h> #include <unistd.h> #include <str...
Vergeboard asked 6/2, 2019 at 3:53

1

Solved

TL;DR How are MMIO, IO and PCI configuration requests routed to the right node in a NUMA system? Each node has a "routing table" but I'm under the impression that the OS is supposed to be...
Resurge asked 30/7, 2019 at 18:31

5

We've just bought a 32-core Opteron machine, and the speedups we get are a little disappointing: beyond about 24 threads we see no speedup at all (it actually gets slower overall) and after about 6 th...
Mullion asked 20/11, 2012 at 1:45

1

Solved

I see that g++ generates a simple mov for x.load() and mov+mfence for x.store(y). Consider this classic example: #include<atomic> #include<thread> std::atomic<bool> x,y; bool r1...
Mydriatic asked 12/2, 2019 at 14:46

0

Using mbind, one can set the memory policy for a given mapped memory segment. Q: How can I tell mbind to interleave a segment on all nodes? If done after allocation but before usage, MPOL_INTERLEAV...
Vermin asked 18/11, 2018 at 0:12

2

Solved

I have a dual-socket Xeon E5522 2.26 GHz machine (with hyperthreading disabled) running Ubuntu Server on Linux kernel 3.0 with NUMA support. The architecture layout is 4 physical cores per socket. An ...
Kape asked 14/8, 2012 at 20:04

1

Solved

I have a Jetson TX2 with Python 2.7, TensorFlow 1.5, and CUDA 9.0. TensorFlow seems to be working, but every time I run the program, I get this warning: with tf.Session() as sess: print (sess.run(y,feed_dict)...
Giff asked 7/8, 2018 at 18:18

2

I'm trying to understand what the node distances in numactl --hardware mean. On our cluster, it outputs the following: numactl --hardware available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 4 5 12 13 14 15 ...
Jarrett asked 30/10, 2017 at 8:00
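
To make the distance table discussed above easier to inspect, the sketch below parses the `node distances:` section of `numactl --hardware` output into a nested dictionary. It assumes the usual layout (a header row of node ids followed by one `N:` row per node); the function name `parse_node_distances` is hypothetical. The values are relative figures reported by firmware, where 10 conventionally denotes local access, and they are not measured latencies.

```python
def parse_node_distances(numactl_output):
    """Return {from_node: {to_node: distance}} parsed from `numactl --hardware` text."""
    lines = numactl_output.splitlines()
    try:
        start = next(i for i, line in enumerate(lines)
                     if line.strip() == "node distances:")
    except StopIteration:
        return {}
    if start + 1 >= len(lines):
        return {}
    # Header row looks like "node   0   1"; drop the leading word "node".
    columns = [int(col) for col in lines[start + 1].split()[1:]]
    table = {}
    for line in lines[start + 2:]:
        label, _, rest = line.partition(":")
        if not label.strip().isdigit():
            break  # end of the distance table
        values = [int(value) for value in rest.split()]
        table[int(label)] = dict(zip(columns, values))
    return table
```

With the result in hand, `table[0][1] / table[0][0]` gives a rough ratio of remote to local cost as the firmware describes it.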

1

Solved

I'm working on a legacy application initially developed for multicore processor systems. To leverage multicore processing, OpenMP and PPL have been used. Now a new requirement is to run the software...
Hellhole asked 5/3, 2018 at 7:31

1

Recently I have been observing performance effects in memory-intensive workloads that I was unable to explain. Trying to get to the bottom of this, I started running several microbenchmarks in order to d...
Comedian asked 11/12, 2017 at 9:36

1

Solved

Consider this scenario: a user process running on a NUMA machine calls mmap to create a new mapping in the virtual address space. It then uses the memory returned by mmap for its processing (stori...
Chokeberry asked 3/11, 2017 at 13:40

0

I am developing a real-time application on a server with two NUMA nodes. Below is a simplified version of the system diagram (the OS is Ubuntu 14.04): .-------------. .-------------. | Device 0 | |...
Selvage asked 4/8, 2017 at 9:29

0

I built TensorFlow from source using Bazel, and when I finally open a session, I get the following warning: 2017-05-07 15:45:40.816127: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893]...
Ecosphere asked 7/5, 2017 at 10:22
