blas Questions

2

Why does t(mat1) %*% mat2 run faster than crossprod(mat1, mat2)? Isn't the whole point of the latter that it calls a more efficient low-level routine? r$> mat1 <- array(rnorm(100 * 600), di...
Jessejessee asked 3/10 at 5:24

6

Using an alternative BLAS for R has several advantages, see e.g. https://cran.r-project.org/web/packages/gcbd/vignettes/gcbd.pdf. Microsoft R Open https://mran.revolutionanalytics.com/documents/rr...
Zitella asked 29/6, 2016 at 4:8

2

Solved

My code needs to include different header files depending on the BLAS/LAPACK vendor. Are there any predefined macros, or something similar, that would let me check which one is in use?
Dormer asked 4/6, 2011 at 18:47

2

According to MKL BLAS documentation "All matrix-matrix operations (level 3) are threaded for both dense and sparse BLAS." http://software.intel.com/en-us/articles/parallelism-in-the-intel-math-kern...

1

I need to run a multi-threaded matrix-vector multiplication every 500 microseconds. The matrix is the same, the vector changes every time. I use Intel's sgemv() from MKL on a 64-core AMD CPU. If I...
Professional asked 23/2, 2023 at 18:7

2

Solved

I would like to implement a parallel matrix-vector multiplication for a fixed size matrix (~3500x3500 floats) optimized for my CPUs and cache layout (AMD Zen 2/4) that is repeatedly executed for ch...

4

I have allocated a big double vector, let's say with 100000 elements. At some point in my code, I want to set all elements to a constant, nonzero value. How can I do this without using a for loop ove...
Dime asked 10/3, 2011 at 13:37

4

I have been running some code in R and while testing realized the results were different on Windows and Linux. I have tried to understand why this happens, but couldn't find an answer. Let's illust...
Wallasey asked 13/2, 2023 at 0:52

20

When I'm trying to use TensorFlow with Keras using the gpu, I'm getting this error message: C:\Users\nicol\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\__main__.py:2: UserWarning: Update ...
Chaparro asked 15/5, 2017 at 22:59

1

Solved

I noticed that evaluating matrix operations in quadratic form from right to left is significantly faster than left to right in R, depending on how the parentheses are placed. Obviously they both pe...
Dael asked 13/10, 2022 at 20:28
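A gap of this kind is consistent with plain operation counts. The sketch below assumes two square n x n matrices A and B and a length-n vector v; these names and sizes are illustrative and not taken from the question. It compares the multiply-add count of (A B) v with that of A (B v):

```python
def matmul_cost(rows, inner, cols):
    """Multiply-add count for a (rows x inner) by (inner x cols) product."""
    return rows * inner * cols


def left_to_right_cost(n):
    # (A B) v: one n x n by n x n product, then one n x n by n x 1 product.
    return matmul_cost(n, n, n) + matmul_cost(n, n, 1)


def right_to_left_cost(n):
    # A (B v): two n x n by n x 1 products, no matrix-matrix product at all.
    return matmul_cost(n, n, 1) + matmul_cost(n, n, 1)


n = 1000
print(left_to_right_cost(n), right_to_left_cost(n))
print(left_to_right_cost(n) / right_to_left_cost(n))
```

Both orders give the same result mathematically, but the left-to-right order grows as n^3 while the right-to-left order grows as n^2, so the difference widens with the matrix size.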

2

Solved

My computer has only 1 GPU. Below is the result I get by running someone's code: [name: "/device:CPU:0" device_type: "CPU" memory_limit: 268435456 locality {} incarnation: ...
Attenuant asked 1/10, 2021 at 6:14

3

Solved

Is it possible to run Armadillo's calculations on the GPU? Is there any way to use GPU BLAS libraries (for example cuBLAS) with Armadillo? Just a note: I am totally new to GPU programming.
Barrio asked 1/8, 2013 at 1:26

5

Solved

I would like to write a program that makes extensive use of BLAS and LAPACK linear algebra functionality. Since performance is an issue, I did some benchmarking and would like to know if the approac...
Molokai asked 29/9, 2011 at 11:23

2

Solved

Is there an efficient way to calculate a bitwise sum of uint8_t buffers (assume the number of buffers is <= 255, so that the sum fits in a uint8)? Basically I want to know how many bits are set a...
Offence asked 7/10, 2021 at 15:53

3

Solved

I have a working LAPACK implementation that, as far as I have read, contains BLAS. I want to use SPARSE BLAS and, as far as I understand this website, SPARSE BLAS is part of BLAS. But when I tried...
Hedonism asked 17/10, 2015 at 18:24

2

Solved

I have a large computational problem I am working on. To decrease the computation time for a set of linear equations in a square matrix, I have made use of LAPACK and BLAS. To get the libraries on ...
Toxicogenic asked 26/8, 2020 at 15:20

3

Solved

I have to calculate some products of the form A'A or, more generally, A'DA, where A is a general mxn matrix and D is a diagonal mxm matrix. Both of them are full rank, i.e. rank(A)=min(m,n). I know tha...
Rhyme asked 30/10, 2017 at 10:58
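A product of the form A'DA can be accumulated one row of A at a time, without ever building the mxm diagonal matrix. The following is only a minimal pure-Python sketch of that idea; the function name `weighted_gram` and the list-of-rows layout are illustrative assumptions, not something taken from the question:

```python
def weighted_gram(A, d):
    """Return A' D A, where A is a list of m rows (each of length n)
    and D = diag(d) is given only by its m diagonal entries."""
    m, n = len(A), len(A[0])
    if len(d) != m:
        raise ValueError("d must have one entry per row of A")
    G = [[0.0] * n for _ in range(n)]
    # Each row a_k of A contributes d_k * a_k' a_k; fill the upper triangle only.
    for k in range(m):
        row, w = A[k], d[k]
        for i in range(n):
            wi = w * row[i]
            for j in range(i, n):
                G[i][j] += wi * row[j]
    # The result is symmetric, so mirror the upper triangle into the lower one.
    for i in range(n):
        for j in range(i):
            G[i][j] = G[j][i]
    return G


A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(weighted_gram(A, [1.0, 1.0, 1.0]))  # plain A'A
print(weighted_gram(A, [2.0, 0.0, 1.0]))  # A'DA with D = diag(2, 0, 1)
```

With a BLAS library, the unweighted case A'A corresponds to a symmetric rank-k update (the `syrk` family), which likewise computes only one triangle. When the entries of D are non-negative, scaling each row of A by the square root of its weight reduces A'DA to the same form.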

5

Solved

I am building my numpy/scipy environment on BLAS and LAPACK, more or less following this walkthrough. When I am done, how can I check that my numpy/scipy functions really do use the previous...
Abele asked 25/1, 2012 at 9:15

3

I'm curious why Julia's implementation of matrix addition appears to make a copy. Here's an example: foo1=rand(1000,1000) foo2=rand(1000,1000) foo3=rand(1000,1000) julia> @time foo1=foo2+foo3; ...
Aristotle asked 17/2, 2016 at 19:19

4

Problem: Linking numpy to the correct linear algebra libraries. The process is so complicated that I might be looking for the solution for the 6th time, and I have no idea what's going wrong. I am on Ubuntu 12.04.5....
Bickering asked 13/11, 2015 at 2:17

4

Solved

How large a system is it reasonable to attempt to do a linear regression on? Specifically: I have a system with ~300K sample points and ~1200 linear terms. Is this computationally feasible?
U asked 23/12, 2009 at 20:22
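A rough sizing for the figures quoted in the question can be done with plain arithmetic. The sketch below assumes dense double-precision storage (8 bytes per value) and a least-squares solve via the normal equations X'X; both are assumptions for illustration, and the function name `regression_footprint` is hypothetical:

```python
def regression_footprint(samples, terms, bytes_per_value=8):
    """Estimate storage and work for a dense least-squares problem.

    Returns (bytes for the samples x terms design matrix,
             bytes for the terms x terms matrix X'X,
             multiply-adds needed to form X'X naively).
    """
    design_bytes = samples * terms * bytes_per_value
    gram_bytes = terms * terms * bytes_per_value
    gram_multiply_adds = samples * terms * terms
    return design_bytes, gram_bytes, gram_multiply_adds


design, gram, work = regression_footprint(300_000, 1_200)
print(f"design matrix: {design / 1e9:.2f} GB")
print(f"X'X: {gram / 1e6:.2f} MB")
print(f"multiply-adds to form X'X: {work:.2e}")
```

Under these assumptions the design matrix takes about 2.88 GB, X'X takes about 11.5 MB, and forming X'X costs roughly 4.3e11 multiply-adds. Note that the normal equations square the condition number of the problem, so a QR-based solver may be preferable when the terms are nearly collinear.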

16

Solved

When I run sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys}) I get InternalError: Blas SGEMM launch failed. Here is the full error and stack trace: InternalErrorTraceback (most recent ca...
Diena asked 20/5, 2016 at 4:0

4

Solved

Is there a way of detecting the version of BLAS that R is using from inside R? I am using Ubuntu, and I have a couple of BLAS versions installed - I just don't know which one is "active" from R's p...
Kettering asked 12/3, 2012 at 10:30

0

In various attempts to reduce the computing time of an algorithm I have been coding in the last few days, I wanted to test the actual improvement of crossprod over %*%. I surprisingly no...
Fabri asked 6/11, 2019 at 0:19

1

Based on the famous check_blas.py script, I wrote this one to check that theano can in fact use multiple cores: import os os.environ['MKL_NUM_THREADS'] = '8' os.environ['GOTO_NUM_THREADS'] = '8'...
Mcreynolds asked 28/4, 2016 at 8:15
