How to decrease SPI overhead time for STM32L4 HAL library

Asked 14/10, 2018 at 18:17 Answered 11/10, 2020 at 0:39

I am using a STM32L476RG board and HAL SPI functions:

HAL_SPI_Transmit(&hspi2, &ReadAddr, 1, HAL_MAX_DELAY);
HAL_SPI_Receive(&hspi2, pBuffer, 4, HAL_MAX_DELAY);

I need to receive data from the accelerometer's buffer at maximum speed, and I have a problem with the delay in these functions. As you can see on the oscilloscope screenshots, there are several microseconds during which nothing happens. I have no idea how to minimize the transmission gap.

I tried using HAL_SPI_Receive_DMA function and this delay was even bigger. Do you have any idea how to solve this problem using HAL functions or any pointers on how I could write my SPI function without these delays?

Lakesha answered 14/10, 2018 at 18:17 Comment(1)

Stop using HAL. – Trinitroglycerin 14/10, 2018 at 18:40

TL;DR Don't use HAL, write your transfer functions using the Reference Manual.

HAL is hopelessly overcomplicated for time-critical tasks (among others). Just look at the HAL_SPI_Transmit() function: it's over 60 lines of code before it actually touches the Data Register. HAL first marks the port access structure as busy even when there is no multitasking OS in sight, validates the function parameters, stores them in the hspi structure for no apparent reason, then goes on to figure out what mode SPI is in, etc. It's not necessary to check timeouts in SPI master mode either, because the master controls all bus timings. If it can't get a byte out in a finite amount of time, then the port initialization is wrong, period.

Without HAL, it's a lot simpler. First, figure out what should go into the control registers, set CR1 and CR2 accordingly.

void SPIx_Init(void) {
    /* Raise RXNE as soon as one byte (8 bits) is in the receive FIFO;
       otherwise an odd number of bytes to transfer would complicate things.
       Written before the peripheral is enabled. */
    SPIx->CR2 = SPI_CR2_FRXTH;
    /* full duplex master, 8 bit transfer, default phase and polarity,
       software slave management; SPE enables the peripheral */
    SPIx->CR1 = SPI_CR1_MSTR | SPI_CR1_SPE | SPI_CR1_SSM | SPI_CR1_SSI;
}

This initialization assumes that Slave Select (NSS or CS#) is handled by separate GPIO pins. If you want CS# managed by the SPI peripheral, then look up Slave select (NSS) pin management in the Reference Manual.

Note that a full duplex SPI connection cannot just transmit or receive; it always does both simultaneously. If the slave expects one command byte and answers with four bytes of data, that's a 5-byte transfer: the slave will ignore the last 4 bytes it receives, and the master should ignore the first one.
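To make the offset concrete, here is a minimal sketch of such a 5-byte exchange. It assumes a transfer routine with the signature (out, in, count); read_reply, spi_transfer_fn and fake_transfer are hypothetical names introduced only for illustration, and fake_transfer is a host-side stand-in so the logic can be tried off-target.

```c
#include <stdint.h>
#include <string.h>

#define REPLY_LEN 4u   /* four data bytes follow the one command byte */

/* Any full duplex transfer routine with this shape will do (assumed signature). */
typedef void (*spi_transfer_fn)(uint8_t *outp, uint8_t *inp, int count);

/* Send one command byte, clock out four dummy bytes, keep only the reply. */
void read_reply(spi_transfer_fn transfer, uint8_t command, uint8_t reply[REPLY_LEN]) {
    uint8_t tx[1 + REPLY_LEN] = { command };  /* remaining bytes are zero dummies */
    uint8_t rx[1 + REPLY_LEN];

    transfer(tx, rx, (int)sizeof tx);
    /* rx[0] was clocked in while the command went out, so it is discarded */
    memcpy(reply, &rx[1], REPLY_LEN);
}

/* Host-side stand-in (not real hardware): a pretend slave that answers
 * every command with 0x11 0x22 0x33 0x44 after one dummy byte. */
static void fake_transfer(uint8_t *outp, uint8_t *inp, int count) {
    static const uint8_t answer[REPLY_LEN] = { 0x11, 0x22, 0x33, 0x44 };
    (void)outp;
    for (int i = 0; i < count; i++)
        inp[i] = (i == 0) ? 0xFF : answer[(i - 1) % (int)REPLY_LEN];
}
```

On the target, the real transfer routine would be passed in place of fake_transfer, with the chip select driven low before the call and high after it.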

A very simple transfer function would be

void SPIx_Transfer(uint8_t *outp, uint8_t *inp, int count) {
    while(count--) {
        while(!(SPIx->SR & SPI_SR_TXE))
            ;
        *(volatile uint8_t *)&SPIx->DR = *outp++;
        while(!(SPIx->SR & SPI_SR_RXNE))
            ;
        *inp++ = *(volatile uint8_t *)&SPIx->DR;
    }
}

It can be further optimized when needed by making use of the SPI FIFO, interleaving writes and reads so that the transmitter is always kept busy.
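One way the interleaving could look is sketched below. This is not tested on hardware: the four spi_* helpers are hypothetical, and here they are backed by a small host-side loopback model so the loop logic can be run on a PC. On the target each helper would become the register access shown in its comment. FIFO_DEPTH assumes a 4-byte FIFO; check the depth in the Reference Manual for your part.

```c
#include <stdint.h>
#include <stddef.h>

#define FIFO_DEPTH 4u   /* assumed FIFO depth in bytes */

/* Host-side loopback model (stand-in for the peripheral, MOSI wired to MISO). */
static uint8_t fifo[FIFO_DEPTH];
static size_t fifo_head, fifo_len;

static int spi_tx_ready(void) {      /* target: SPIx->SR & SPI_SR_TXE */
    return fifo_len < FIFO_DEPTH;
}
static int spi_rx_ready(void) {      /* target: SPIx->SR & SPI_SR_RXNE */
    return fifo_len > 0;
}
static void spi_write8(uint8_t b) {  /* target: *(volatile uint8_t *)&SPIx->DR = b; */
    fifo[(fifo_head + fifo_len) % FIFO_DEPTH] = b;
    fifo_len++;
}
static uint8_t spi_read8(void) {     /* target: return *(volatile uint8_t *)&SPIx->DR; */
    uint8_t b = fifo[fifo_head];
    fifo_head = (fifo_head + 1) % FIFO_DEPTH;
    fifo_len--;
    return b;
}

/* Keep the transmitter fed while draining received bytes as they arrive. */
void SPIx_TransferInterleaved(const uint8_t *outp, uint8_t *inp, size_t count) {
    size_t sent = 0, received = 0;

    while (received < count) {
        /* Never run more than FIFO_DEPTH bytes ahead of the reads,
           so the receive FIFO cannot overrun. */
        if (sent < count && (sent - received) < FIFO_DEPTH && spi_tx_ready())
            spi_write8(outp[sent++]);
        if (spi_rx_ready())
            inp[received++] = spi_read8();
    }
}
```

The difference from the simple version is that a write no longer waits for the previous byte to come back, so the next byte is already queued when the shift register finishes.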

If speed is critical, don't use generalized functions, or make sure they can be inlined when you do. Use a compiler with link-time optimization enabled, and optimize for speed (quite obviously).

Beaker answered 14/10, 2018 at 20:15 Comment(7)

You can use the Low Layer (LL) API instead of HAL. There is a lot less boilerplate code. – Microcopy 14/10, 2018 at 20:26

Its functions are not very suitable for SPI. SPI at higher speeds needs DMA to be efficient, and there is no other way. – Trinitroglycerin 14/10, 2018 at 20:49

Hi, many thanks for the tips! I don't have much experience in embedded programming. I will try to write this code myself, but it would significantly speed up my work if anyone could share a ready-to-use code example. Could you point me to an example with the LL API or DMA? – Lakesha 15/10, 2018 at 6:2

Look at the example folder in the STM Library folder. It provides examples for both HAL and LL implementations. – Shadow 15/10, 2018 at 6:15

DMA can speed up the transfer of long SPI transactions. Your transaction is only five bytes. I doubt that you can gain anything with DMA transfers. – Gony 15/10, 2018 at 8:1

@berendi thank you for your help, SPIx_Transfer function works great. Do you have an idea how to implement a similar function using DMA? – Lakesha 17/10, 2018 at 16:3

@Lakesha Look up the stream and channel numbers (for both transmit and receive) in the DMA request mapping table, set up the DMA channel registers except CCR, then follow the description in the SPI functional description / Data transmission and reception procedure / Communication using DMA section of the Reference Manual. – Beaker 17/10, 2018 at 20:50

You can use HAL_SPI_TransmitReceive(&hspi2, ReadAddr, pBuffer, 1 + 4, HAL_MAX_DELAY); instead of a HAL_SPI_Transmit followed by a HAL_SPI_Receive. This avoids the gap between transmit and receive. Note that both buffers must then be at least 1 + 4 bytes long, and the first received byte is a dummy. You can also try changing the compilation settings to optimize for speed. Also check the accelerometer's datasheet; maybe you can read the whole buffer in a single frame, something like this: HAL_SPI_TransmitReceive(&hspi2, ReadAddr, pBuffer, 1 + (4 * numOfSamples), HAL_MAX_DELAY);
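A small sketch of how the transmit buffer for such a burst read could be prepared. build_burst_frame is a hypothetical helper, and BYTES_PER_SAMPLE is taken from the 4-byte read in the question; whether the device really supports a multi-sample burst depends on its datasheet.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BYTES_PER_SAMPLE 4u   /* assumption: 4 data bytes per sample, as in the question */

/* Fill tx with the command byte followed by dummy bytes.
 * Returns the frame length, or 0 if the frame does not fit into tx. */
size_t build_burst_frame(uint8_t read_addr, size_t num_samples,
                         uint8_t *tx, size_t tx_size) {
    size_t len = 1u + BYTES_PER_SAMPLE * num_samples;

    if (len > tx_size)
        return 0;
    tx[0] = read_addr;
    memset(&tx[1], 0x00, len - 1u);   /* dummy bytes that clock the data out */
    return len;
}

/* On the target (not compiled here), the frame would then be sent with
 *   HAL_SPI_TransmitReceive(&hspi2, tx, rx, (uint16_t)len, HAL_MAX_DELAY);
 * where rx is at least len bytes long and the samples start at rx[1]. */
```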

Lycanthropy answered 15/10, 2018 at 7:10 Comment(2)

Thank you for your advice, I tried using HAL_SPI_TransmitReceive(&hspi2, ReadAddr, pBuffer, 1 + 4, HAL_MAX_DELAY); and it works a little faster. Unfortunately, it’s still too slow. – Lakesha 16/10, 2018 at 17:42

@Lakesha You can change the compilation settings to optimize for speed (-O3). If the accelerometer has a FIFO, you can use it and read N samples at the same time, or you can change the HAL code to optimize it. – Lycanthropy 17/10, 2018 at 15:49

What worked for me:

Read SPI registers directly
Optimize your function for speed

For an example function (code), see the solution by “JElli.1” in the ST Community.

Datum answered 11/10, 2020 at 0:39 Comment(0)
