STM32 bare metal USB implementation

UPDATE: For anyone interested, here is a step-by-step explanation of how to build a bare metal USB stack, how to tackle such a project and what you need to know for each step: STM32USB@GitHub

TLDR: I have an STM32G441 and want to implement a USB driver without any HAL libraries, using only CMSIS: as a learning experience, to save space, and because what I want to do would require changing the HAL anyway.

But I can't get this thing to receive anything. I'm stuck trying to get the device address, which is never handed to my code. The HAL middleware works just fine, so it's not a hardware issue.

What I'm doing

I enable the USB clock (correctly, I assume, because my logic analyzer shows the device sending ACKs), power up the USB peripheral as described in the datasheet, enable all the necessary interrupts, and handle the reset event by initializing the BTable and endpoint 0. At that point I expect to receive a CTR interrupt, which never appears.

Reference Manual

Clock

The μC runs on a 25MHz HSE clock. The USB peripheral runs on the PLL Q clock at ~48MHz; the RCC settings were verified with the CubeMX clock configurator. AHB runs at half speed, because I get a bus error hard fault if I try to run it at full speed, but that's another question. The system clock is set to 143.75MHz.

RCC->CR |= RCC_CR_HSEON | RCC_CR_HSION;

// Configure PLL (R=143.75, Q=47.92)
RCC->CR &= ~RCC_CR_PLLON;
while (RCC->CR & RCC_CR_PLLRDY) {
}
RCC->PLLCFGR |= RCC_PLLCFGR_PLLSRC_HSE | RCC_PLLCFGR_PLLM_0 | (23 << RCC_PLLCFGR_PLLN_Pos) | RCC_PLLCFGR_PLLQ_1;
RCC->PLLCFGR |= RCC_PLLCFGR_PLLREN | RCC_PLLCFGR_PLLQEN;
RCC->CR |= RCC_CR_PLLON;

// Select PLL as main clock, AHB/2 > otherwise Bus Error Hard Fault
RCC->CFGR |= RCC_CFGR_HPRE_3 | RCC_CFGR_SW_PLL;

// Select & Enable IO Clocks (PLL > USB, ADC; HSI16 > UART)
RCC->CCIPR = RCC_CCIPR_CLK48SEL_0 | RCC_CCIPR_ADC12SEL_1 | RCC_CCIPR_USART1SEL_1 | RCC_CCIPR_USART2SEL_1 | RCC_CCIPR_USART3SEL_1 | RCC_CCIPR_UART4SEL_1;
RCC->AHB2ENR |= RCC_AHB2ENR_ADC12EN | RCC_AHB2ENR_GPIOAEN | RCC_AHB2ENR_GPIOBEN | RCC_AHB2ENR_GPIOCEN;
RCC->APB1ENR1 |= RCC_APB1ENR1_USBEN | RCC_APB1ENR1_UART4EN | RCC_APB1ENR1_USART3EN | RCC_APB1ENR1_USART2EN;
RCC->APB2ENR |= RCC_APB2ENR_USART1EN;

// Enable DMAMUX & DMA1 Clock
RCC->AHB1ENR |= RCC_AHB1ENR_DMAMUX1EN | RCC_AHB1ENR_DMA1EN;
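The frequencies in the "R=143.75, Q=47.92" comment can be checked with a little arithmetic. The sketch below is host-side C, not target code, and the helper names are hypothetical. It assumes the divider values I read out of the register writes above: PLLM_0 meaning /2, N = 23, the untouched PLLR field meaning /2, and PLLQ_1 meaning /6. Those field encodings should be verified against the reference manual.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: PLL output frequency in Hz.
 * m, n and div are the actual divider/multiplier values,
 * not the raw register field encodings. */
static uint32_t pll_out_hz(uint32_t in_hz, uint32_t m, uint32_t n, uint32_t div)
{
    assert(m != 0u && div != 0u);
    /* 64-bit intermediate so that in_hz * n cannot overflow */
    return (uint32_t)(((uint64_t)in_hz * n) / ((uint64_t)m * div));
}

/* Hypothetical helper: deviation of actual from nominal in parts per million.
 * Integer division truncates toward zero. */
static int32_t deviation_ppm(uint32_t actual_hz, uint32_t nominal_hz)
{
    assert(nominal_hz != 0u);
    int64_t diff = (int64_t)actual_hz - (int64_t)nominal_hz;
    return (int32_t)((diff * 1000000) / (int64_t)nominal_hz);
}
```

With a 25 MHz HSE, `pll_out_hz(25000000u, 2u, 23u, 2u)` gives the 143.75 MHz system clock and `pll_out_hz(25000000u, 2u, 23u, 6u)` gives roughly 47.92 MHz for the 48 MHz domain, about 0.17 % below nominal.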

USB Memory

As far as I know, the USB BTable and endpoint buffers need to be placed in the USB SRAM, not in regular SRAM. I've added some linker directives to create a section for that, which seems to work just fine according to the memory analyzer. __MEM2USB just converts an absolute address into an offset relative to the start of the USB SRAM.

#define __USB_MEM __attribute__((section(".usbbuf")))
#define __USBBUF_BEGIN 0x40006000
#define __MEM2USB(X) (((int)X - __USBBUF_BEGIN))
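As a sanity check of the offset calculation, here is a minimal host-side sketch of the same conversion written as a function with a range check. `usb_pma_offset`, `USB_PMA_BASE` and `USB_PMA_SIZE` are hypothetical names; the base address is the `__USBBUF_BEGIN` value from above, and the 1024-byte size and the 1:1 byte mapping (no division by 2) are assumptions taken from this question.

```c
#include <assert.h>
#include <stdint.h>

#define USB_PMA_BASE 0x40006000u /* same value as __USBBUF_BEGIN */
#define USB_PMA_SIZE 1024u       /* assumed size of the USB SRAM */

/* Convert a CPU address inside the USB SRAM into the 16-bit offset
 * that BTable entries and USB->BTABLE expect. Assumes one CPU byte
 * per USB byte, i.e. no padding between 16-bit words. */
static uint16_t usb_pma_offset(uintptr_t cpu_addr)
{
    assert(cpu_addr >= USB_PMA_BASE);
    assert(cpu_addr < USB_PMA_BASE + USB_PMA_SIZE);
    return (uint16_t)(cpu_addr - USB_PMA_BASE);
}
```

Unlike the macro, the function fails loudly if a buffer accidentally ends up outside the USB SRAM, for example because the section attribute was forgotten.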

First question: Access to this memory is only allowed to be 16 bits wide. But, unlike on e.g. the STM32F103, there seems to be no need for padding. The memory tool has some problems displaying this region, because the region only supports WORD access while the tool uses DWORD access, but copying the memory allocated by the HAL word by word also shows no padding. Is that correct? If so, I should be able to use all 1024 bytes, rather than seeing 1024 bytes but only being able to use 512. This is also the reason why __MEM2USB does not divide the address by 2.

Then I create some structures for the BTable and endpoint 0. The BTable ends up at 0x40006000 by default. Endpoint 0 has an RX and a TX buffer of at most 64 bytes each, as per the USB spec. The alignments are taken from the reference manual. The memory is not automatically zeroed out.

typedef struct {
    unsigned short ADDR_TX;
    unsigned short COUNT_TX;
    unsigned short ADDR_RX;
    unsigned short COUNT_RX;
} USB_BTABLE_ENTRY;

__ALIGNED(8)
__USB_MEM
static USB_BTABLE_ENTRY BTable[8] = {0};

__ALIGNED(2)
__USB_MEM
static char EP0_Buf[2][64] = {0};

Initialization

I enable the interrupts in the NVIC, power up the peripheral, wait 1μs until the clock is stable as per the datasheet, then clear the reset state, clear pending interrupts, enable the USB interrupts and finally enable the internal pull-up to start enumeration.

NVIC_SetPriority(USB_HP_IRQn, 0);
NVIC_SetPriority(USB_LP_IRQn, 0);
NVIC_SetPriority(USBWakeUp_IRQn, 0);
NVIC_EnableIRQ(USB_HP_IRQn);
NVIC_EnableIRQ(USB_LP_IRQn);
NVIC_EnableIRQ(USBWakeUp_IRQn);

USB->CNTR &= ~USB_CNTR_PDWN;

// Wait 1μs until clock is stable
SysTick->LOAD = 100;
SysTick->VAL = 0;
SysTick->CTRL = 1;
while ((SysTick->CTRL & SysTick_CTRL_COUNTFLAG_Msk) == 0) {
}
SysTick->CTRL = 0;

USB->CNTR &= ~USB_CNTR_FRES;
USB->ISTR = 0;

USB->CNTR |= USB_CNTR_RESETM | USB_CNTR_CTRM | USB_CNTR_WKUPM | USB_CNTR_SUSPM | USB_CNTR_ESOFM;
USB->BCDR |= USB_BCDR_DPPU;

USB Reset

Now the host sends a reset signal, which triggers the reset interrupt correctly. In the reset handler, I initialize the BTable and EP0. I set EP0 to ACK on RX and NAK on TX requests, as other bare metal USB examples and the HAL do (the status bits are toggle bits, not plain write bits, but the register is in a known state of 0x00 because the hardware clears it on a reset). Lastly I enable the USB function and reset the device address to 0.

if ((USB->ISTR & USB_ISTR_RESET) != 0) {
    USB->ISTR = ~USB_ISTR_RESET;

    // Enable EP0
    USB->BTABLE = __MEM2USB(BTable);

    BTable[0].ADDR_TX = __MEM2USB(EP0_Buf[0]);
    BTable[0].COUNT_TX = 0;
    BTable[0].ADDR_RX = __MEM2USB(EP0_Buf[1]);
    BTable[0].COUNT_RX = (1 << 15) | (1 << 10);

    USB->EP0R = USB_EP_CONTROL | (2 << 4) | (3 << 12);
    USB->CNTR = USB_CNTR_CTRM | USB_CNTR_RESETM;

    USB->DADDR = USB_DADDR_EF;
}
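The `(1 << 15) | (1 << 10)` written to COUNT_RX encodes the 64-byte buffer size as BL_SIZE = 1 and NUM_BLOCK = 1. Below is a small host-side sketch of that encoding for other sizes. `usb_count_rx` is a hypothetical helper, and the rules it implements (2-byte blocks for BL_SIZE = 0, 32-byte blocks with a "minus one" offset for BL_SIZE = 1) are my reading of the reference manual, so treat them as an assumption to verify. The 512-byte upper limit is a deliberately conservative choice.

```c
#include <assert.h>
#include <stdint.h>

/* Encode an RX buffer size in bytes into the BL_SIZE (bit 15) and
 * NUM_BLOCK (bits 14:10) fields of COUNT_RX. Sizes are rounded up to
 * the next representable value. */
static uint16_t usb_count_rx(uint16_t size)
{
    assert(size >= 1u && size <= 512u);
    if (size <= 62u) {
        /* BL_SIZE = 0: blocks of 2 bytes, NUM_BLOCK = number of blocks */
        uint16_t num = (uint16_t)((size + 1u) / 2u);
        return (uint16_t)(num << 10);
    }
    /* BL_SIZE = 1: blocks of 32 bytes, NUM_BLOCK = number of blocks - 1 */
    uint16_t num = (uint16_t)((size + 31u) / 32u - 1u);
    return (uint16_t)(0x8000u | (num << 10));
}
```

For a 64-byte buffer this returns 0x8400, which is the same value as the literal used in the reset handler.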

Debugging shows that the BTable is indeed at 0x40006000 and the Buffer address is written (I assume) correctly. The EP0 register was compared to a working HAL implementation and they are the same at that point.
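Writing EP0R directly only works here because the register is 0 after reset. For the general case, the toggle behaviour can be sketched as below. This is a host-side model, not target code: the mask names and both functions are hypothetical, and the bit layout (STAT and DTOG bits toggle on writing 1, CTR bits are cleared by writing 0, SETUP is read-only) is my understanding of the endpoint register and should be checked against the reference manual.

```c
#include <assert.h>
#include <stdint.h>

#define EP_CTR_MASK    0x8080u /* CTR_RX, CTR_TX: write 0 clears, write 1 keeps */
#define EP_TOGGLE_MASK 0x7070u /* DTOG_RX, STAT_RX, DTOG_TX, STAT_TX */
#define EP_STAT_MASK   0x3030u /* STAT_RX (13:12), STAT_TX (5:4) */
#define EP_SETUP_MASK  0x0800u /* read-only */
#define EP_RW_MASK     0x070Fu /* EP_TYPE, EP_KIND, EA */

/* Value to write so that STAT_RX/STAT_TX end up as requested while the
 * DTOG bits, the CTR flags and the read/write fields stay unchanged. */
static uint16_t ep_write_value(uint16_t current, uint16_t stat_rx, uint16_t stat_tx)
{
    assert(stat_rx <= 3u && stat_tx <= 3u);
    uint16_t wanted = (uint16_t)((stat_rx << 12) | (stat_tx << 4));
    uint16_t toggle = (uint16_t)((current ^ wanted) & EP_STAT_MASK);
    return (uint16_t)((current & EP_RW_MASK) | EP_CTR_MASK | toggle);
}

/* Simplified model of what the hardware does with a write. */
static uint16_t ep_model_write(uint16_t current, uint16_t written)
{
    uint16_t rw  = (uint16_t)(written & EP_RW_MASK);
    uint16_t tgl = (uint16_t)((current ^ written) & EP_TOGGLE_MASK);
    uint16_t ctr = (uint16_t)(current & written & EP_CTR_MASK);
    return (uint16_t)(rw | tgl | ctr | (current & EP_SETUP_MASK));
}
```

On the target, the equivalent would be reading the endpoint register, passing the value through a helper like `ep_write_value` and writing the result back. The model function only exists so the logic can be checked off-target.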

Here I'm stuck

I expect the host to send the device address next (it doesn't; it first sends a suspend and a wakeup and then another reset), which should trigger the CRT interrupt (which is enabled in the interrupt mask). The point is, it never does, and I don't know why. The host sends the request just fine and the device sends an ACK on that request just fine (logic analyzer), but the CRT is never triggered. Any ideas what else I can try or where to look?

Update

I've now compared the messages from my implementation with the HAL ones. The interrupt now handles the exact same messages in the exact same order and the USB-Registers also contain exactly the same values for every request. I've changed the BTable and USB-SRAM layout to contain the exact same values as the HAL after the Reset-Interrupt.

I had to implement the SUSP and WKUP handling for this to work, which was probably one of the things that were missing. Now both implementations behave exactly the same. It turns out the problem is that I never receive a proper SOF packet. The HAL gets its first SOF directly after the second reset (HW-Reset > 2x ESOF > SUSP > WKUP > RESET > (Optional 1 ESOF) > SOF), while mine gets an ERR instead of the SOF.

Looks like the error is not related to the USB registers or USB-SRAM. Next step will be to compare all registers I can think of as relevant between the two implementations. Maybe I forgot a clock?

Saltigrade answered 31/8, 2022 at 6:56 Comment(7)
I would try with the HAL and then make my way through its source code, replacing its functions with my own. Along the way I'll learn how to do it, and how not to do it. I will most probably find errors in the HAL, too. When only my functions are left, I can dive into refactoring until I know exactly, in depth, how it works. - Preheat
What is the CRT interrupt? Did you make a typo and actually mean the CTR (Correct Transfer) bit in the USB interrupt status register (USB_ISTR)? - Chekiang
I would double check the way you are setting up USB->EP0R when the USB reset signal is received. I don't understand why you are setting/toggling the bits in that register the way you are. You could use a debugger to read the exact value of that register when the working HAL code sets it, and do the same for your code and compare the two values (they should probably be equal if you want your code to work). Also note that some bits in that register can only be toggled; they can't be directly set to 0 or 1. - Chekiang
@DavidGrayson Yup, that's the one ^^' In the meantime I checked and compared all USB registers with the HAL and made them contain exactly the same info at the same steps in time. Some new intel, but it's still not working even though the requests are now served exactly the same. I've added my observations to the question. - Saltigrade
It was the frickin clock, dammit! - Saltigrade
The hard faults you are experiencing at full AHB speed are probably because you didn't configure the number of wait states of the flash memory. The core starts reading and executing garbage instructions from it. - Electrophysiology
@Electrophysiology Yup. Since asking the question I already fixed the bug. It was a combination of not setting the wait states and not doing the correct speed transition on the AHB, because it turns out you need to first transition to full sysclock with AHB/2 before going to full AHB speed. It is all written down if you read the docs carefully ^^' - Saltigrade

I spent almost a week on this, just to figure out that I had misconfigured my 48MHz clock source...

RCC->CCIPR = RCC_CCIPR_CLK48SEL_0 | ...

This sets the CLK48SEL to Reserved (01), not the PLLQ-Clock (10)...

RCC->CCIPR = RCC_CCIPR_CLK48SEL_1 | ...

Now I get the SOF packets and the CTR interrupt as expected. May this question serve as a USB bare metal reference in the future.
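To make this kind of mistake harder to repeat, a multi-bit field can be written by value instead of by OR-ing individual bit macros. The sketch below is host-side C with hypothetical names. It assumes CLK48SEL sits in bits 27:26 of RCC_CCIPR, which is my reading of the CMSIS header and should be checked there, and it uses the 00/10 encodings described in this answer.

```c
#include <assert.h>
#include <stdint.h>

#define CCIPR_CLK48SEL_POS 26u /* assumed bit position of CLK48SEL */
#define CCIPR_CLK48SEL_MSK (3u << CCIPR_CLK48SEL_POS)

enum clk48_source {
    CLK48_HSI48 = 0u, /* 00 */
    CLK48_PLLQ  = 2u  /* 10; 01 is reserved */
};

/* Return ccipr with the CLK48SEL field replaced by sel. */
static uint32_t ccipr_set_clk48(uint32_t ccipr, uint32_t sel)
{
    assert(sel <= 3u);
    return (ccipr & ~CCIPR_CLK48SEL_MSK) | (sel << CCIPR_CLK48SEL_POS);
}

/* Extract the CLK48SEL field from a CCIPR value. */
static uint32_t ccipr_get_clk48(uint32_t ccipr)
{
    return (ccipr & CCIPR_CLK48SEL_MSK) >> CCIPR_CLK48SEL_POS;
}
```

Under these assumptions, a register value with only the `_0` bit set decodes to 1 (reserved), while `ccipr_set_clk48(0u, CLK48_PLLQ)` decodes to 2 (PLL Q).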

Saltigrade answered 31/8, 2022 at 12:44 Comment(3)
Maybe my USB CDC (emulation of PL2303) would be useful for you. - Eudemonism
@Eudemonism Thanks, I'm already past CDC now and into Ethernet card emulation :) - Saltigrade
This is some nice work. - Toy
