Bare Metal STM32 · Part 4 of 7
SPI DMA Transfers: No CPU Overhead
33:05
⚙️ Bare Metal STM32 Series · 7 Parts
Part 01
HAL vs Bare Metal: Which Should You Use?
38:22
Part 02
Clock Configuration with RCC: From HSI to PLL
22:10
Part 03
Interrupts & NVIC: Complete Configuration Guide
47:22
Part 04
SPI DMA Transfers: No CPU Overhead
33:05
Part 05
ADC DMA Circular Mode: Continuous Sampling
29:45
Part 06
Debugging with SWO Trace & ITM Printf
18:30
Part 07
Low-Power Stop Mode with Wake-Up on EXTI
34:12
01
Overview
Polling SPI burns CPU cycles waiting for the TX buffer to empty: fine for a few bytes, costly for a display frame or sensor burst. DMA transfers hand the job to dedicated hardware: the CPU starts the transfer, does other work, and gets an interrupt when it's done. This part sets up SPI1 + DMA2 from scratch to send a 256-byte frame at 21 MHz (fPCLK/4, the divider selected in the init code) with no CPU involvement until the transfer-complete interrupt.
- Configure SPI1 as master: clock polarity, phase, baud rate, data size
- Set up DMA2 Stream 3 Channel 3 (SPI1_TX) with memory-to-peripheral mode
- Handle the transfer-complete interrupt and drive CS manually for frame accuracy
- Understand why you must wait for BSY=0 before deasserting CS
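To put a number on what polling costs, the sketch below estimates how long one frame occupies the bus. It is a back-of-the-envelope helper, not part of the driver: the name `spi_frame_time_ns` is made up for illustration, and it assumes 8-bit frames sent back to back with no inter-byte gaps.

```c
#include <stdint.h>

/* Hypothetical helper: wire time for `bytes` 8-bit frames at `sck_hz`,
   rounded down to whole nanoseconds. Assumes no gaps between bytes. */
uint32_t spi_frame_time_ns(uint32_t bytes, uint32_t sck_hz)
{
    uint64_t bits = (uint64_t)bytes * 8u;              /* 8 clock cycles per byte */
    return (uint32_t)(bits * 1000000000ull / sck_hz);  /* bits / Hz, scaled to ns */
}
```

For a 256-byte frame this works out to about 97.5 µs at 21 MHz and about 48.8 µs at 42 MHz. A polling driver spends that entire window spinning on a status flag; with DMA the CPU is free for the same period.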
02
DMA Concepts
Streams & Channels
STM32F4 has two DMA controllers, DMA1 and DMA2, each with 8 streams. Each stream can be connected to one of 8 channels (request sources). Which stream/channel pair routes to which peripheral is fixed in hardware and listed in the reference manual, so always check the DMA request mapping table.
DMA2 Stream3 Ch3 = SPI1_TX
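Picking a stream also decides which status register and which bit positions hold its flags, which is why the handler in this part tests `DMA_LISR_TCIF3`. As a rough illustration, the sketch below computes the transfer-complete mask for any stream. The function names and the lookup table are made up for this example; the bit layout is the one used by the F4 `LISR`/`HISR` registers.

```c
#include <stdint.h>

/* Start bit of each stream's 6-bit flag group. Streams 0-3 live in LISR,
   streams 4-7 in HISR, and both registers share the same layout. */
static const uint8_t dma_flag_base[4] = { 0u, 6u, 16u, 22u };

/* Hypothetical helper: mask of the transfer-complete flag (TCIFx) for
   stream 0..7 within its own status register. */
uint32_t dma_tcif_mask(uint32_t stream)
{
    return 1u << (dma_flag_base[stream & 3u] + 5u);  /* TCIF is bit 5 of the group */
}

/* Returns 1 if the stream's flags are in HISR/HIFCR, 0 for LISR/LIFCR. */
int dma_stream_uses_high_regs(uint32_t stream)
{
    return stream >= 4u;
}
```

For stream 3 the mask is bit 27 of `LISR`, the same bit that `DMA_LISR_TCIF3` names in the device header.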
Transfer Modes
Normal mode: DMA stops after NDTR transfers and fires a TC interrupt. Circular mode: DMA auto-reloads NDTR and fires HT (half) and TC (full) interrupts continuously — used in Part 5 for ADC.
CIRC bit in DMA_SxCR
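The difference between the two modes comes down to a couple of bits in `DMA_SxCR`. The sketch below builds the control word for a memory-to-peripheral stream in either mode. The macro and function names are local to this example (a real project would use the `DMA_SxCR_*` definitions from the device header); the bit positions are those of the F4 `DMA_SxCR` register.

```c
#include <stdint.h>
#include <stdbool.h>

/* DMA_SxCR bit positions on STM32F4 (local names, chosen for this sketch) */
#define CR_HTIE      (1u << 3)    /* half-transfer interrupt enable     */
#define CR_TCIE      (1u << 4)    /* transfer-complete interrupt enable */
#define CR_DIR_M2P   (1u << 6)    /* DIR = 01: memory to peripheral     */
#define CR_CIRC      (1u << 8)    /* circular mode                      */
#define CR_MINC      (1u << 10)   /* memory address auto-increment      */
#define CR_CHSEL(n)  ((uint32_t)(n) << 25)

/* Hypothetical helper: control word for a memory-to-peripheral stream.
   Normal mode raises TC once; circular mode also enables HT and reloads NDTR. */
uint32_t dma_cr_mem_to_periph(uint32_t channel, bool circular)
{
    uint32_t cr = CR_CHSEL(channel & 7u) | CR_DIR_M2P | CR_MINC | CR_TCIE;
    if (circular) {
        cr |= CR_CIRC | CR_HTIE;
    }
    return cr;
}
```

With channel 3 and `circular = false` this yields the same bit pattern the SPI1 init code writes to `DMA2_Stream3->CR`.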
FIFO & Burst
The DMA FIFO buffers data between bus transactions. With the FIFO enabled, bursts of 4, 8, or 16 beats use the AHB bus more efficiently than single transfers. Direct mode (FIFO off) is simpler and works for most use cases.
DMA_SxFCR: DMDIS, FTH
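When the FIFO is enabled, the threshold (`FTH`) and the memory burst size have to be compatible. The sketch below encodes one way to check that: a burst must fit in the 16-byte FIFO, and the threshold must be a whole number of bursts. The function names are invented for this example, and the rule is a simplification of the reference manual's FIFO threshold table, so treat it as a sanity check and confirm against the table itself.

```c
#include <stdint.h>
#include <stdbool.h>

#define DMA_FIFO_BYTES 16u   /* each stream FIFO is 4 words deep */

/* fth uses the DMA_SxFCR.FTH encoding: 0 = 1/4, 1 = 1/2, 2 = 3/4, 3 = full */
uint32_t dma_fifo_threshold_bytes(uint32_t fth)
{
    return ((fth & 3u) + 1u) * (DMA_FIFO_BYTES / 4u);
}

/* beats: 4, 8 or 16 (burst length); beat_bytes: 1, 2 or 4 (memory data size).
   Hypothetical check: the burst fits the FIFO and divides the threshold evenly. */
bool dma_mburst_fits(uint32_t fth, uint32_t beats, uint32_t beat_bytes)
{
    uint32_t burst_bytes = beats * beat_bytes;
    return burst_bytes <= DMA_FIFO_BYTES
        && dma_fifo_threshold_bytes(fth) % burst_bytes == 0u;
}
```

For example, a 16-beat byte burst is accepted only with a full threshold, while an 8-beat byte burst is accepted at 1/2 and full but not at 1/4 or 3/4.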
03
Code
01
SPI1 + DMA2 init: master, 8-bit, CPOL=0 CPHA=0
spi_dma.c
C
#include "stm32f4xx.h"   /* CMSIS device header: register and bit definitions */

/* CS_Deassert() is assumed to be defined elsewhere in the project. */

void SPI1_DMA_Init(void)
{
    /* Clocks */
    RCC->AHB1ENR |= RCC_AHB1ENR_GPIOAEN | RCC_AHB1ENR_DMA2EN;
    RCC->APB2ENR |= RCC_APB2ENR_SPI1EN;

    /* PA5=SCK, PA6=MISO, PA7=MOSI → AF5.
       MODER/OSPEEDR use 2 bits per pin, so PA5..PA7 occupy bits 15:10. */
    GPIOA->MODER    = (GPIOA->MODER & ~(0x3FUL << 10)) | (0x2AUL << 10);     // Alternate function
    GPIOA->OSPEEDR |= (0x3FUL << 10);                                        // High speed
    GPIOA->AFR[0]   = (GPIOA->AFR[0] & ~(0xFFFUL << 20)) | (0x555UL << 20);  // AF5 for PA5/6/7

    /* SPI1: Master, 8-bit, fPCLK/4 (21 MHz), CPOL=0, CPHA=0, SSM */
    SPI1->CR1 = SPI_CR1_MSTR     // Master mode
              | SPI_CR1_BR_0     // Baud = fPCLK/4
              | SPI_CR1_SSM      // Software slave management
              | SPI_CR1_SSI;     // Internal slave select high
    SPI1->CR2 = SPI_CR2_TXDMAEN; // Enable TX DMA request

    /* DMA2 Stream3 Channel3: SPI1_TX (see RM Table 43) */
    DMA2_Stream3->CR = (3 << DMA_SxCR_CHSEL_Pos)  // Channel 3
                     | DMA_SxCR_DIR_0             // Mem → Peripheral
                     | DMA_SxCR_MINC              // Memory auto-increment
                     | DMA_SxCR_TCIE;             // Transfer-complete IRQ
    DMA2_Stream3->PAR = (uint32_t)&SPI1->DR;      // Peripheral = SPI DR

    NVIC_SetPriority(DMA2_Stream3_IRQn, 6);
    NVIC_EnableIRQ(DMA2_Stream3_IRQn);

    SPI1->CR1 |= SPI_CR1_SPE;    // Enable SPI
}

/* Kick off a DMA transfer. CS must be asserted by the caller. */
void SPI1_DMA_Send(const uint8_t *buf, uint16_t len)
{
    /* Stale event flags for stream 3 must be cleared before re-enabling it */
    DMA2->LIFCR = DMA_LIFCR_CTCIF3 | DMA_LIFCR_CHTIF3 | DMA_LIFCR_CTEIF3
                | DMA_LIFCR_CDMEIF3 | DMA_LIFCR_CFEIF3;
    DMA2_Stream3->NDTR = len;
    DMA2_Stream3->M0AR = (uint32_t)buf;
    DMA2_Stream3->CR  |= DMA_SxCR_EN;   // Start
}

void DMA2_Stream3_IRQHandler(void)
{
    if (DMA2->LISR & DMA_LISR_TCIF3) {
        DMA2->LIFCR = DMA_LIFCR_CTCIF3;   // Clear flag
        // The DMA is done, but the SPI may still be shifting. Wait for the TX
        // buffer to drain (TXE=1) and the last bit to leave (BSY=0), THEN deassert CS.
        while (!(SPI1->SR & SPI_SR_TXE)) { }
        while (SPI1->SR & SPI_SR_BSY) { }
        CS_Deassert();
    }
}
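The handler's wait on `SPI_SR_BSY` is an unbounded loop inside an interrupt, which hangs if the peripheral is ever misconfigured. One possible hardening is a bounded wait, sketched below. `spi_wait_idle` and the `SR_*` macros are names invented for this example; the function takes a pointer to the status register so it can be exercised off-target, and it assumes the register is declared as a 32-bit value.

```c
#include <stdint.h>
#include <stdbool.h>

#define SR_TXE (1u << 1)   /* SPI_SR: transmit buffer empty */
#define SR_BSY (1u << 7)   /* SPI_SR: bus busy */

/* Hypothetical helper: poll *sr until TXE=1 and BSY=0.
   Returns false if the condition is not met within max_polls reads. */
bool spi_wait_idle(const volatile uint32_t *sr, uint32_t max_polls)
{
    while (max_polls--) {
        uint32_t v = *sr;
        if ((v & SR_TXE) && !(v & SR_BSY)) {
            return true;    /* last bit has left the shift register */
        }
    }
    return false;           /* timed out: caller decides how to recover */
}
```

In the handler this would be called as something like `spi_wait_idle(&SPI1->SR, limit)` before `CS_Deassert()`, with `limit` chosen comfortably above the few polls that two byte times take at the configured clock.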
Always wait for BSY=0 before releasing CS
The DMA TC interrupt fires when the last byte has been written to the SPI data register, not when it has been shifted out on the wire. Wait for SPI_SR_BSY to clear before driving CS high, or the last byte will be cut short.
Continue the Series
Work through all 7 parts of Bare Metal STM32 to master low-level embedded development from the ground up.