Wednesday 5 August 2009

Cortex-M3: Fault exception on memcpy()

I've been doing some work with the STM32 platform, which uses a Cortex-M3 processor core. It's an impressive little chip, and I was amazed to see that the Cortex-M3 supports unaligned data accesses (something that was always a pain on the ARM7TDMI it is aimed at replacing).

So I was very surprised when I found that a memcpy() from a packet buffer to a structure raised a Fault exception whereas a naive structure copy operation worked just fine.

From the STM32 datasheet:
The Cortex-M3 processor supports unaligned access only for the following instructions:
● LDR, LDRT
● LDRH, LDRHT
● LDRSH, LDRSHT
● STR, STRT
● STRH, STRHT
All other load and store instructions generate a usage fault exception if they perform an unaligned access, and therefore their accesses must be address aligned.
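
The rule quoted above boils down to this: anything outside that list, such as the load/store multiple instructions, needs a word-aligned address. A minimal sketch of a check for that follows; is_word_aligned is a name I've made up for illustration, not something from the datasheet or any library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when p is a multiple of 4, i.e. the address
   is word aligned and safe for instructions outside the list above. */
static inline bool is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}
```

Something like this could guard a fast path, for example falling back to a slower copy when either is_word_aligned(dst) or is_word_aligned(src) is false.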

The problem was that GCC optimises certain instances of memcpy() and structure assignments into Load Multiple (LDMIA) and Store Multiple (STMIA) instructions, which are not in the list above and so fault on an unaligned address. If instead you write your own word-at-a-time memcpy() as an inline function, the compiler generates LDR and STR instructions, which execute nearly as fast but also work for unaligned reads or writes.

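#include <stddef.h>
#include <stdint.h>

/* ptr_t is not defined in this post; a byte pointer is assumed here
   so that ++ and += advance by bytes. */
typedef uint8_t *ptr_t;

/* Copies sz bytes using only byte, halfword and word accesses,
   which are intended to compile to LDRB/LDRH/LDR and STRB/STRH/STR. */
static inline void memcpy(ptr_t dst, ptr_t src, size_t sz)
{
    if (sz & 1)             /* odd length: copy a single byte first */
    {
        *(uint8_t*)dst = *(uint8_t*)src;
        src++;
        dst++;
        sz--;
    }
    if (sz & 2)             /* then one halfword if needed */
    {
        *(uint16_t*)dst = *(uint16_t*)src;
        src += 2;
        dst += 2;
        sz -= 2;
    }
    while (sz)              /* sz is now a multiple of 4 */
    {
        *(uint32_t*)dst = *(uint32_t*)src;
        src += 4;
        dst += 4;
        sz -= 4;
    }
}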
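
One caveat with the casts above: reading a uint32_t through a pointer that may not be 4-byte aligned is undefined as far as the C standard is concerned, so the compiler is in principle allowed to assume alignment. A GCC-specific way to state the intent explicitly is a packed wrapper struct. This is only a sketch: unaligned_u32, load_u32 and store_u32 are names invented for illustration, and I haven't checked the generated code on every GCC version, though for a Cortex-M3 target I'd expect plain LDR and STR.

```c
#include <stdint.h>

/* GCC extension: a packed struct has alignment 1, so the compiler may not
   assume the address is word aligned. may_alias lets it overlay any buffer. */
struct __attribute__((packed, may_alias)) unaligned_u32 {
    uint32_t v;
};

/* Hypothetical helper: read a 32-bit word from any byte address. */
static inline uint32_t load_u32(const void *p)
{
    return ((const struct unaligned_u32 *)p)->v;
}

/* Hypothetical helper: write a 32-bit word to any byte address. */
static inline void store_u32(void *p, uint32_t value)
{
    ((struct unaligned_u32 *)p)->v = value;
}
```

With these, the body of the word loop could be written as store_u32(dst, load_u32(src)) instead of the raw pointer casts.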

2 comments:

  1. Are you using the IAR - EWARM 5.1x tools? If so, there is a patch for the memcpy() function available as of March 2008.

  2. Nope - gcc. Thanks for the tip.
    I guess this is how IAR justify their astronomical prices ;-)
