So I was very surprised when I found that a memcpy() from a packet buffer to a structure raised a Fault exception whereas a naive structure copy operation worked just fine.
From the STM32 datasheet:
The Cortex-M3 processor supports unaligned access only for the following instructions:
● LDR, LDRT
● LDRH, LDRHT
● LDRSH, LDRSHT
● STR, STRT
● STRH, STRHT
All other load and store instructions generate a usage fault exception if they perform an unaligned access, and therefore their accesses must be address aligned.
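To make "address aligned" concrete for the word-sized case: an address is word aligned when it is a multiple of 4, so its two low bits are zero. Here is a minimal sketch of a check you could drop in front of a suspect copy while debugging. The names is_word_aligned and require_word_aligned are hypothetical, not part of any library, and the cast to uintptr_t assumes the usual flat address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when p sits on a 4-byte boundary.
   Assumes converting a pointer to uintptr_t yields its address. */
static bool is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}

/* Hypothetical guard: trips an assertion if either pointer is not word
   aligned, i.e. before a copy that may compile to LDM/STM. */
static void require_word_aligned(const void *dst, const void *src)
{
    assert(is_word_aligned(dst) && is_word_aligned(src));
}
```

A pointer into a packet buffer at an odd byte offset would fail this check, which is exactly the case that faults once the compiler picks a load or store multiple.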
The problem was that GCC optimises certain instances of memcpy() and structure assignment into Load Multiple (LDMIA) and Store Multiple (STMIA) instructions, which are not in the list above and so fault on an unaligned address. If you instead write your own word-at-a-time memcpy() as an inline function, the compiler generates LDR and STR instructions, which execute nearly as fast but also work for unaligned reads and writes.
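#include <stddef.h>
#include <stdint.h>

/* Copy sz bytes as one byte, then one halfword, then whole words.
   Relies on the Cortex-M3 allowing unaligned LDR/STR and LDRH/STRH;
   this is not portable to cores that fault on those accesses. */
static inline void memcpy_unaligned(void *dst_, const void *src_, size_t sz)
{
    uint8_t *dst = dst_;
    const uint8_t *src = src_;

    if (sz & 1)    /* odd length: copy a single byte first */
    {
        *dst = *src;
        src++;
        dst++;
        sz--;
    }
    if (sz & 2)    /* copy one halfword */
    {
        *(uint16_t *)dst = *(const uint16_t *)src;
        src += 2;
        dst += 2;
        sz -= 2;
    }
    while (sz)     /* sz is now a multiple of 4: copy whole words */
    {
        *(uint32_t *)dst = *(const uint32_t *)src;
        src += 4;
        dst += 4;
        sz -= 4;
    }
}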
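As a usage sketch, this is roughly how a header might be pulled out of a packet buffer at an arbitrary byte offset. The header_t layout and the read_header and copy_bytes names are hypothetical. So that the example runs on any host without relying on unaligned word accesses, copy_bytes is a byte-at-a-time stand-in; on the Cortex-M3 the word-at-a-time routine above would take its place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet header; the field names are illustrative only. */
typedef struct {
    uint32_t sequence;
    uint16_t length;
    uint16_t flags;
} header_t;

/* Byte-at-a-time stand-in for the copy routine: it never makes an access
   wider than one byte, so it assumes nothing about alignment. */
static void copy_bytes(void *dst, const void *src, size_t sz)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    while (sz--) {
        *d++ = *s++;
    }
}

/* Read a header that starts at any byte offset inside the packet.
   The caller must ensure offset + sizeof(header_t) fits in the buffer. */
static header_t read_header(const uint8_t *packet, size_t offset)
{
    header_t h;

    assert(packet != NULL);
    copy_bytes(&h, packet + offset, sizeof h);
    return h;
}
```

Copying into a local header_t first means every later field access goes through a properly aligned object, so only the copy itself has to cope with the odd offset.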
Are you using the IAR - EWARM 5.1x tools? If so, there is a patch for the memcpy() function available as of March 2008.
Nope - gcc. Thanks for the tip.
I guess this is how IAR justify their astronomical prices ;-)