Things to Think About When Doing Low-Level Embedded Systems Development.

Aug 19, 2021

What are the foundations of designing and implementing low-level software that interacts with real-world systems?

(By low level I mean software running on bare metal or a small-footprint RTOS on a small processor core, not embedded Linux on a Cortex-A class application processor.)

An introductory embedded systems book will often go through the basics: startup code, installing interrupt handlers, and writing drivers. I also wrote a series of posts on the same topics for RISC-V. I find that material interesting and have had to do it professionally.

However, for someone using an off-the-shelf MCU, all of that is provided by the silicon vendor. Do you need to know it well? I would say not.

What you should ultimately be building when working through such introductory material is a mental model of the problem space. While you may never need to implement the startup code of an MCU, being exposed to it helps you build that model and answer questions such as “Who set up my stack? How much stack do I have?”

To aid in that, I’d like to present a list of loosely related topics that can be used to build a mental model of the firmware in low-level embedded systems.

Building a Mental Model

The list is presented as questions I ask myself when thinking about a firmware system. The questions are without answers, as the purpose is to query and elaborate a mental model. I’m assuming the list is incomplete.

These are topics that I think any low-level firmware developer will eventually get to know, but that a high-level application software developer will almost never need to deal with. I’ve often made design decisions by asking these questions, been stung by not asking them earlier, or seen smart people stung by not thinking about them.

The ISA and ABI:

What registers does my ISA have, and how are they used by the compiler?
What is the calling convention? How much stack space is consumed by an ISR or a function call?
Can I at least quickly guess what any assembler mnemonic might mean?

The Stack:

What is the stack? Where is it? Who put it there? What register points to it?
How much stack memory do I have? Is it shared between interrupt context and task context?
What am I doing to avoid stack overflow? What am I storing on the stack?
What happens when I overflow the stack, how does my software recover?
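
One way to put a number on "how much stack do I have left?" is stack painting: fill the stack region with a known pattern at startup, then later count how many words are still untouched. The following is a minimal sketch, not a vendor or RTOS API. It assumes a descending stack, and the names `stack_paint`, `stack_unused_words` and `STACK_PAINT` are made up for illustration. The stack here is passed in as an ordinary array so the idea can be tried on a host machine.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_PAINT 0xDEADBEEFu /* hypothetical fill pattern */

/* Fill the whole stack region with the pattern (done once, before use). */
static void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_PAINT;
    }
}

/* Count untouched words from the lowest address upward.
 * Assumes a descending stack: usage starts at base[words - 1] and
 * grows toward base[0], so the paint survives at the low end. */
static size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t unused = 0;
    while (unused < words && base[unused] == STACK_PAINT) {
        unused++;
    }
    return unused;
}
```

The count is a high-water mark, not a proof: it only reflects the deepest path actually exercised so far, and a stack frame that happens to contain the paint value can be miscounted.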

Globals/Heap:

Where/how are globals initialized? What about function local statics?
Do I need to have a heap? What might need to be allocated/de-allocated at runtime?
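
For the "where/how are globals initialized?" question, the usual answer on a bare-metal target is that startup code copies the initial values of initialized globals from flash to RAM and zeroes the rest before main() runs. The sketch below shows only those two loops. On a real target the addresses would come from linker-script symbols; here they are plain parameters, and the names `init_data` and `init_bss` are my own, not from any vendor startup file.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the initial values of initialized globals from their load
 * address (flash) to their run address (RAM). */
static void init_data(uint32_t *ram_start, const uint32_t *flash_start, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        ram_start[i] = flash_start[i];
    }
}

/* Zero the region that holds zero-initialized globals and statics. */
static void init_bss(uint32_t *bss_start, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        bss_start[i] = 0u;
    }
}
```

If a global looks wrong at the top of main(), these two steps (and the linker script that feeds them) are the first place I would look.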

Memory:

Where is the data stored? What is tightly coupled memory? What is cache? What’s the difference and what does my platform use?
What is memory alignment? What is the width of my platform’s bus?
Where is the code stored? Is random access to code penalty-free?

(The MCU user manual should describe the memory architecture of the platform you are using.)
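
To make the alignment question concrete, here is a small sketch with two things in it: member order changes how much padding the compiler inserts into a struct, and a multi-byte value at an arbitrary byte address can be read one byte at a time instead of through a possibly misaligned pointer. The struct names and `load_le32` are hypothetical, and the exact sizes depend on your ABI, so the code only asserts the relationship between the two layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same members, different order. On typical 32-bit ABIs the first
 * layout needs padding before and after 'value'. */
struct padded    { uint8_t tag; uint32_t value; uint8_t flag; };
struct reordered { uint32_t value; uint8_t tag; uint8_t flag; };

static_assert(sizeof(struct reordered) <= sizeof(struct padded),
              "placing the widest member first should not cost space");

/* Read a 32-bit little-endian value from any byte address without
 * performing a (possibly misaligned) 32-bit access. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Whether a misaligned 32-bit access faults, is silently split into several bus accesses, or just runs slower is platform-specific, which is why the question is worth asking.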

Interrupts/Critical Sections (The hardware):

How long does it take to enter/exit an ISR? What is tail chaining and does my platform benefit from it?
How do my critical sections work? Do they block all interrupts, a priority level, or a selected mask of interrupts?
How long does a critical section take to enter and exit? What is my longest timed critical section?

(Interrupts are what make real-time systems real-time, and how they operate is a major part of an MCU’s architecture.)
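
A common shape for a "block all interrupts" critical section is save-and-restore: remember whether interrupts were enabled, disable them, and on exit restore what was saved instead of unconditionally enabling. The sketch below simulates that with an ordinary variable standing in for the CPU's global interrupt-enable bit. On real hardware that bit lives in a system register and is accessed with special instructions, and the names `critical_enter` and `critical_exit` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the CPU's global interrupt-enable bit. */
static volatile bool irq_enabled = true;

typedef bool irq_state_t;

/* Enter: remember whether interrupts were enabled, then disable them. */
static irq_state_t critical_enter(void)
{
    irq_state_t previous = irq_enabled;
    irq_enabled = false;
    return previous;
}

/* Exit: restore the saved state rather than unconditionally enabling,
 * so a nested critical section does not re-enable interrupts early. */
static void critical_exit(irq_state_t previous)
{
    assert(!irq_enabled); /* we should still be inside a critical section */
    irq_enabled = previous;
}
```

Usage is `irq_state_t s = critical_enter(); ... critical_exit(s);`. The time between those two calls is exactly the latency you are adding to every interrupt, which is why the longest critical section matters.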

Interrupts/Critical Sections (The software):

How am I prioritizing my interrupts? What is the most time-critical ISR?
Are there interrupts that are too time-critical to be blocked by a critical section?
What is interrupt nesting? What is the maximum nesting level of my interrupts? Do I have enough stack if they all nest?
Why should I not do too much work in a high-priority interrupt handler or critical section?
How do I delegate work from a high-priority interrupt to a lower-priority context?

(At the lowest level the software architecture is often driven by interrupt handling and making sure it happens on time and doesn’t break anything. For software, critical sections are just as important to understand as the interrupts, as those allow your software to survive interruption, but in the process can kill your real timeliness!)
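
One common way to delegate work out of a high-priority interrupt is a single-producer, single-consumer ring buffer: the ISR only stores the data and returns, and a lower-priority context drains it later. This is a minimal sketch under some loud assumptions: a single core, naturally atomic aligned 32-bit accesses, exactly one ISR pushing and one context popping. A multi-core or weakly ordered system would need memory barriers on top. All names and the queue size are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_SIZE 8u /* hypothetical; must be a power of two */
static_assert((QUEUE_SIZE & (QUEUE_SIZE - 1u)) == 0u,
              "QUEUE_SIZE must be a power of two");

static volatile uint8_t  queue_data[QUEUE_SIZE];
static volatile uint32_t queue_head; /* written only by the ISR */
static volatile uint32_t queue_tail; /* written only by the consumer */

/* Called from the ISR: store the byte and return quickly. */
static bool queue_push_from_isr(uint8_t byte)
{
    if ((queue_head - queue_tail) == QUEUE_SIZE) {
        return false; /* full: the caller can count this as an overrun */
    }
    queue_data[queue_head % QUEUE_SIZE] = byte;
    queue_head = queue_head + 1u; /* publish only after the data is stored */
    return true;
}

/* Called from the main loop or a low-priority task. */
static bool queue_pop(uint8_t *byte)
{
    if (queue_head == queue_tail) {
        return false; /* empty */
    }
    *byte = queue_data[queue_tail % QUEUE_SIZE];
    queue_tail = queue_tail + 1u;
    return true;
}
```

Because each index has exactly one writer, no critical section is needed around the queue itself, which keeps the time-critical side short.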

Registers:

How do MMIO registers and system registers work? What’s the difference?
What are some gotchas with MMIOs, due to the fact they aren’t really memory?
Why should I use volatile to access MMIO?
Why should I be careful of the size of the memory bus used to access MMIO?
Why does reading a 64-bit timer register over a 32-bit bus occasionally have unexpected results?
Why might writing a set of 32-bit registers via a uint8_t* pointer lead to strange results?
What registers can I read-modify-write safely? Why do some registers have redundant views? (e.g. set register, clear register)
What are the access modes: read-only, write-only, write-once, trigger-on-write, etc.?
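
The 64-bit timer question above has a well-known answer: the two 32-bit halves are read at different times, so the low word can roll over in between. A common workaround is to read high, low, then high again, and retry if the high word changed. The sketch below simulates the timer in software so it can run on a host. `timer_read_lo` and `timer_read_hi` stand in for volatile reads of two MMIO words, and every name here is hypothetical. Some timers latch the high word in hardware when the low word is read, in which case none of this is needed.

```c
#include <stdint.h>

/* Simulated 64-bit free-running counter. Each register access advances
 * it, mimicking time passing between bus reads. */
static uint64_t sim_counter;
static uint64_t sim_step = 1u; /* ticks that elapse per register access */

static uint32_t timer_read_lo(void)
{
    sim_counter += sim_step;
    return (uint32_t)sim_counter;
}

static uint32_t timer_read_hi(void)
{
    sim_counter += sim_step;
    return (uint32_t)(sim_counter >> 32);
}

/* Read high, low, high again. If the high word changed, the low word
 * rolled over between the accesses, so the pair is inconsistent: retry. */
static uint64_t timer_read64(void)
{
    uint32_t hi, lo, hi_again;
    do {
        hi = timer_read_hi();
        lo = timer_read_lo();
        hi_again = timer_read_hi();
    } while (hi != hi_again);
    return ((uint64_t)hi << 32) | lo;
}
```

A naive high-then-low read taken just before a rollover would combine the old high word with the new low word and report a time roughly 2^32 ticks in the past.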

(There is often no need to write a driver from scratch, but you will often need to understand how one works and debug it — understanding MMIO and interrupts is critical.)
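
To illustrate why read-modify-write on a shared register is risky, and why set/clear views exist, here is a host-side simulation. The "interrupt" is written inline at the point where it would preempt, and `gpio_write_set` models what the peripheral does internally when a SET register is written. The register, the bit assignments and all function names are hypothetical.

```c
#include <stdint.h>

#define MAIN_PIN (1u << 0) /* hypothetical bit owned by the main loop */
#define ISR_PIN  (1u << 5) /* hypothetical bit owned by an ISR */

static volatile uint32_t gpio_out; /* simulated output register */

/* Read-modify-write with an interrupt landing between read and write. */
static uint32_t rmw_interrupted(void)
{
    uint32_t copy = gpio_out;      /* main: read */
    gpio_out = gpio_out | ISR_PIN; /* ISR preempts here and sets its bit */
    copy |= MAIN_PIN;              /* main: modify a now-stale copy */
    gpio_out = copy;               /* main: write back, erasing ISR_PIN */
    return gpio_out;
}

/* Model of a hardware SET view: one bus write changes only the masked
 * bits, so software never holds a copy that can go stale. */
static void gpio_write_set(uint32_t mask)
{
    gpio_out = gpio_out | mask; /* happens inside the peripheral on real hardware */
}

static uint32_t set_view_interrupted(void)
{
    gpio_write_set(ISR_PIN);  /* ISR: a single write */
    gpio_write_set(MAIN_PIN); /* main: a single write */
    return gpio_out;
}
```

Starting from zero, the first function ends with only MAIN_PIN set (the ISR's update is lost), while the second ends with both bits set regardless of where the interrupt lands.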

Execution privilege:

What are the different execution levels/privilege levels of my ISA? What permissions does each level have?
e.g. Cortex-M (ARMv6-M): Supervisor and task with just a different stack.
e.g. Cortex-M (ARMv7-M): Supervisor and task with different privileges.
e.g. Cortex-R (ARMv8-R): Hypervisor, kernel, and task.
How do you transition between them? What level does the startup code exit to?
Does my RTOS use them? Do I need to call into a higher privilege to access some features?
Does my RTOS have different function calls depending on my context?

What About RTOSs? What about XYZ?

I specifically haven't included things like RTOS task creation, synchronization, etc. These are really just specialized application libraries — and in that sense similar to general application programming. For the same reason, I haven’t touched on all sorts of application support libraries (IoT libraries, DSP libraries, control libraries, etc).

I also haven’t included topics regarding external IO and communications interfaces (UART/I2C/SPI/USB etc). Those are important, but outside the scope of this list.

Conclusion

As I write topics for this blog, the above list includes many things I’d like to cover, and some are already covered by articles I’ve written. I find the intersection of hardware and software interesting, and that is where many “gotchas” for embedded software development live.

To best approach any domain, a good mental model of the system is needed. It allows you to make design and implementation judgments from first principles and to mentally simulate the expected system behavior.

Application-Specific Questions

These are some topics that are not general enough to make it to the list above. I’ve placed them here as a reminder to myself that I can use them to refresh my mental model.

Low power/clock gating.

There are applications that require very low power consumption. It’s the hardware that consumes the power and needs to be turned off, but often it’s the firmware that decides what hardware needs to be active at any given time.

What is a clock, and why does the CPU need one? What does it mean to “gate” a clock?
What is a WFI/WFE instruction, and why does it exist?
What power mode do I go into during WFI? What platform-dependent register controls the clocks? What clocks keep running?
Do I disable interrupts during clock gating control and WFI? What do you mean WFI is woken while interrupts are disabled? How can that even work?
If my CPU clock is disabled, do I wake up? (e.g. is there a special GPIO or comms port that can do asynchronous wake-up?)
How long does it take to exit a low power mode? Does an oscillator need to power up?
What blocks are powered off at any time? What happens if I write to a powered-off MMIO?
What happens to my timers (RTOS tick, real-time counter, etc) when I turn off clocks?
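
The "WFI with interrupts disabled" question usually comes from this idle-loop pattern. If you check for pending work with interrupts enabled and then sleep, an interrupt can arrive between the check and the WFI, and the CPU goes to sleep with work waiting. Masking interrupts around the check closes that window. On cores such as Cortex-M and RISC-V, a pending interrupt still wakes WFI while masked, and the handler runs once interrupts are re-enabled. Check your core's manual before relying on that. The sketch below uses empty stubs for the port-specific operations so it runs on a host, and all the names are hypothetical.

```c
#include <stdbool.h>

static volatile bool work_pending; /* set by an ISR when there is work */
static unsigned sleep_count;       /* instrumentation for this sketch only */

/* Hypothetical port hooks. On a real target these would be the
 * interrupt mask/unmask and wait-for-interrupt instructions. */
static void irq_disable(void) { }
static void irq_enable(void)  { }
static void cpu_wait_for_interrupt(void) { sleep_count++; }

/* One pass of the idle loop. */
static void idle_once(void)
{
    irq_disable();                /* no ISR can run between check and sleep */
    if (!work_pending) {
        cpu_wait_for_interrupt(); /* wakes when an interrupt becomes pending */
    }
    irq_enable();                 /* the pending handler runs here */
}
```

Any clock-gating register writes for the chosen power mode would typically sit inside the same masked region, just before the wait.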

Analog Interfaces.

Analog hardware is a separate domain to digital hardware, with different designers who have completely different ideas on how things work.

What is analog hardware? Why is it different from digital hardware?
Flash memory is an analog circuit!? Do I need to enable some high voltage line and wait a bit before I write to flash?
Pin IO is an analog circuit!? What is a pull-up, pull-down, drive strength, floating pins, (pseudo) open-drain/collector?
Does everything in analog-land get controlled by digital hardware or my firmware?
How does everything in analog-land synchronize to the same clock as digital-land?
Does the block have another power source? Reference voltage/current? How long do I wait after a block has been enabled until it can be used?
What is a setup time? What is a hold time?
What is the default state of a comparator? Does initialization trigger an edge and interrupt that I’m not expecting? What other corner cases can occur on the boundary?