MCU Bootloaders: The basics
I did my master’s thesis and end-of-study internship at Witekio France, with the objective of figuring out a reliable way to update PCBs which contain more than one microcontroller, by making my own bootloader.
Don’t worry, I will explain what all of this means soon.
The goal of this project was for me to learn how one writes such a bootloader, and I also picked up valuable knowledge about how flash memory works, how chips can talk with one another and embedded programming in general. I also had to come up with some solutions to the problems that popped up, which I find quite interesting!
I’m thus writing this series of articles to explain what I learned and how I came up with those solutions, rather than letting this knowledge gather dust in my university’s archives.
This first article aims to introduce the concept of a bootloader for microcontrollers, while remaining accessible to anyone who has a good enough understanding of what’s inside a computer.
So no need to know how to program :)
The next articles will go into further details about how to make a simple bootloader, the solutions I came up with for my own bootloader and the more nerdy stuff.
What is a microcontroller?
It’s a tiny computer!
Well… we kinda have those already, they’re called microprocessors. You know, the thing that’s in your phone? So… what’s so special about microcontrollers?
Well, they’re very cheap and very power efficient. Think of devices like earbuds (you want them to hold their charge for a long time) or wireless garage keys (which customers expect to only cost a few bucks).
These benefits come with severe limitations though:
- Very little permanent memory (also known as the flash), usually 1MB or less
- Very little working memory (the RAM), anything above 256KB is luxury
- Slow clock speeds, usually measured in megahertz (rather than gigahertz for desktop CPUs)
- Limited I/O: you only have so many pins on the chip
- No virtual memory (more on that later)
If you can manage these limitations, then the added cost of development is usually outweighed by the savings for the thousands of devices produced at a lower cost.
Hold on, “virtual memory”? What’s that?
Virtual memory
Programs running on early computers had access to all the RAM and all the storage, which was fine, since usually only one program was running. If you wanted to run two programs, though, you needed to make sure ahead of time that they wouldn’t step on each other’s toes, otherwise bad things would happen:
Good luck figuring out why your linked lists are suddenly out of whack.
Virtual memory solves this problem once and for all: instead of letting a program access all of the memory available, the computer lies to it and pretends that it has access to that memory, while actually redirecting any read or write request to another location in memory.
This is made possible by a piece of hardware called the Memory Management Unit (MMU), which efficiently does this redirection and can be reprogrammed at runtime. Now, if you want to run multiple programs, you only need to make sure that their read/write requests are redirected to different areas of memory.
Laurie Wired has an excellent video on this subject, if you want to learn more about the history of virtual memory and how an MMU works.
An example program’s memory accesses with and without an MMU.
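To make the redirection a bit more concrete, here is a toy model of it in Rust. Everything in it is an assumption made for illustration: the names `ToyMmu`, `PAGE_SIZE` and `translate` are made up, the page size is arbitrary, and a real MMU does this in hardware rather than in code.

```rust
/// Size of one page in bytes (arbitrary value chosen for this sketch).
const PAGE_SIZE: usize = 256;

/// A toy MMU holding the mapping of a single program.
/// Entry `i` is the physical page backing virtual page `i`,
/// or `None` if the program has no access to that page.
struct ToyMmu {
    page_table: Vec<Option<usize>>,
}

impl ToyMmu {
    /// Redirects a virtual address to a physical address.
    /// Returns `None` when the address is not mapped for this program.
    fn translate(&self, virtual_addr: usize) -> Option<usize> {
        let page = virtual_addr / PAGE_SIZE;
        let offset = virtual_addr % PAGE_SIZE;
        // Look up the virtual page; an out-of-range or unmapped page yields `None`.
        let physical_page = self.page_table.get(page).copied().flatten()?;
        Some(physical_page * PAGE_SIZE + offset)
    }
}
```

If two programs each get their own `ToyMmu` with page tables that point to different physical pages, they can both use the virtual address `0x10` and still end up in different spots of the real memory.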
Didn’t you say we don’t have virtual memory?
Yes, but this is rarely an issue, since you’re usually the one writing all of the code that will run on your microcontroller. However, in our case, the lack of virtual memory does become problematic:
Among the memory that a running program might request, there’s the program’s code. This means that if it’s not placed in the right spot, it will most likely break (there’s a technique called relocatable code, but I won’t go into details here).
If we wanted our application to update itself, it would thus need to somehow write the new version on top of the running, previous version.
Uh oh!
Rescued by the bootloader
Thankfully for us, clever people figured out that one could instead download the new version of the program, write it somewhere else on the flash, and then let another piece of code, called the bootloader, take care of moving it to its final destination.
A typical bootloader would load the program in RAM and then boot into it (hence the name). On a microcontroller, however, the program can directly be executed from the flash, so no need to load it. (Also the name “bootmover” isn’t as catchy.)
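The "move it to its final destination" step can be sketched in a few lines of Rust. This is only a simplified model under loud assumptions: the flash is represented by a plain byte slice split into two equally sized slots, and `SLOT_SIZE` and `apply_pending_update` are names invented for this example. Real flash has to be erased before it can be written, and this naive copy does not survive a power cut halfway through.

```rust
/// Size of one slot in bytes (tiny, arbitrary value for this sketch).
/// Assumed layout: [ primary slot | staging slot ].
const SLOT_SIZE: usize = 8;

/// Copies the staging slot over the primary slot if an update is pending.
/// Returns `true` when the primary slot was overwritten.
fn apply_pending_update(flash: &mut [u8], update_pending: bool) -> bool {
    // Nothing to do without a pending update, or if the layout is too small.
    if !update_pending || flash.len() < 2 * SLOT_SIZE {
        return false;
    }
    // Split the flash so both slots can be borrowed at the same time.
    let (primary, rest) = flash.split_at_mut(SLOT_SIZE);
    let staging = &rest[..SLOT_SIZE];
    // The application wrote the new version to `staging`;
    // the bootloader moves it to where the program is expected to live.
    primary.copy_from_slice(staging);
    true
}
```

After this step, the bootloader would jump to the start of the primary slot, which now contains the new version.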
Multiple bootloaders for microcontrollers already exist out there, each with a different set of features for different use cases:
- One might want to focus on reliability, by ensuring that any update either completes or can be recovered or rolled back, or that there is always a recovery executable to run if all else fails.
- One might want to focus on usability, by having verbose logs and several ways to talk with the bootloader.
- One might want to focus on efficiency, by splitting the application into small pieces that can each be updated independently.
None of the options I found really addressed one use case: what if you have two microcontrollers?
Why would you even want to have two microcontrollers?
Because of the limitations of microcontrollers, it can sometimes be tricky to fit everything onto one microcontroller, while it remains more economical not to switch outright to a microprocessor. A typical scenario is the following:
- One microcontroller specializes in reading sensors and displaying information to the user.
- Another specializes in communicating with the external world over WiFi or Bluetooth.
In those cases, the existing solutions for updating microcontrollers don’t work cleanly. The bootloaders I could find either only support a single device, or you have to hack stuff together to get it to update both, sacrificing reliability and flexibility.
Making my own bootloader
The goal for my bootloader was to specialize in updating multiple microcontrollers at once, without sacrificing reliability or flexibility. I called it Polyboot, and the features planned for it were the following:
- It should update the program of each microcontroller.
- It should undo an update if the new program is flagged as bad (this is also known as an A/B update).
- It should resist being turned off mid-update (such a device is likely to be on a battery, which may run out of charge).
- It should have some configuration file (to describe each microcontroller and how they can talk together).
- It should strive to be compatible with tools built for MCUBoot (the most popular bootloader for microcontrollers).
- It should have command-line tools to ease the developer experience.
- It should verify that the new version of each program is cryptographically genuine.
Of these features, I managed to implement all but the last one. It would be quite easy to add cryptographic utilities, but I instead chose to spend the last few days of my internship on optimizing, documenting and cleaning up my code.
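As an illustration of the "undo an update" feature, here is a minimal sketch of how a generic A/B scheme might pick which program to boot. It is not Polyboot's actual logic: the types `Slot` and `BootState` and the function `select_slot` are hypothetical, and the sketch assumes that each version lives in its own slot and that something (the program itself or a watchdog) sets the `flagged_bad` marker.

```rust
/// The two places where a program version can live (hypothetical layout).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Slot {
    A,
    B,
}

impl Slot {
    /// Returns the slot holding the other version.
    fn other(self) -> Slot {
        match self {
            Slot::A => Slot::B,
            Slot::B => Slot::A,
        }
    }
}

/// What the bootloader remembers between two boots (assumed fields).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BootState {
    /// Slot containing the version that should be booted.
    active: Slot,
    /// Set when the version in `active` has been flagged as bad.
    flagged_bad: bool,
}

/// Decides which slot to boot, undoing the update if it was flagged as bad.
fn select_slot(state: &mut BootState) -> Slot {
    if state.flagged_bad {
        // Roll back: the previous version is still intact in the other slot.
        state.active = state.active.other();
        state.flagged_bad = false;
    }
    state.active
}
```

With this model, an update amounts to writing the new version into the inactive slot and switching `active`; if the new version is later flagged as bad, the next boot switches back to the previous one.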
One feature that wasn’t initially planned is that with Polyboot, one can choose to dedicate the entirety of a microcontroller’s internal storage to the current version, instead of reserving half of it for the new version. This naturally became possible as I made my algorithms able to work with multiple microcontrollers, and it means that one could save costs by buying a less expensive microcontroller.
Polyboot is implemented in Rust, because the support for embedded devices has gotten quite good, it doesn’t need to interact with the program much and it allowed me to focus on other things than memory safety.
I also quickly realized that unit testing my Rust code would be a lot easier than testing C code.
So how does it work?
There’s still a lot to cover before I can get to that! Things would become a bit too technical to belong in this introduction, though, so I chose to make the cut here.
Head over to the next article, Resisting shutdowns, to have a look at how bootloaders can update an application and withstand sudden shutdowns!