Boot time: choose your kernel loading address carefully

When the compressed and uncompressed kernel images overlap

At least on ARM32, there seems to be many working addresses where the compressed kernel can be loaded in RAM. For example, one can load the compressed kernel at offset 0x1000000 (16 MB) from the start of RAM, and the Device Tree Blog (DTB) at offset 0x2000000 (32 MB). Whatever this loading address, the kernel is then decompressed at offset 0x8000 from the start of RAM, as explained this the famous How the ARM32 Linux kernel decompresses article from Linus Walleij.

There is a potential issue with the loading address of the compressed kernel, as explained in the article too. If the compressed kernel is loaded too close to the beginning of RAM, where the kernel must be decompressed, there will be an overlap between the two. The decompressed kernel will overwrite the compressed one, potentially breaking the decompression process.

Overlapping compressed and decompressed kernel

As you see in the above diagram, when this happens, the bootstrap code in the compressed kernel will first copy the compressed image to a location that’s far enough to guarantee that the decompressed kernel won’t overlap it. However, this extra step in the boot process has a cost.

Measuring boot time impact

In the context of updating our materials for our upcoming Embedded Linux Boot Time Optimization course in June, we measured this additional time on the STM32MP157A-DK1 Discovery Kit from STMicroelectronics, with a dual-core ARM Cortex-A7 CPU running at 650 MHz.

Initially, in our Embedded Linux System Development course, we were booting the DK1 board as follows:

ext4load mmc 0:4 0xc0000000 zImage; ext4load mmc 0:4 0xc4000000 dtb; bootz 0xc0000000 - 0xc4000000

0xc0000000 is exactly the beginning of RAM! We are therefore in the overlap situation.

We used grabserial from Tim Bird to measure the time between Starting kernel in U-Boot and when the compressed kernel starts executing (Booting Linux on physical CPU 0x0):

...
[4.451996 0.000124] Starting kernel ...
[0.001838 0.001838] 
[2.439980 2.438142] [    0.000000] Booting Linux on physical CPU 0x0
...

On a series of 5 identical tests, we obtained an average time of 2,440 ms, with a standard deviation of 0.4 ms.

Then, we measured the optimum case, in which the compressed kernel is loaded far enough from the beginning of RAM so that no overlap is possible:

No overlap between compressed and decompressed kernel

Here we chose to load the kernel at 0xc2000000:

ext4load mmc 0:4 0xc2000000 zImage; ext4load mmc 0:4 0xc4000000 dtb; bootz 0xc2000000 - 0xc4000000

On a series of 5 identical tests, we obtained an average time of 2,333 ms, with a standard deviation of 0.7 ms.

The new average is 107 ms smaller, which you are likely to consider as a worthy reduction, if you have experience with boot time reduction projects.

What to remember

In your embedded projects, if you are using a compressed kernel, make sure it is loaded far enough from the beginning of RAM, leaving enough space for the decompressed kernel to fit in between. Otherwise, your system will still be able to boot, but depending on the speed of your CPU and storage, it will be slower, from a few tens to a few hundreds of milliseconds.

We checked the How to optimize the boot time page on the STM32 wiki, and it recommends optimum loading addresses: 0xc2000000 for the kernel and 0xc4000000 for the device tree. This way, the upper limit for the decompressed kernel is 32 MB, which is more than enough.

If you are directly using an uncompressed kernel, which is more rare, you should also make sure that it is loaded at an optimum location, so that there is no need to move it before starting it.

Starting Linux directly from AT91bootstrap3

Here is an update for our previous article on booting linux directly from AT91bootstrap. On newer ATMEL platforms, you will have to use AT91bootstrap 3. It now has a convenient way to be configured to boot directly to Linux.

You can check it out from github:

git clone git://github.com/linux4sam/at91bootstrap.git

That version of AT91bootstrap is using the same configuration mechanism as the Linux kernel. You will find default configurations, named in the form:
<board_name><storage>_<boot_strategy>_defconfig

  • board_name can be: at91sam9260ek, at91sam9261ek, at91sam9263ek, at91sam9g10ek, at91sam9g20ek, at91sam9m10g45ek, at91sam9n12ek, at91sam9rlek, at91sam9x5ek, at91sam9xeek or at91sama5d3xek
  • storage can be:
    • df for DataFlash
    • nf for NAND flash
    • sd for SD card
  • our main interest will be in boot_strategy which can be:
    • uboot: start u-boot or any other bootloader
    • linux: boot Linux directly, passing a kernel command line
    • linux_dt: boot Linux directly, using a Device Tree
    • android: boot Linux directly, in an Android configuration

Let’s take for example the latest evaluation boards from ATMEL, the SAMA5D3x-EK. If you are booting from NAND flash:

make at91sama5d3xeknf_linux_dt_defconfig
make

You’ll end up with a file named at91sama5d3xek-nandflashboot-linux-dt-3.5.4.bin in the binaries/ folder. This is your first stage bootloader. It has the same storage layout as used in the u-boot strategy so you can flash it and it will work.

As a last note, I’ll had that less is not always faster. On our benchmarks, booting the SAMA5D31-EK using AT91bootstrap, then Barebox was faster than just using AT91bootstrap. The main reason is that barebox is actually enabling the caches and decompresses the kernel(see below, the kernel is also enaling the caches before decompressing itself) before booting.

How to boot an uncompressed Linux kernel on ARM

This is a quick post to share my experience booting uncompressed Linux kernel images, during the benchmarks of kernel compression options, and no compression at all was one of these options.

It is sometimes useful to boot a kernel image with no compression. Though the kernel image is bigger, and takes more time to copy from storage to RAM, the kernel image no longer has to be decompressed to RAM. This is useful for systems with a very slow CPU, or very little RAM to store both the compressed and uncompressed images during the boot phase. The typical case is booting CPUs emulated by FPGA, during processor development, before the final silicon is out. For example, I saw a Cortex A15 chip boot at 11 MHz during Linaro Connect Q2.11 in Budapest. At this clock frequency, booting a kernel image with no compression saves several minutes of boot time, reducing development and test time. Note that with such hardware emulators, copying the kernel image to RAM is cheap, as it is done by the emulator from a file given by the user, before starting to emulate the system.

Building a kernel image with no compression on ARM is easy, but only once you know where the uncompressed image is and what to do! For people who have never done that before, I’m sharing quick instructions here.

To generate your uncompressed kernel image, all you have to do is run the usual make command. The file that you need is arch/arm/boot/Image.

Depending on the bootloader that you use, this could be sufficient. However, if you use U-boot, you still need to put this image in a uImage container, to let U-boot know about details such as how big the image is, what its entry point is, whether it is compressed or not… The problem is you can’t run make uImage any more to produce this container. That’s because Linux on ARM has no configuration option to keep the kernel uncompressed, and the uImage file would contain a compressed kernel.

Therefore, you have to create the uImage by invoking the mkimage command manually. To do this without having to guess the right mkimage parameters, I recommend to run make V=1 uImage once:

$ make V=1 uImage
...
  Kernel: arch/arm/boot/zImage is ready
  /bin/bash /home/mike/linux/scripts/mkuboot.sh -A arm -O linux -T kernel -C none -a 0x80008000 -e 0x80008000 -n 'Linux-3.3.0-rc6-00164-g4f262ac' -d arch/arm/boot/zImage arch/arm/boot/uImage
Image Name:   Linux-3.3.0-rc6-00164-g4f262ac
Created:      Thu Mar  8 13:54:00 2012
Image Type:   ARM Linux Kernel Image (uncompressed)
Data Size:    3351272 Bytes = 3272.73 kB = 3.20 MB
Load Address: 80008000
Entry Point:  80008000
  Image arch/arm/boot/uImage is ready

Don’t be surprised if the above message says that the kernel is uncompressed (corresponding to -C none). If we told U-boot that the image is already compressed, it would take care of uncompressing it to RAM before starting the kernel image.

Now, you know what mkimage command you need to run. Just invoke this command on the Image file instead of zImage (you can directly replace mkuboot.sh by mkimage):

$ mkimage -A arm -O linux -T kernel -C none -a 0x80008000 -e 0x80008000 -n 'Linux-3.3.0-rc6-00164-g4f262ac' -d arch/arm/boot/Image arch/arm/boot/uImage
Image Name:   Linux-3.3.0-rc6-00164-g4f262ac
Created:      Thu Mar  8 14:02:27 2012
Image Type:   ARM Linux Kernel Image (uncompressed)
Data Size:    6958068 Bytes = 6794.99 kB = 6.64 MB
Load Address: 80008000
Entry Point:  80008000

Now, you can use your uImage file as usual.

CALAO boards supported in mainline U-Boot

I’m happy to announce that a couple days ago, support for the CALAO SBC35-A9G20, TNY-A9260 and TNY-A9G20 boards made its way into the U-Boot git repository. Sadly, it’s not possible to boot from an MMC/SD card with the SBC35 yet, but it’s something I’m currently working on.

Support for all these cards will be available in the next U-Boot release, due in November.