For all embedded Linux developers, cross-compilation toolchains are part of the basic tool set, as they allow to build code for a specific CPU architecture and debug it. Until a few years ago, CodeSourcery was providing a lot of high quality pre-compiled toolchains for a wide range of architectures, but has progressively stopped doing so. Linaro provides some freely available toolchains, but only targetting ARM and AArch64. kernel.org has a set of pre-built toolchains for a wider range of architectures, but they are bare metal toolchains (cannot build Linux userspace programs) and updated infrequently.
This web site provides a large number of cross-compilation toolchains, available for a wide range of architectures, in multiple variants. The toolchains are based on the classical combination of gcc, binutils and gdb, plus a C library. We currently provide a total of 138 toolchains, covering many combinations of:
Architectures: AArch64 (little and big endian), ARC, ARM (little and big endian, ARMv5, ARMv6, ARMv7), Blackfin, m68k (Coldfire and 68k), Microblaze (little and big endian), MIPS32 and MIPS64 (little and big endian, with various instruction set variants), NIOS2, OpenRISC, PowerPC and PowerPC64, SuperH, Sparc and Sparc64, x86 and x86-64, Xtensa
Versions: for each combination, we provide a stable version which uses slightly older but more proven versions of gcc, binutils and gdb, and we provide a bleeding edge version with the latest version of gcc, binutils and gdb.
After being generated, most of the toolchains are tested by building a Linux kernel and a Linux userspace, and booting it under Qemu, which allows to verify that the toolchain is minimally working. We plan on adding more tests to validate the toolchains, and welcome your feedback on this topic. Of course, not all toolchains are tested this way, because some CPU architectures are not emulated by Qemu.
The toolchains are built with Buildroot, but can be used for any purpose: build a Linux kernel or bootloader, as a pre-built toolchain for your favorite embedded Linux build system, etc. The toolchains are available in tarballs, together with licensing information and instructions on how to rebuild the toolchain if needed.
We are very much interested in your feedback about those toolchains, so do not hesitate to report bugs or make suggestions in our issue tracker!
This work was done as part of the internship of Florent Jacquet at Free Electrons.
Since 2006, we have provided a Linux source code cross-referencing online tool as a service to the community. The engine behind this website was LXR, a Perl project almost as old as the kernel itself. For the first few years, we used the then-current 0.9.5 version of LXR, but in early 2009 and for various reasons, we reverted to the older 0.3.1 version (from 1999!). In a nutshell, it was simpler and it scaled better.
Recently, we had the opportunity to spend some time on it, to correct a few bugs and to improve the service. After studying the Perl source code and trying out various cross-referencing engines (among which LXR 2.2 and OpenGrok), we decided to implement our own source code cross-referencing engine in Python.
Why create a new engine?
Our goal was to extend our existing service (support for multiple projects, responsive design, etc.) while keeping it simple and fast. When we tried other cross-referencing engines, we were dissatisfied with their relatively low performance on a large codebase such as Linux. Although we probably could have tweaked the underlying database engine for better performance, we decided it would be simpler to stick to the strategy used in LXR 0.3: get away from the relational database engine and keep plain lists in simple key-value stores.
Another reason that motivated a complete rewrite was that we wanted to provide an up-to-date reference (including the latest revisions) while keeping it immutable, so that external links to the source code wouldn’t get broken in the future. As a direct consequence, we would need to index many different revisions for each project, with potentially a lot of redundant information between them. That’s when we realized we could leverage the data model of Git to deal with this redundancy in an efficient manner, by indexing Git blobs, which are shared between revisions. In order to make sure queries under this strategy would be fast enough, we wrote a proof-of-concept in Python, and thus Elixir was born.
What service does it provide?
First, we tried to minimize disruption to our users by keeping the user interface close to that of our old cross-referencing service. The main improvements are:
We now support multiple projects. For now, we provide reference for Linux, Busybox and U-Boot.
Every tag in each project’s git repository is now automatically indexed.
The design has been modernized and now fits comfortably on smaller screens like tablets.
The URL scheme has been simplified and extended with support for multiple projects. An HTTP redirector has been set up for backward compatibility.
Elixir supports multiple projects
Among other smaller improvements, it is now possible to copy and paste code directly without line numbers getting in the way.
How does it work?
Elixir is made of two Python scripts: “update” and “query”. The first looks for new tags and new blobs inside a Git repository, parses them and appends the new references to identifiers to a record inside the database. The second uses the database and the Git repository to display annotated source code and identifier references.
The parsing itself is done with Ctags, which provides us with identifier definitions. In order to find the references to these identifiers, Elixir then simply checks each lexical token in the source file against the definition database, and if that word is defined, a new reference is added.
Like in LXR 0.3, the database structure is kept very simple so that queries don’t have much work to do at runtime, thus speeding them up. In particular, we store references to a particular identifier as a simple list, which can be loaded and parsed very fast. The main difference with LXR is that our list includes references from every blob in the project, so we need to restrict it first to only the blobs that are part of the current version. This is done at runtime, simply by computing the intersection of this list with the list of blobs inside the current version.
Finally, we kept the user interface code clearly segregated from the engine itself by making these two modules communicate through a Unix command-line interface. This means that you can run queries directly on the command-line without going through the web interface.
Elixir code example
Our current focus is on improving multi-project support. In particular, each project has its own quirky way of using Git tags, which needs to be handled individually.
At the user-interface level, we are evaluating the possibility of having auto-completion and/or fuzzy search of identifier names. Also, we are looking for a way to provide direct line-level access to references even in the case of very common identifiers.
On the performance front, we would like to cut the indexation time by switching to a new database back-end that provides efficient appending to large records. Also, we could make source code queries faster by precomputing the references, which would also allow us to eliminate identifier “bleeding” between versions (the case where an identifier shows up as “defined in 0 files” because it is only defined in another version).
If you think of other ways we could improve our service, don’t hesitate to drop us a feature request or a patch!
Bonus: why call it “Elixir”?
In the spur of the moment, it seemed like a nice pun on the name “LXR”. But in retrospect, we wish to apologize to the Elixir language team and the community at large for unnecessary namespace pollution.
Since April 2016, we have our own automated testing infrastructure to validate the Linux kernel on a large number of hardware platforms. We use this infrastructure to contribute to the KernelCI project, which tests every day the Linux kernel. However, the tests being done by KernelCI are really basic: it’s mostly booting a basic Linux system and checking that it reaches a shell prompt.
However, LAVA, the software component at the core of this testing infrastructure, can do a lot more than just basic tests.
The need for custom tests
With some of our engineers being Linux maintainers and given all the platforms we need to maintain for our customers, being able to automatically test specific features beyond a simple boot test was a very interesting goal.
In addition, manually testing a kernel change on a large number of hardware platforms can be really tedious. Being able to quickly send test jobs that will use an image you built on your machine can be a great advantage when you have some new code in development that affects more than one board.
We identified two main use cases for custom tests:
Automatic tests to detect regression, as does KernelCI, but with more advanced tests, including platform specific tests.
Manual tests executed by engineers to validate that the changes they are developing do not break existing features, on all platforms.
An appropriate root filesystem, that contains the various userspace programs needed to execute the tests (benchmarking tools, validation tools, etc.)
A test suite, which contains various scripts executing the tests
A custom test tool that glues together the different components
The custom test tool knows all the hardware platforms available and which tests and kernel configurations apply to which hardware platforms. It identifies the appropriate kernel image, Device Tree, root filesystem image and test suite and submits a job to LAVA for execution. LAVA will download the necessary artifacts and run the job on the appropriate device.
Building custom rootfs
When it comes to test specific drivers, dedicated testing, validation or benchmarking tools are sometimes needed. For example, for storage device testing, bonnie++ can be used, while iperf is nice for networking testing. As the default root filesystem used by KernelCI is really minimalist, we need to build our owns, one for each architecture we want to test.
Buildroot is a simple yet efficient tool to generate root filesystems, it is also used by KernelCI to build their minimalist root filesystems. We chose to use it and made custom configuration files to match our needs.
We ended up with custom rootfs built for ARMv4, ARMv5, ARMv7, and ARMv8, that embed for now Bonnie++, iperf, ping (not the Busybox implementation) and other tiny tools that aren’t included in the default Buildroot configuration.
Our Buildroot fork that includes our custom configurations is available as the buildroot-ci Github project (branch ci).
The custom test tool
The custom test tool is the tool that binds the different elements of the overall architecture together.
One of the main features of the tool is to send jobs. Jobs are text files used by LAVA to know what to do with which device. As they are described in LAVA as YAML files (in the version 2 of the API), it is easy to use templates to generate them based on a single model. Some information is quite static such as the device tree name for a given board or the rootfs version to use, but other details change for every job such as the kernel to use or which test to run.
We made a tool able to get the latest kernel images from KernelCI to quickly send jobs without having a to compile a custom kernel image. If the need is to test a custom image that is built locally, the tool is also able to send files to the LAVA server through SSH, to provide a custom kernel image.
The entry point of the tool is ctt.py, which allows to create new jobs, providing a lot of options to define the various aspects of the job (kernel, Device Tree, root filesystem, test, etc.).
The test suite is a set of shell scripts that perform tests returning 0 or 1 depending on the result. This test suite is included inside the root filesystem by LAVA as part of a preparation step for each job.
We currently have a small set of tests:
boot test, which simply returns 0. Such a test will be successful as soon as the boot succeeds.
crypto test, to do some minimal testing of cryptographic engines
usb test, to test USB functionality using mass storage devices
simple network test, that just validates network connectivity using ping
All those tests only require the target hardware platform itself. However, for more elaborate network tests, we needed to get two devices to interact with each other: the target hardware platform and a reference PC platform. For this, we use the LAVA MultiNode API. It allows to have a test that spans multiple devices, which we use to perform multiple iperf sessions to benchmark the bandwidth. This test has therefore one part running on the target device (network-board) and one part running on the reference PC platform (network-laptop).
Our current test suite is available as the test_suite Github project. It is obviously limited to just a few tests for now, we hope to extend the tests in the near future.
First use case: daily tests
As previously stated, it’s important for us to know about regressions introduced in the upstream kernel. Therefore, we have set up a simple daily cron job that:
Sends custom jobs to all boards to validate the latest mainline Linux kernel and latest linux-nextli>
Aggregates results from the past 24 hours and sends emails to subscribed addresses
Updates a dashboard that displays results in a very simple page
A nice dashboard showing the tests of the Beaglebone Black and the Nitrogen6x.
Second use case: manual tests
The custom test tool ctt.py has a simple command line interface. It’s easy for someone to set it up and send custom jobs. For example:
ctt.py -b beaglebone-black -m network
will start the network test on the BeagleBone Black, using the latest mainline Linux kernel built by KernelCI. On the other hand:
will run the mmc test on the Marvell Armada 7040 and Armada 8040 development boards, using the locally built kernel image and Device Tree.
The result of the job is sent over e-mail when the test has completed.
Thanks to this custom test tool, we now have an infrastructure that leverages our existing lab and LAVA instance to execute more advanced tests. Our goal is now to increase the coverage, by adding more tests, and run them on more devices. Of course, we welcome feedback and contributions!
Over the last few releases, a significant number of improvements in terms of QA-related tooling has been done in the Buildroot project. As an embedded Linux build system, Buildroot has a growing number of packages, and maintaining all of those packages is a challenge. Therefore, improving the infrastructure around Buildroot to make sure that packages are in good shape is very important. Below we provide a summary of the different improvements that have been made.
Very much like the Linux kernel has a checkpatch.pl script to help contributors validate their patches, Buildroot now has a check-package script that allows to validate the coding style and check for common errors in Buildroot packages. Contributors are encouraged to use it to avoid common mistakes typically spotted during the review process.
check-package is capable of checking the .mk file, the Config.in, the .hash file of packages as well as the patches that apply to packages.
Contributed by Ricardo Martincoski, and written in Python, this tool will first appear in the next stable release 2017.05, to be published at the end of the month.
Buildroot’s page of package stats has been updated with a new column Warnings that lists the number of check-package issues to be fixed on each package. Not all packages have been fixed yet!
Besides coding style issues, another problem that the Buildroot community faces when accepting contributions of new packages or package updates, is that those contributions have rarely been tested on a large number of toolchain/architecture configurations. To help contributors in this testing, a test-pkg tool has been added.
Provided a Buildroot configuration snippet that enables the package to be tested, test-pkg tool will iterate over a number of toolchain/architecture configurations and make sure the package builds fine for all those configurations. The set of configurations being tested is the one used by Buildroot autobuilder’s infrastructure, which allows us to make sure that a package will not fail to build as soon as it is added into the tree. Contributors are therefore encouraged to use this tool for their package related contributions.
A very primitive version of this tool was originally contributed by Thomas Petazzoni, but it’s finally Yann E. Morin who took over, cleaned up the code, extended it and made it more generic, and contributed the final tool.
Runtime testing infrastructure
The Buildroot autobuilder infrastructure has been running for several years, and tests random configurations of Buildroot packages to make sure they build properly. This infrastructure allows the Buildroot developers to make sure that all combinations of packages build properly on all architectures, and has been a very useful tool to help increase Buildroot quality. However, this infrastructure does not perform any sort of runtime testing.
To address this, a new runtime testing infrastructure has recently been contributed to Buildroot by Thomas Petazzoni. Contrary to the autobuilder infrastructure that tests random configurations, this runtime testing infrastructure tests a well-defined set of configurations, and uses Qemu to make sure that they work properly. Located in support/testing this test infrastructure currently has only 25 test cases, but we plan to extend this over time with more and more tests.
For example, the ISO9660 test makes sure that Buildroot is capable of building an ISO9660 image that boots properly under Qemu. The test suite validates that it works in different configurations: using either Grub, Grub2 or Syslinux as a bootloader, and using a root filesystem entirely contained in an initramfs or inside the ISO9660 filesystem itself.
Besides doing run-time tests of packages, this infrastructure will also allow us to test various core Buildroot functionalities. We also plan to have the tests executed on a regular basis on a CI infrastructure such as Gitlab CI.
DEVELOPERS file and e-mail notification
Back in the 2016.11 release, we added a top-level file called DEVELOPERS, which plays more or less the same role as the Linux kernel’s MAINTAINERS file: associate parts of Buildroot, especially packages, with developers interested in this area. Since Buildroot doesn’t have a concept of per-package maintainers, we decided to simply call the file DEVELOPERS.
Thanks to the DEVELOPERS file, we are able to:
Provide the get-developers tool, which parses patches and returns a list of e-mail addresses to which the patches should be sent, very much like the Linux kernel get_maintainer.pl tool. This allows developers who have contributed a package to be notified when a patch is proposed for the same package, and get the chance to review/test the patch.
Notify the developers of a package of build failures caused by this package in the Buildroot autobuilder infrastructure. So far, all build results of this infrastructure were simply sent every day on the mailing list, which made it unpractical for individual developers to notice which build failures they should look at. Thanks to the DEVELOPERS file, we now send a daily e-mail individually to the developers whose packages are affected by build failures.
Notify the developers of the support for a given CPU architecture of build failures caused on the autobuilders for this architecture, in a manner similar to what is done for packages.
Thanks to this, we have seen developers who were not regularly following the Buildroot mailing list contribute again fixes for build failures caused by their packages, increasing the overall Buildroot quality. The DEVELOPERS file and the get-developers script, as well as the related improvements to the autobuilder infrastructure were contributed by Thomas Petazzoni.
Build testing of defconfigs on Gitlab CI
Buildroot contains a number of example configurations called defconfigs for various hardware platforms, which allow to build a minimal embedded Linux system, known to work on such platforms. At the time of this writing, Buildroot has 146 defconfigs for platforms ranging from popular development boards (Raspberry Pi, BeagleBone, etc.) to evaluation boards from SoC vendors and Qemu machine emulation.
In order to make sure all those defconfigs build properly, we used to have a job running on Travis CI, but we started to face limitations, especially in the maximum allowed build duration. Therefore, Arnout Vandecappelle migrated this on Gitlab CI and things have been running smoothly since then.
In two previous blog posts, we presented the hardware and software architecture of the automated testing platform we have created to test the Linux kernel on a large number of embedded platforms.
The primary use case for this infrastructure was to participate to the KernelCI.org testing effort, which tests the Linux kernel every day on many hardware platforms.
However, since our embedded boards are now fully controlled by LAVA, we wondered if we could not only use our lab for KernelCI.org, but also provide remote control of our boards to Free Electrons engineers so that they can access development boards from anywhere. lavabo was born from this idea and its goal is to allow full remote control of the boards as it is done in LAVA: interface with the serial port, control the power supply and provide files to the board using TFTP.
The advantages of being able to access the boards remotely are obvious: allowing engineers working from home to work on their hardware platforms, avoid moving the boards out of the lab and back into the lab each time an engineer wants to do a test, etc.
From a user’s point of view, lavabo is used through the eponymous command lavabo, which allows to:
List the boards and their status $ lavabo list
Reserve a board for lavabo usage, so that it is no longer used for CI jobs $ lavabo reserve am335x-boneblack_01
Upload a kernel image and Device Tree blob so that it can be accessed by the board through TFTP $ lavabo upload zImage am335x-boneblack.dtb
Connect to the serial port of the board $ lavabo serial am335x-boneblack_01
Reset the power of the board $ lavabo reset am335x-boneblack_01
Power off the board $ lavabo power-off am335x-boneblack_01
Release the board, so that it can once again be used for CI jobs $ lavabo release am335x-boneblack_01
Overall architecture and implementation
The following diagram summarizes the overall architecture of lavabo (components in green) and how it connects with existing components of the LAVA architecture.
lavabo reuses LAVA tools and configuration files
A client-server software
lavabo follows the classical client-server model: the lavabo client is installed on the machines of users, while the lavabo server is hosted on the same machine as LAVA. The server-side of lavabo is responsible for calling the right tools directly on the server machine and making the right calls to LAVA’s API. It controls the boards and interacts with the LAVA instance to reserve and release a board.
On the server machine, a specific Unix user is configured, through its .ssh/authorized_keys to automatically spawn the lavabo server program when someone connects. The lavabo client and server interact directly using their stdin/stdout, by exchanging JSON dictionaries. This interaction model has been inspired from the Attic backup program. Therefore, the lavabo server is not a background process that runs permanently like traditional daemons.
Handling serial connection
Exchanging JSON over SSH works fine to allow the lavabo client to provide instructions to the lavabo server, but it doesn’t work well to provide access to the serial ports of the boards. However, ser2net is already used by LAVA and provides a local telnet port for each serial port. lavabo simply uses SSH port-forwarding to redirect those telnet ports to local ports on the user’s machine.
These additions to the LAVA API are used by the lavabo server to make reserve and release boards, so that there is no conflict between the CI related jobs (such as the ones submitted by KernelCI.org) and the direct use of boards for remote development.
Interaction with the boards
Now that we know how the client and the server interact and also how the server communicates with LAVA, we need a way to know which boards are in the lab, on which port the serial connection of a board is exposed and what are the commands to control the board’s power supply. All this configuration has already been given to LAVA, so lavabo server simply reads the LAVA configuration files.
The last requirement is to provide files to the board, such as kernel images, Device Tree blobs, etc. Indeed, from a network point of view, the boards are located in a different subnet not routed directly to the users machines. LAVA already has a directory accessible through TFTP from the boards which is one of the mechanisms used to serve files to boards. Therefore, the easiest and most obvious way is to send files from the client to the server and move the files to this directory, which we implemented using SFTP.
Since the serial port cannot be shared among several sessions, it is essential to guarantee a board can only be used by one engineer at a time. In order to identify users, we have one SSH key per user in the .ssh/authorized_keys file on the server, each associated to a call to the lavabo-server program with a different username.
This allows us to identify who is reserving/releasing the boards, and make sure that serial port access, or requests to power off or reset the boards are done by the user having reserved the board.
For TFTP, the lavabo upload command automatically uploads files into a per-user sub-directory of the TFTP server. Therefore, when a file called zImage is uploaded, the board will access it over TFTP by downloading user/zImage.
Availability and installation
As you could guess from our love for FOSS, lavabo is released under the GNU GPLv2 license in a GitHub repository. Extensive documentation is available if you’re interested in installing lavabo. Of course, patches are welcome!
While the module includes an SGTL5000 codec, one of the requirements for that project was to handle up to eight audio channels. The SGTL5000 uses I²S and handles only two channels.
I2S timing diagram from the SGTL5000 datasheet
Thankfully, the i.MX7 has multiple audio interfaces and one is fully available on the SODIMM connector of the Colibri iMX7. A TI PCM3168 was chosen for the carrier board and is connected to the second Synchronous Audio Interface (SAI2) of the i.MX7. This codec can handle up to 8 output channels and 6 input channels. It can take multiple formats as its input but TDM takes the smaller number of signals (4 signals: bit clock, word clock, data input and data output).
TDM timing diagram from the PCM3168 datasheet
The current Linux long term support version is 4.9 and was chosen for this project. It has support for both the i.MX7 SAI (sound/soc/fsl/fsl_sai.c) and the PCM3168 (sound/soc/codecs/pcm3168a.c). That’s two of the three components that are needed, the last one being the driver linking both by describing the topology of the “sound card”. In order to keep the custom code to the minimum, there is an existing generic driver called simple-card (sound/soc/generic/simple-card.c). It is always worth trying to use it unless something really specific prevents that. Using it was as simple as writing the following DT node:
Only 4 input channels and 4 output channels are routed because the carrier board only had that wired.
There are two DAI links because the pcm3168 driver exposes inputs and outputs separately
As per the PCM3168 datasheet:
left justified mode is used
dai-tdm-slot-num is set to 8 even though only 4 are actually used
dai-tdm-slot-width is set to 32 because the codec takes 24-bit samples but requires 32 clocks per sample (this is solved later in userspace)
The codec is master which is usually best regarding clock accuracy, especially since the various SoMs on the market almost never expose the audio clock on the carrier board interface. Here, a crystal was used to clock the PCM3168.
The PCM3168 codec is added under the ecspi3 node as that is where it is connected:
Finally, an ALSA configuration file (/usr/share/alsa/cards/imx7-pcm3168.conf) was written to ensure samples sent to the card are in the proper format, S32_LE. 24-bit samples will simply have zeroes in the least significant byte. For 32-bit samples, the codec will properly ignore the least significant byte.
Also this describes that the first subdevice is the playback (output) device and the second subdevice is the capture (input) device.
@args [ CARD ]
On top of that, the dmix and dsnoop ALSA plugins can be used to separate channels.
To conclude, this shows that it is possible to easily leverage existing code to integrate an audio codec in a design by simply writing a device tree snippet and maybe an ALSA configuration file if necessary.
At Free Electrons, we regularly work on networking topics as part of our Linux kernel contributions and thus we decided to attend our very first Netdev conference this year in Montreal. With the recent evolution of the network subsystem and its drivers capabilities, the conference was a very good opportunity to stay up-to-date, thanks to lots of interesting sessions.
Eric Dumazet presenting “Busypolling next generation”
The speakers and the Netdev committee did an impressive job by offering such a great schedule and the recorded talks are already available on the Netdev Youtube channel. We particularly liked a few of those talks.
Andrew Lunn, Viven Didelot and Florian Fainelli presented DSA, the Distributed Switch Architecture, by giving an overview of what DSA is and by then presenting its design. They completed their talk by discussing the future of this subsystem.
DSA in one slide
The goal of the DSA subsystem is to support Ethernet switches connected to the CPU through an Ethernet controller. The distributed part comes from the possibility to have multiple switches connected together through dedicated ports. DSA was introduced nearly 10 years ago but was mostly quiet and only recently came back to life thanks to contributions made by the authors of this talk, its maintainers.
The main idea of DSA is to reuse the available internal representations and tools to describe and configure the switches. Ports are represented as Linux network interfaces to allow the userspace to configure them using common tools, the Linux bridging concept is used for interface bridging and the Linux bonding concept for port trunks. A switch handled by DSA is not seen as a special device with its own control interface but rather as an hardware accelerator for specific networking capabilities.
DSA has its own data plane where the switch ports are slave interfaces and the Ethernet controller connected to the SoC a master one. Tagging protocols are used to direct the frames to a specific port when coming from the SoC, as well as when received by the switch. For example, the RX path has an extra check after netif_receive_skb() so that if DSA is used, the frame can be tagged and reinjected into the network stack RX flow.
Finally, they talked about the relationship between DSA and Switchdev, and cross-chip configuration for interconnected switches. They also exposed the upcoming changes in DSA as well as long term goals.
As part of the network performances workshop, Jesper Dangaard Brouer presented memory bottlenecks in the allocators caused by specific network workloads, and how to deal with them. The SLAB/SLUB baseline performances are found to be too slow, particularly when using XDP. A way from a driver to solve this issue is to implement a custom page recycling mechanism and that’s what all high-speed drivers do. He then displayed some data to show why this mechanism is needed when targeting the 10G network budget.
Jesper is working on a generic solution called page pool and sent a first RFC at the end of 2016. As mentioned in the cover letter, it’s still not ready for inclusion and was only sent for early reviews. He also made a small overview of his implementation.
These two talks were given by Gilberto Bertin from Cloudflare and Martin Lau from Facebook. While they were not talking about device driver implementation or improvements in the network stack directly related to what we do at Free Electrons, it was nice to see how XDP is used in production.
XDP, the eXpress Data Path, provides a programmable data path at the lowest point of the network stack by processing RX packets directly out of the drivers’ RX ring queues. It’s quite new and is an answer to lots of userspace based solutions such as DPDK. Gilberto andMartin showed excellent results, confirming the usefulness of XDP.
From a driver point of view, some changes are required to support it. RX hooks must be added as well as some API changes and the driver’s memory model often needs to be updated. So far, in v4.10, only a few drivers are supporting XDP.
David S. Miller, the maintainer of the Linux networking stack and drivers, did an interesting keynote about XDP and eBPF. The eXpress Data Path clearly was the hot topic of this Netdev 2.1 conference with lots of talks related to the concept and David did a good overview of what XDP is, its purposes, advantages and limitations. He also quickly covered eBPF, the extended Berkeley Packet Filters, which is used in XDP to filter packets.
This presentation was a comprehensive introduction to the concepts introduced by XDP and its different use cases.
Netdev 2.1 was an excellent experience for us. The conference was well organized, the single track format allowed us to see every session on the schedule, and meeting with attendees and speakers was easy. The content was highly technical and an excellent opportunity to stay up-to-date with the latest changes of the networking subsystem in the kernel. The conference hosted both talks about in-kernel topics and their use in userspace, which we think is a very good approach to not focus only on the kernel side but also to be aware of the users needs and their use cases.
With 137 patches contributed, Free Electrons is the 18th contributing company according to the Kernel Patch Statistics. Free Electrons engineer Maxime Ripard appears in the list of top contributors by changed lines in the LWN statistics.
Our most important contributions to this release have been:
Support for Atmel platforms
Alexandre Belloni improved suspend/resume support for the Atmel watchdog driver, I2C controller driver and UART controller driver. This is part of a larger effort to upstream support for the backup mode of the Atmel SAMA5D2 SoC.
Alexandre Belloni also improved the at91-poweroff driver to properly shutdown LPDDR memories.
Boris Brezillon contributed a fix for the Atmel HLCDC display controller driver, as well as fixes for the atmel-ebi driver.
Support for Allwinner platforms
Boris Brezillon contributed a number of improvements to the sunxi-nand driver.
Mylène Josserand contributed a new driver for the digital audio codec on the Allwinner sun8i SoC, as well a the corresponding Device Tree changes and related fixes. Thanks to this driver, Mylène enabled audio support on the R16 Parrot and A33 Sinlinx boards.
Maxime Ripard contributed numerous improvements to the sunxi-mmc MMC controller driver, to support higher data rates, especially for the Allwinner A64.
Maxime Ripard contributed official Device Tree bindings for the ARM Mali GPU, which allows the GPU to be described in the Device Tree of the upstream kernel, even if the ARM kernel driver for the Mali will never be merged upstream.
Maxime Ripard contributed a number of fixes for the rtc-sun6i driver.
Maxime Ripard enabled display support on the A33 Sinlinx board, by contributing a panel driver and the necessary Device Tree changes.
Maxime Ripard continued his clean-up effort, by converting the GR8 and sun5i clock drivers to the sunxi-ng clock infrastructure, and converting the sun5i pinctrl driver to the new model.
Quentin Schulz added a power supply driver for the AXP20X and AXP22X PMICs used on numerous Allwinner platforms, as well as numerous Device Tree changes to enable it on the R16 Parrot and A33 Sinlinx boards.
Support for Marvell platforms
Grégory Clement added support for the RTC found in the Marvell Armada 7K and 8K SoCs.
Grégory Clement added support for the Marvell 88E6141 and 88E6341 Ethernet switches, which are used in the Armada 3700 based EspressoBin development board.
Romain Perier enabled the I2C controller, SPI controller and Ethernet switch on the EspressoBin, by contributing Device Tree changes.
Thomas Petazzoni contributed a number of fixes to the OMAP hwrng driver, which turns out to also be used on the Marvell 7K/8K platforms for their HW random number generator.
Thomas Petazzoni contributed a number of patches for the mvpp2 Ethernet controller driver, preparing the future addition of PPv2.2 support to the driver. The mvpp2 driver currently only supports PPv2.1, the Ethernet controller used on the Marvell Armada 375, and we are working on extending it to support PPv2.2, the Ethernet controller used on the Marvell Armada 7K/8K. PPv2.2 support is scheduled to be merged in 4.12.
Support for RaspberryPi platforms
Boris Brezillon contributed Device Tree changes to enable the VEC (Video Encoder) on all bcm283x platforms. Boris had previously contributed the driver for the VEC.
In addition to our direct contributions, a number of Free Electrons engineers are also maintainers of various subsystems in the Linux kernel. As part of this maintenance role:
Maxime Ripard, co-maintainer of the Allwinner ARM platform, reviewed and merged 85 patches from contributors
Alexandre Belloni, maintainer of the RTC subsystem and co-maintainer of the Atmel ARM platform, reviewed and merged 60 patches from contributors
Grégory Clement, co-maintainer of the Marvell ARM platform, reviewed and merged 42 patches from contributors
Boris Brezillon, maintainer of the MTD NAND subsystem, reviewed and merged 8 patches from contributors
Here is the detailed list of contributions, commit per commit:
Netdev 2.1 is the fourth edition of the technical conference on Linux networking. This conference is driven by the community and focus on both the kernel networking subsystems (device drivers, net stack, protocols) and their use in user-space.
This edition will be held in Montreal, Canada, April 6 to 8, and the schedule has been posted recently, featuring amongst other things a talk giving an overview and the current status display of the Distributed Switch Architecture (DSA) or a workshop about how to enable drivers to cope with heavy workloads, to improve performances.
At Free Electrons, we regularly work on networking related topics, especially as part of our Linux kernel contribution for the support of Marvell or Annapurna Labs ARM SoCs. Therefore, we decided to attend our first Netdev conference to stay up-to-date with the network subsystem and network drivers capabilities, and to learn from the community latest developments.
Our engineer Antoine Ténart will be representing Free Electrons at this event. We’re looking forward to being there!
Last month, five engineers from Free Electrons participated to the Embedded Linux Conference in Portlan, Oregon. It was once again a great conference to learn new things about embedded Linux and the Linux kernel, and to meet developers from the open-source community.
Free Electrons team at work at ELC 2017, with Maxime Ripard, Antoine Ténart, Mylène Josserand and Quentin Schulz
Of course, the slides from many other talks are progressively being uploaded, and the Linux Foundation published the video recordings in a record time: they are all already available on Youtube!
Below, each Free Electrons engineer who attended the conference has selected one talk he/she has liked, and gives a quick summary of the talk, hopefully to encourage you watch the corresponding video recording.
Using SWupdate to Upgrade your system, Gabriel Huau
Gabriel Huau from Witekio did a great talk at ELC about SWUpdate, a tool created by Denx to update your system. The talk gives an overview of this tool, how it is working and how to use it. Updating your system is very important for embedded devices to fix some bugs/security fixes or add new features, but in an industrial context, it is sometimes difficult to perform an update: devices not easily accessible, large number of devices and variants, etc. A tool that can update the system automatically or even Over The Air (OTA) can be very useful. SWUpdate is one of them.
SWUpdate allows to update different parts of an embedded system such as the bootloader, the kernel, the device tree, the root file system and also the application data.
It handles different image types: UBI, MTD, Raw, Custom LUA, u-boot environment and even your custom one. It includes a notifier to be able to receive feedback about the updating process which can be useful in some cases. SWUPdate uses different local and OTA/remote interfaces such as USB, SD card, HTTP, etc. It is based on a simple update image format to indicate which images must be updated.
Many customizations can be done with this tool as it is provided with the classic menuconfig configuration tool. One great thing is that this tool is supported by Yocto Project and Buildroot so it can be easily tested.
Khem Raj from Comcast is a frequent speaker at the Embedded Linux Conference, and one of his strong fields of expertise is C compilers, especially LLVM/Clang and Gcc. His talk at this conference can interest anyone developing code in the C language, to know about optimizations that the compilers can use to improve the performance or size of generated binaries. See the video and slides.
One noteworthy optimization is Clang’s -Oz (Gcc doesn’t have it), which goes even beyond -Os, by disabling loop vectorization. Note that Clang already performs better than Gcc in terms of code size (according to our own measurements). On the topic of bundle optimizations such as -O2 or -Os, Khem added that specific optimizations can be disabled in both compilers through the -fno- command line option preceding the name of a given optimization. The name of each optimization in a given bundle can be found through the -fverbose-asm command line option.
Another new optimization option is -Og, which is different from the traditional -g option. It still allows to produce code that can be debugged, but in a way that provides a reasonable level of runtime performance.
On the performance side, he also recalled the Feedback-Directed Optimizations (FDO), already covered in earlier Embedded Linux Conferences, which can be used to feed the compiler with profiler statistics about code branches. The compiler can use such information to optimize branches which are the more frequent at run-time.
Khem’s last advise was not to optimize too early, and first make sure you do your debugging and profiling work first, as heavily optimized code can be very difficult to debug. Therefore, optimizations are for well-proven code only.
Note that Khem also gave a similar talk in the IoT track for the conference, which was more focused on bare-metal code optimization code and portability: “Optimizing C for microcontrollers” (slides, video).
A Journey through Upstream Atomic KMS to Achieve DP Compliance, Manasi Navare
This talk was about the journey of a new comer in the mainline kernel community to fix the DisplayPort support in Intel i915 DRM driver. It first presented what happens from the moment we plug a cable in a monitor until we actually see an image, then where the driver is in the kernel: in the DRM subsystem, between the hardware (an Intel Integrated Graphics device) and the libdrm userspace library on which userspace applications such as the X server rely.
The bug to fix was that case when the driver would fail after updating to the requested resolution for a DP link. The other existing drivers usually fail before updating the resolution, so Manasi had to add a way to tell the userspace the DP link failed after updating the resolution. Such addition would be useless without applications using this new information, therefore she had to work with their developers to make the applications behave correctly when reading this important information.
With a working set of patches, she thought she had done most of the work with only the upstreaming left and didn’t know it would take her many versions to make it upstream. She wished to have sent a first version of a driver for review earlier to save time over the whole development plus upstreaming process. She also had to make sure the changes in the userspace applications will be ready when the driver will be upstreamed.
The talk was a good introduction on how DisplayPort works and an excellent example on why involving the community even in early stages of the development process may be a good idea to quicken the overall driver development process by avoiding complete rewriting of some code parts when upstreaming is under way.
Stephen did a great talk about one thing that is often overlooked, and really shouldn’t: Timekeeping. He started by explaining the various timekeeping mechanisms, both in hardware and how Linux use them. That meant covering the counters, timers, the tick, the jiffies, and the various POSIX clocks, and detailing the various frameworks using them. He also explained the various bugs that might be encountered when having a too naive counter implementation for example, or using the wrong POSIX clock from an application.
Karim did a very good introduction to Android Things. His talk was a great overview of what this new OS from Google targeting embedded devices is, and where it comes from. He started by showing the history of Android, and he explained what this system brought to the embedded market. He then switched to the birth of Android Things; a reboot of Google’s strategy for connected devices. He finally gave an in depth explanation of the internals of this new OS, by comparing Android Things and Android, with lots of examples and demos.
Android Things replaces Brillo / Weave, and unlike its predecessor is built reusing available tools and services. It’s in fact a lightweight version of Android, with many services removed and a few additions like the PIO API to drive GPIO, I2C, PWM or UART controllers. A few services were replaced as well, most notably the launcher. The result is a not so big, but not so small, system that can run on headless devices to control various sensors; with an Android API for application developers.