Back from Netdev 0x17

At Bootlin, we focus on Embedded Linux development and support, and these embedded devices often have a network interface, be-it an Ethernet port, a Wireless chip or some other kind of communication channel that falls under the Linux Networking Stack’s framework.

So it’s always interesting to see what the rest of the community is working on, and meet in real life people we interact with on the netdev mailing list.

That’s why this year, Alexis LothorĂ© and Maxime Chevallier flew to Vancouver to participate to the Netdev Conference, a 5 days event organised by the Netdev Society, a small non-profit run by volunteers dedicated to holding this event.

Bootlin at Netdev

Most talks at Netdev are not directly covering topics we’re actively working on, but it’s always refreshing to see these new exciting technologies that could trickle their way down to the embedded world a few years from now. It is also always pretty interesting to stay up to date about challenges encountered by other parts of the networking industry, at scales way different than the ones we are used to.

We learned for example what CXL is about, what it brings and the effort that are made to design new networking hardware around this technology to change the way we think about datacenter networking.

When we attended Netdev 0x13 in 2019, QUIC was one of the hot topics. This year, Homa was under the spotlight with talks on what it is, and how this new protocol could address some of TCP’s problems.

Like all previous editions, we learned all the progress that were made with TC and its future, new ways of bypassing the kernel stack, BPF integration in the kernel, along with XDP which continued to be more and more powerful.

Another hot topic in the kernel is the introduction of the Rust language, and the network subsystem is a pretty relevant target for the new features brought by the language. As a consequence, Rust subsystem maintainers Miguel Ojeda and Wedson Almeida Filho gave an overview of Rust benefits compared to traditional C code, and then showed a step-by-step implementation of a kernel-side TCP server module. While this example is not perfectly representative of classic network-related drivers we usually write, it was a nice showcase of current state of kernel APIs abstractions in Rust.

We also discovered the new use-case that is now driving most of the datacenter networking efforts, which is without surprise AI and Machine Learning. Turns out, if you want your ChatGPT to answer up-to-date replies without having to wait for too long, you need a powerful and well-organized datacenter for the training part, and networking engineering takes a big part in it to keep all those GPUs fed at a relevant pace.

This lead to the devmem TCP effort, which started to feel a bit familiar for us as it uses dma-buf, which we also sometimes use on multimedia pipelines. The ML and AI topic was introduced to us by the wonderful Keynote session given by Manya Ghobadi, who got all the audience captivated by how AI and ML works, what AI workloads requires in terms of network traffic scheduling, datacenter topology and computing hardware that uses optical computing.

On the final day, we even had a visit from Jakub Kicinski (one of the co-maintainers of Linux networking tree), presenting what he had been working on, and gave us an update on the netdev development statistics (and basically, his main point is that we do need to review more patches).

For the first time, there was a talk from Bootlin at netdev, as Maxime presented one of the topics he’s been working on lately : Improving multi-PHY and multi-port interfaces support. Although it was one of the only talks focusing on the low-levels aspects of the Ethernet stack, it triggered some discussions and interest from the community, which will help further improving the ongoing work.

The slides and videos of the event will be published at some point in the future, we will for sure mention this to our readers when it becomes available.

We’ll conclude this short feedback by thanking once again the Netdev Board members, organizers, speakers and the audience for this great event.

We’ll come back 🙂

Bootlin at Netdev 0x17, THE Technical Conference on Linux Networking

VancouverBootlin will be at the Netdev 0x17 conference, subtitled THE Technical Conference on Linux Networking. It is indeed one of the major event for developers working on the networking side of the Linux kernel to gather and discuss current and future topics. This year, the conference will take place from Oct 30 to Nov 3 in Vancouver, Canada.

Bootlin is involved in a number of Linux kernel networking developments: development and/or improvement of Linux kernel drivers for Ethernet MACs, Ethernet PHYs, WiFi chips, support for SFP, for Ethernet switches, for PTP offloading, for MACsec offloading, improvements to the 802.15.4 stack, and more. As such, it is very relevant for us to meet the Linux kernel networking community, present our work, and understand where things are heading to in the networking stack.

Our engineers Maxime Chevallier and Alexis Lothoré will both attend the conference. In addition, Maxime will be presenting a talk titled Improving multi-phy and multi-port interfaces:

This talk will describe current use-cases where one MAC is connected to multiple PHYs (chained, or in parallel) and multiple front-facing ports, either through multiple PHYs or through a single multi-port PHY. There exist support for some of these scenarios already, but it is limited by the fact that the PHY device is hidden behind a net_device from userspace’s point of view. We therefore can’t configure an individual PHY when multiple PHYs are present on a link (through SFP transceivers for example), and selecting which front-facing port to use is also limited. This talk will describe ongoing work to support these complex topologies, the challenges faced and expected improvements.

We look forward to attending this event in a few weeks time!

Feedback from the Netdev 0x13 conference

The Netdev 0x13 conference took place last week in Prague, Czech Republic. As we work on a variety of networking topics as part of our Linux kernel contributions, Bootlin engineers Maxime Chevallier and Antoine TĂ©nart went to meet with the Linux networking community and to see a lot of interesting sessions. It’s the third time we enjoy attending the Netdev conference (after Netdev 2.1 and Netdev 2.2) and as always, it was a blast!

The 3-day conference started with a first day of workshops and tutorials. We enjoyed learning how to be the cool kids thanks to the XDP hands-on tutorial where Jesper Brouer and Toke Høiland-Jørgensen cooked us a number of lessons to progressively get to learn how to write and load XDP programs. This was the first trial-run of the tutorial which is meant to be extended and used as a material to go through the XDP basics. The instructions are all available on Github.

We then had the chance to attend the TC workshop where face to face discussions and presentations of the traffic control hot topics being worked on happened. The session caught our attention as the topic is related to current subjects being worked on at Bootlin.

Being used to work on embedded systems, seeing the problems the Network developers face can sometimes come as a surprise. During the TC workshop, Vlad Buslov presented his recent work on removing TC flower’s the dependency to the global rtnl lock, which is an issue when you have a million classification rules to update quickly.

We also went to the hardware offload workshop. The future of the network offload APIs and support in the Linux kernel was discussed, with various topics ranging from ASIC support to switchev advanced use-cases or offloading XDP. This was very interesting to us as we do work on various networking engines providing many offloading facilities to the kernel.

The next two days were a collection of talks presenting the recent advances in the networking subsystem of the Linux kernel, as well as current issues and real-world examples of recent functionalities being leveraged.

As always XDP was brought-up with a presentation of XDP offloading using virtio-net, recent advances in combining XDP and hardware offloading techniques and a feedback from Cloudflare using XDP in their DDOS mitigation in-house solution.

But we also got to see other topics, such as SO_TIMESTAMPING being used for performance analytics. In this talk Soheil Hassas Yeganeh presented how the kernel timestamping facilities can be used to track individual packets withing the networking stack for performance analysis and debugging. This was nice to see as we worked on enabling hardware timestamping in networking engines and PHYs for our clients.

Another hot topic this year was the QUIC protocol, which was presented in details in the very good QUIC tutorial by Jana Iyengar. Since this protocol is fairly new, it was brought-up in several sessions from a lot of interesting angles.

Although QUIC was not the main subject of Alissa Cooper’s keynote on Open Source, the IETF, and You, she explained how QUIC was an example of a protocol that is designed alongside its implementations, having a tight feedback loop between the protocol specifications and its usage in real-life. Alissa shared Jana’s point on how middle-boxes are a problem when designing and deploying new protocols, and explained that an approach to overcome this “ossification” is to encrypt the protocol header themselves and document the invariant parts of the non-encrypted parts.

A consequence of having a flexible protocol is that it is not meant to be implemented in the kernel. However, Maciej Machnikowski and Joshua Hay explained that it is still possible to offload some of the processing to hardware, which sparked interesting discussions with the audience on how to do so.

Conclusion

The Netdev 0x13 conference was well organized and very pleasant to attend. The content was deeply technical and allowed us to stay up-to-date with the latest developments. We also had interesting discussions and came back with lots of ideas to explore.

Thanks for organizing Netdev, we had an amazing time!

Feedback from the Netdev 2.2 conference

The Netdev 2.2 conference took place in Seoul, South Korea. As we work on a diversity of networking topics at Bootlin as part of our Linux kernel contributions, Bootlin engineers Alexandre Belloni and Antoine TĂ©nart went to Seoul to attend lots of interesting sessions and to meet with the Linux networking community. Below, they report on what they learned from this conference, by highlighting two talks they particularly liked.

Linux Networking Dietary Restrictions — slides — video

David S. Miller gave a keynote about reducing the size of core structures in the Linux kernel networking core. The idea behind his work is to use smaller structures which has many benefits in terms of performance as less cache misses will occur and less memory resources are needed. This is especially true in the networking core as small changes may have enormous impacts and improve performance a lot. Another argument from his maintainer hat perspective is the maintainability, where smaller structures usually means less complexity.

He presented five techniques he used to shrink the networking core data structures. The first one was to identify members of common base structures that are only used in sub-classes, as these members can easily be moved out and not impact all the data paths.

The second one makes use of what David calls “state compression”, aka. understanding the real width of the information stored in data structures and to pack flags together to save space. In his mind a boolean should take a single bit whereas in the kernel it requires way more space than that. While this is fine for many uses it makes sense to compress all these data in critical structures.

Then David S. Miller spoke about unused bits in pointers where in the kernel all pointers have 3 bits never used. He argued these bits are 3 boolean values that should be used to reduce core data structure sizes. This technique and the state compression one can be used by introducing helpers to safely access the data.

Another technique he used was to unionize members that aren’t used at the same time. This helps reducing even more the structure size by not having areas of memory never used during identified steps in the networking stack.

Finally he showed us the last technique he used, which was using lookup keys instead of pointers when the objects can be found cheaply based on their index. While this cannot be used for every object, it helped reducing some data structures.

While going through all these techniques he gave many examples to help understanding what can be saved and how it was effective. This was overall a great talk showing a critical aspect we do not always think of when writing drivers, which can lead to big performance improvements.

David S. Miller at Nedev 2.2

WireGuard: Next-generation Secure Kernel Network Tunnel — slides — video

Jason A. Donenfeld presented his new and shiny L3 network tunneling mechanism, in Linux. After two years of development this in-kernel formally proven cryptographic protocol is ready to be submitted upstream to get the first rounds of review.

The idea behind Wireguard is to provide, with a small code base, a simple interface to establish and maintain encrypted tunnels. Jason made a demo which was impressive by its simplicity when securely connecting two machines, while it can be a real pain when working with OpenVPN or IPsec. Under the hood this mechanism uses UDP packets on top of either IPv4 and IPv6 to transport encrypted packets using modern cryptographic principles. The authentication is similar to what SSH is using: static private/public key pairs. One particularly nice design choice is the fact that Wireguard is exposed as a stateless interface to the administrator whereas the protocol is stateful and timer based, which allow to put devices into sleep mode and not to care about it.

One of the difficulty to get Wireguard accepted upstream is its cryptographic needs, which do not match what can provide the kernel cryptographic framework. Jason knows this and plan to first send patches to rework the cryptographic framework so that his module nicely integrates with in-kernel APIs. First RFC patches for Wireguard should be sent at the end of 2017, or at the beginning of 2018.

We look forward to seeing Wireguard hit the mainline kernel, to allow everybody to establish secure tunnels in an easy way!

Jason A. Donenfeld at Netdev 2.2

Conclusion

Netdev 2.2 was again an excellent experience for us. It was an (almost) single track format, running alongside the workshops, allowing to not miss any session. The technical content let us dive deeply in the inner working of the network stack and stay up-to-date with the current developments.

Thanks for organizing this and for the impressive job, we had an amazing time!

Bootlin at NetDev 2.2

NetDev 2.2Back in April 2017, Bootlin engineer Antoine TĂ©nart participated to NetDev 2.1, the most important conference discussing Linux networking support. After the conference, Antoine published a summary of it, reporting on the most interesting talks and topics that have been discussed.

Next week, NetDev 2.2 takes place in Seoul, South Korea, and this time around, two Bootlin engineers will be attending the event: Alexandre Belloni and Antoine TĂ©nart. We are getting more and more projects with networking related topics, and therefore the wide range of talks proposed at NetDev 2.2 will definitely help grow our expertise in this field.

Do not hesitate to get in touch with Alexandre or Antoine if you are also attending this event!

Feedback from the Netdev 2.1 conference

At Bootlin, we regularly work on networking topics as part of our Linux kernel contributions and thus we decided to attend our very first Netdev conference this year in Montreal. With the recent evolution of the network subsystem and its drivers capabilities, the conference was a very good opportunity to stay up-to-date, thanks to lots of interesting sessions.

Eric Dumazet presenting “Busypolling next generation”

The speakers and the Netdev committee did an impressive job by offering such a great schedule and the recorded talks are already available on the Netdev Youtube channel. We particularly liked a few of those talks.

Distributed Switch Architecture – slidesvideo

Andrew Lunn, Viven Didelot and Florian Fainelli presented DSA, the Distributed Switch Architecture, by giving an overview of what DSA is and by then presenting its design. They completed their talk by discussing the future of this subsystem.

DSA in one slide

The goal of the DSA subsystem is to support Ethernet switches connected to the CPU through an Ethernet controller. The distributed part comes from the possibility to have multiple switches connected together through dedicated ports. DSA was introduced nearly 10 years ago but was mostly quiet and only recently came back to life thanks to contributions made by the authors of this talk, its maintainers.

The main idea of DSA is to reuse the available internal representations and tools to describe and configure the switches. Ports are represented as Linux network interfaces to allow the userspace to configure them using common tools, the Linux bridging concept is used for interface bridging and the Linux bonding concept for port trunks. A switch handled by DSA is not seen as a special device with its own control interface but rather as an hardware accelerator for specific networking capabilities.

DSA has its own data plane where the switch ports are slave interfaces and the Ethernet controller connected to the SoC a master one. Tagging protocols are used to direct the frames to a specific port when coming from the SoC, as well as when received by the switch. For example, the RX path has an extra check after netif_receive_skb() so that if DSA is used, the frame can be tagged and reinjected into the network stack RX flow.

Finally, they talked about the relationship between DSA and Switchdev, and cross-chip configuration for interconnected switches. They also exposed the upcoming changes in DSA as well as long term goals.

Memory bottlenecks – slides

As part of the network performances workshop, Jesper Dangaard Brouer presented memory bottlenecks in the allocators caused by specific network workloads, and how to deal with them. The SLAB/SLUB baseline performances are found to be too slow, particularly when using XDP. A way from a driver to solve this issue is to implement a custom page recycling mechanism and that’s what all high-speed drivers do. He then displayed some data to show why this mechanism is needed when targeting the 10G network budget.

Jesper is working on a generic solution called page pool and sent a first RFC at the end of 2016. As mentioned in the cover letter, it’s still not ready for inclusion and was only sent for early reviews. He also made a small overview of his implementation.

DDOS countermeasures with XDP – slides #1, slides #2 – video #1, video #2

These two talks were given by Gilberto Bertin from Cloudflare and Martin Lau from Facebook. While they were not talking about device driver implementation or improvements in the network stack directly related to what we do at Bootlin, it was nice to see how XDP is used in production.

XDP, the eXpress Data Path, provides a programmable data path at the lowest point of the network stack by processing RX packets directly out of the drivers’ RX ring queues. It’s quite new and is an answer to lots of userspace based solutions such as DPDK. Gilberto andMartin showed excellent results, confirming the usefulness of XDP.

From a driver point of view, some changes are required to support it. RX hooks must be added as well as some API changes and the driver’s memory model often needs to be updated. So far, in v4.10, only a few drivers are supporting XDP.

XDP MythBusters – slides – video

David S. Miller, the maintainer of the Linux networking stack and drivers, did an interesting keynote about XDP and eBPF. The eXpress Data Path clearly was the hot topic of this Netdev 2.1 conference with lots of talks related to the concept and David did a good overview of what XDP is, its purposes, advantages and limitations. He also quickly covered eBPF, the extended Berkeley Packet Filters, which is used in XDP to filter packets.

This presentation was a comprehensive introduction to the concepts introduced by XDP and its different use cases.

Conclusion

Netdev 2.1 was an excellent experience for us. The conference was well organized, the single track format allowed us to see every session on the schedule, and meeting with attendees and speakers was easy. The content was highly technical and an excellent opportunity to stay up-to-date with the latest changes of the networking subsystem in the kernel. The conference hosted both talks about in-kernel topics and their use in userspace, which we think is a very good approach to not focus only on the kernel side but also to be aware of the users needs and their use cases.

Bootlin at the Netdev 2.1 conference

Netdev 2.1 is the fourth edition of the technical conference on Linux networking. This conference is driven by the community and focus on both the kernel networking subsystems (device drivers, net stack, protocols) and their use in user-space.

This edition will be held in Montreal, Canada, April 6 to 8, and the schedule has been posted recently, featuring amongst other things a talk giving an overview and the current status display of the Distributed Switch Architecture (DSA) or a workshop about how to enable drivers to cope with heavy workloads, to improve performances.

At Bootlin, we regularly work on networking related topics, especially as part of our Linux kernel contribution for the support of Marvell or Annapurna Labs ARM SoCs. Therefore, we decided to attend our first Netdev conference to stay up-to-date with the network subsystem and network drivers capabilities, and to learn from the community latest developments.

Our engineer Antoine TĂ©nart will be representing Bootlin at this event. We’re looking forward to being there!