News reader

The 5.12 kernel has been released

4 years 3 months ago
Linus Torvalds has released the 5.12 kernel. "Thanks to everybody who made last week very calm indeed, which just makes me feel much happier about the final 5.12 release." Headline features in 5.12 include the removal of a number of obsolete, (mostly) 32-bit Arm subarchitectures, atomic instructions for BPF, conditional file lookups with LOOKUP_CACHED, support for zoned block devices in the Btrfs filesystem, threaded NAPI polling in the network stack, filesystem ID mapping, support for building the kernel with Clang link-time optimization, the KFENCE kernel-debugging tool, and more. See the LWN merge-window summaries (part 1, part 2) and the (in-progress) KernelNewbies 5.12 page for more information.
corbet

A letter from the UMN researchers

4 years 3 months ago
The University of Minnesota researchers who have stirred up the kernel community with various types of bad patches have sent an open letter to the linux-kernel list. "This current incident has caused a great deal of anger in the Linux community toward us, the research group, and the University of Minnesota. We apologize unconditionally for what we now recognize was a breach of the shared trust in the open source community and seek forgiveness for our missteps."
corbet

Paul E. McKenney: Stupid RCU Tricks: The design of rcutorture

4 years 3 months ago
This installment of the rcutorture series takes a high-level look at its design. At the highest level, rcutorture is a stress test with a few unit-test components thrown in for good measure. It also includes scripts to handle both single-system and distributed testing. All of this code is of course paying homage to the many moods of Mr. Murphy.

The Many Moods of Mr. Murphy

As I have progressed through my career, I seem to have progressively miffed Mr. Murphy.

I completed my first professional (but pro bono) project in the mid-1970s. It had one user. Any million-year bugs it might have contained took the full million years to appear. This meant that Murphy was actually a pretty nice guy. Sure, whatever could happen would. Eventually. Maybe in geologic time.

In the 1980s, I completed a number of contract-programming projects that might have had installed bases of as many as 100 units. A million-year bug could be expected to appear about once per 10,000 years. In the 1990s, I worked on Sequent's DYNIX/ptx proprietary-UNIX operating system, which had an installed base of perhaps 6,000 systems. A million-year bug could be expected to appear not quite once per two centuries.

Shortly after the year 2000, I started working on the Linux kernel. There are at best rough estimates of the Linux kernel's installed base, and as of 2017, there were an estimated 20 billion systems of one sort or another running the Linux kernel, including smartphones, automobiles, household appliances, and much more. A million-year bug could be expected to appear more than once per hour across this huge installed base. In other words, over a period of about 40 years, Murphy has transitioned from being a pretty nice guy to being a total jerk!
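The estimates above follow from simple division. Here is a minimal sketch of that arithmetic, assuming that a "million-year bug" occurs on average once per million years of operation on each system and that systems fail independently. The function names are hypothetical, and 8,766 hours per year is an approximation.

```c
#include <assert.h>
#include <stdio.h>

#define BUG_PERIOD_YEARS 1.0e6   /* assumed: one occurrence per million system-years */
#define HOURS_PER_YEAR   8766.0  /* 365.25 days * 24 hours, an approximation */

/* Expected years between occurrences across an installed base of `units` systems. */
double years_between_bugs(double units)
{
    assert(units > 0.0);
    return BUG_PERIOD_YEARS / units;
}

/* Expected occurrences per hour across the same installed base. */
double bugs_per_hour(double units)
{
    assert(units > 0.0);
    return units / (BUG_PERIOD_YEARS * HOURS_PER_YEAR);
}

/* Print one row of the comparison for a given installed base. */
void print_row(const char *label, double units)
{
    printf("%-24s every %.3g years, %.3g per hour\n",
           label, years_between_bugs(units), bugs_per_hour(units));
}
```

Under these assumptions, years_between_bugs(100) gives 10,000 years, years_between_bugs(6000) gives roughly 167 years, and bugs_per_hour(20e9) gives roughly 2.3 occurrences per hour, matching the figures quoted in the text.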

Worse yet, should the Linux kernel capture even a modest fraction of the Internet-of-things market, a million-year bug could be expected to appear every few minutes across the installed base. Which might well result in Murphy becoming nothing less than a homicidal maniac.

Fortunately, there are some validation strategies that might help keep Murphy on the straight and narrow.

If You Cannot Beat Him, Join Him!

Given that everything that can happen eventually will, the task at hand is to try to make it happen in the comparative comfort and safety of the lab. This means aiding and abetting Mr. Murphy, at least within the lab environment. And this is the whole point of rcutorture, whose tricks include the following:

  1. Temporal fuzzing.
  2. Exercising race conditions.
  3. Anticipating abuse.

Of course, none of these tricks are new, but it does not hurt to review them.

Temporal Fuzzing

But why not go for the full effect and apply straight-up fuzzing? The answer to this question may be found in RCU's core API:

    void rcu_read_lock(void);
    void rcu_read_unlock(void);
    void synchronize_rcu(void);
    void call_rcu(struct rcu_head *head, rcu_callback_t func);

For the first three functions, there is nothing to fuzz, unless you are trying to test your compiler. For the last function, fuzzing of pointers, and most especially pointers to functions, is reserved for the truly brave and for those wishing to test their kernel's exception handling.

But it does make sense to fuzz the timing of calls to these functions, and that is exactly what rcutorture does. RCU readers and updaters are invoked at random times, with readers and updaters cooperating to detect any too-short grace periods, memory misordering, and so on. Much of the fuzzing is randomly generated at run time, but there are also module parameters that insert delays in specific locations. This strategy is straightforward, but it can also be powerful. For example, careful choice of delays and other configuration settings decreased the mean time between failures (MTBF) of a memorable heisenbug from hundreds of hours to less than five hours. This had the beneficial effect of de-heisening this bug.
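The idea of fuzzing timing rather than arguments can be illustrated with a small userspace sketch. This is not rcutorture's code: the names fuzz_rand, fuzz_next() and fuzz_delay_us(), the probabilities, and the delay ranges are all assumptions chosen for illustration. A seeded generator keeps a failing run reproducible, and a rare long delay widens race windows that would otherwise almost never be hit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread PRNG state; not rcutorture's torture_random(). */
struct fuzz_rand {
    uint64_t state;   /* seed, must be nonzero */
};

/* One xorshift64 step: cheap, reproducible pseudo-random numbers. */
static uint64_t fuzz_next(struct fuzz_rand *r)
{
    uint64_t x = r->state;

    assert(x != 0);
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    r->state = x;
    return x;
}

/*
 * Pick how long to stall, in microseconds, before the next operation.
 * Most calls return 0 so that the test keeps up a high operation rate.
 * Roughly 1 call in 1024 returns long_delay_us, and roughly 1 in 16
 * returns a short delay in the range 1..50 microseconds.
 */
unsigned int fuzz_delay_us(struct fuzz_rand *r, unsigned int long_delay_us)
{
    uint64_t v = fuzz_next(r);

    if ((v & 1023) == 0)
        return long_delay_us;
    if ((v & 15) == 0)
        return 1 + (unsigned int)((v >> 10) % 50);
    return 0;
}
```

A test thread would sleep or spin for the returned time at chosen points, for example between entering and leaving a read-side critical section, while a second thread does the same around its updates.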

Exercising Race Conditions

Many of the most troublesome bugs involve rare operations, and one way to join forces with Murphy is to make rare operations less rare during validation. And rcutorture takes this approach often, including for the following operations:

  1. CPU hotplug.
  2. Transitions to and from idle, including transitions to and from the whole system being idle.
  3. Long RCU readers.
  4. Readers from interrupt handlers.
  5. Complex readers, for example, those overlapping with irq-disable regions.
  6. Delayed grace periods, for example, allowing a CPU to go offline and come back online during grace-period initialization.
  7. Racing call_rcu() invocations against rcu_barrier().
  8. Periodic forced migrations to other CPUs.
  9. Substantial testing of less-popular grace-period mechanisms.
  10. Processes running on the hypervisor to preempt code running in rcutorture guest OSes.
  11. Process exit.
  12. "Near misses" where the RCU grace-period guarantee is almost violated.
  13. Moving CPUs to and from rcu_nocbs callback-offloaded mode.

This exercising of race conditions might be reminiscent of the Netflix Chaos Monkey.

Anticipating Abuse

There are things that RCU users are not supposed to do. Just as users of the fork() system call are not supposed to code up forkbombs, RCU users are not supposed to code up endless blasts of call_rcu() invocations (see Documentation/RCU/checklist.rst item 8). Nevertheless, rcutorture does engage in (carefully limited forms of) call_rcu() abuse in order to find stress-related RCU bugs. This abuse is enabled by default and may be controlled by the rcutorture.fwd_progress module parameter and friends.

In addition, rcutorture inserts the occasional long-term delay in preemptible RCU readers and exercises code paths that must avoid deadlocks involving the scheduler and RCU.

Meta-Murphy, AKA Test the Test

Of course, one danger of joining Murphy is that things can go wrong in test code just as easily as they can go wrong in the code under test.

For this reason, rcutorture provides the rcutorture.object_debug module parameter that verifies that the code checking for double call_rcu() invocations is working properly. In addition, the rcutorture.stall_cpu module parameter and friends may be used to force RCU CPU stall warning messages of various types.

The rcutorture tests of more fundamental RCU properties may be enabled by using the rcutorture.torture_type module parameter. For example, rcutorture.torture_type=busted selects a broken RCU implementation, which may also be selected using the BUSTED scenario. Either way, rcutorture had jolly well better complain about too-short grace periods. In addition, rcutorture.torture_type=busted_srcud forces rcutorture to run compound readers against SRCU, which does not support this notion. In this case also, rcutorture had better complain about too-short grace periods for these compound readers. The rcutorture.leakpointer module parameter tests the CONFIG_RCU_STRICT_GRACE_PERIOD Kconfig option's ability to detect pointers leaked from RCU read-side critical sections. Finally, the rcutorture tests of RCU priority boosting can themselves be tested by using the BUSTED-BOOST scenario, which must then complain about priority-boosting failures.

Additional unscheduled tests of rcutorture testing are of course provided by bugs in RCU itself. Perhaps these are rare examples of Murphy working against himself, but they normally do not feel that way at the time!

Enlisting Darwin

Those who are willing to consider the possibility that natural selection applies to non-living objects might do well to consider validation such as that provided by rcutorture to be a selection function. Now, some developers might object to the thought that their carefully created changes are random mutations, but the sad fact is that long experience has often supported that view.

With this in mind, a good validation suite will select against bugs, resulting in robust software, right?

Wrong.

You see, bugs are a form of software. An undesirable form, perhaps, but a form nevertheless. Bugs will therefore adapt to any fixed validation suite and accumulate in your software, degrading its robustness. This means that any bugs located by end users must also be considered bugs against the validation suite, which after all failed to find those bugs. Modifying the validation suite to successfully find those bugs is therefore important, as are independent efforts to make the validation suite more capable. The hope is that modifying the test suite will make it more difficult for bugs to adapt to it.

But even that is insufficient. Blindly adding tests and test cases will eventually bloat your test suite to the point where it is no longer feasible to run all of it. It is therefore also necessary to review test cases and work out how to make them find bugs faster with less hardware, whether by merging tests, running more tests concurrently, or by more vigorously enlisting Mr. Murphy's assistance. It might also be necessary to eliminate test cases that are no longer relevant. For example, now that RCU no longer has a synchronize_rcu_bh(), there is no point in testing it.

In short, the price of robust software is eternal test development.

Matthew Garrett: An accidental bootsplash

4 years 3 months ago
Back in 2005 we had Debconf in Helsinki. Earlier in the year I'd ended up invited to Canonical's Ubuntu Down Under event in Sydney, and one of the things we'd tried to design was a reasonable graphical boot environment that could also display status messages. The design constraints were awkward - we wanted it to be entirely in userland (so we didn't need to carry kernel patches), and we didn't want to rely on vesafb[1] (because at the time we needed to reinitialise graphics hardware from userland on suspend/resume[2], and vesa was not super compatible with that). Nothing currently met our requirements, but by the time we'd got to Helsinki there was a general understanding that Paul Sladen was going to implement this.

The Helsinki Debconf ended up being an extremely strange event, involving me having to explain to Mark Shuttleworth what the physics of a bomb exploding on a bus were, many people being traumatised by the whole sauna situation, and the whole unfortunate water balloon incident, but it also involved Sladen spending a bunch of time trying to produce an SVG of a London bus as a D-Bus logo and not really writing our hypothetical userland bootsplash program, so on the last night, fueled by Koff that we'd bought by just collecting all the discarded empty bottles and returning them for the deposits, I started writing one.

I knew that Debian was already using graphics mode for installation despite having a textual installer, because they needed to deal with more complex fonts than VGA could manage. Digging into the code, I found that it used BOGL - a graphics library that made use of the VGA framebuffer to draw things. VGA had a pre-allocated memory range for the framebuffer[3], which meant the firmware probably wouldn't map anything else there and hitting those addresses probably wouldn't break anything. This seemed safe.

A few hours later, I had some code that could use BOGL to print status messages to the screen of a machine booted with vga16fb. I woke up some time later, somehow found myself in an airport, and while sitting at the departure gate[4] I spent a while staring at VGA documentation and worked out which magical calls I needed to make to have it behave roughly like a linear framebuffer. Shortly before I got on my flight back to the UK, I had something that could also draw a graphical picture.

Usplash shipped shortly afterwards. We hit various issues - vga16fb produced a 640x480 mode, and some laptops were not inclined to do that without a BIOS call first. 640x400 worked basically everywhere, but meant we had to redraw the art because circles don't work the same way if you change the resolution. My brief "UBUNTU BETA" artwork that was me literally writing "UBUNTU BETA" on an HP TC1100 shortly after I'd got the Wacom screen working did not go down well, and thankfully we had better artwork before release.

But 16 colours is somewhat limiting. SVGALib offered a way to get more colours and better resolution in userland, retaining our prerequisites. Unfortunately it relied on VM86, which doesn't exist in 64-bit mode on Intel systems. I ended up hacking the X.org x86emu into a thunk library that exposed the same API as LRMI, so we could run it without needing VM86. Shockingly, it worked - we had support for 256 colour bootsplashes in any supported resolution on 64 bit systems as well as 32 bit ones.

But by now it was obvious that the future was having the kernel manage graphics support, both in terms of native programming and in supporting suspend/resume. Plymouth is much more fully featured than Usplash ever was, but relies on functionality that simply didn't exist when we started this adventure. There's certainly an argument that we'd have been better off making reasonable kernel modesetting support happen faster, but at this point I had literally no idea how to write decent kernel code and everyone should be happy I kept this to userland.

Anyway. The moral of all of this is that sometimes history works out such that you write some software that a huge number of people run without any idea of who you are, and also that this can happen without you having any fucking idea what you're doing.

Write code. Do crimes.

[1] vesafb relied on either the bootloader or the early stage kernel performing a VBE call to set a mode, and then just drawing directly into that framebuffer. When we were doing GPU reinitialisation in userland we couldn't guarantee that we'd run before the kernel tried to draw stuff into that framebuffer, and there was a risk that that was mapped to something dangerous if the GPU hadn't been reprogrammed into the same state. It turns out that having GPU modesetting in the kernel is a Good Thing.

[2] ACPI didn't guarantee that the firmware would reinitialise the graphics hardware, and as a result most machines didn't. At this point Linux didn't have native support for initialising most graphics hardware, so we fell back to doing it from userland. VBEtool was a terrible hack I wrote to try to re-execute the system's graphics hardware through a range of mechanisms, and it worked in a surprising number of cases.

[3] As long as you were willing to deal with 640x480 in 16 colours

[4] Helsinki-Vantaan had astonishingly comfortable seating for the time


[$] Avoiding unintended connection failures with SO_REUSEPORT

4 years 3 months ago
Many of us think that we operate busy web servers; LWN's server, for example, sweats hard when keeping up with the comment stream that accompanies any article mentioning the Rust programming language. But some organizations run truly busy servers and have to take some extraordinary measures to keep up with levels of traffic that even language advocates cannot create. The SO_REUSEPORT socket option is one of many features that have been added to the network stack to help these use cases. SO_REUSEPORT suffers from an implementation problem that can cause connections to fail, though. Kuniyuki Iwashima has posted a patch set addressing this problem, but there is some doubt as to whether it takes the right approach.
corbet

Security updates for Friday

4 years 3 months ago
Security updates have been issued by Debian (firefox-esr, openjdk-8, and wpa), openSUSE (irssi, jhead, opera, and python-django-registration), SUSE (firefox and qemu), and Ubuntu (dnsmasq and shibboleth-sp).
corbet

Initial Support for the riscv64 Architecture

4 years 3 months ago

With the following commit, Dale Rahn (drahn@) imported initial support for the 64-bit RISC-V architecture:

CVSROOT:        /cvs
Module name:    src
Changes by:     drahn@cvs.openbsd.org   2021/04/22 20:42:17

Added files:
        sys/arch/riscv64: Makefile
        sys/arch/riscv64/compile: Makefile Makefile.inc
        sys/arch/riscv64/compile/GENERIC: Makefile
        sys/arch/riscv64/compile/RAMDISK: Makefile
        sys/arch/riscv64/conf: GENERIC Makefile.riscv64 RAMDISK
                               files.riscv64 kern.ldscript
        sys/arch/riscv64/dev: mainbus.c mainbus.h plic.c plic.h
                              riscv_cpu_intc.c riscv_cpu_intc.h
                              simplebus.c simplebusvar.h timer.c timer.h
        sys/arch/riscv64/include: _float.h _types.h asm.h atomic.h
                                  bootconfig.h bus.h cdefs.h conf.h cpu.h
                                  cpufunc.h db_machdep.h disklabel.h elf.h
                                  endian.h exec.h fdt.h fenv.h frame.h
                                  ieee.h ieeefp.h intr.h kcore.h limits.h
                                  loadfile_machdep.h mutex.h param.h pcb.h
                                  pmap.h proc.h profile.h pte.h ptrace.h
                                  reg.h reloc.h riscv64var.h riscvreg.h
                                  sbi.h setjmp.h signal.h softintr.h
                                  spinlock.h syscall.h tcb.h timetc.h
                                  trap.h vmparam.h
        sys/arch/riscv64/riscv64: ast.c autoconf.c bus_dma.c bus_space.c
                                  conf.c copy.S copyinout.S copystr.S
                                  cpu.c cpufunc_asm.S cpuswitch.S
                                  db_disasm.c db_interface.c db_trace.c
                                  disksubr.c fpu.c genassym.cf intr.c
                                  locore.S locore0.S machdep.c mem.c
                                  pagezero.S pmap.c process_machdep.c
                                  sbi.c sig_machdep.c softintr.c
                                  support.S syscall.c trap.S
                                  trap_machdep.c vm_machdep.c

Log message:
Initial import of OpenBSD/riscv64

This work is based on the effort:
https://www.openbsd.org/papers/Porting_OpenBSD_to_RISCV_FinalReport.pdf
"Porting OpenBSD to RISC-V ISA"
by
Brian Bamsch <bbamsch@google.com>
Wenyan He <wenyan.he@sjsu.edu>
Mars Li <mengshi.li.mars@gmail.com>
Shivam Waghela <shivamwaghela@gmail.com>

With additional work by Dale Rahn <drahn@openbsd.org>

Congratulations and thanks to all involved!

A statement on the UMN mess

4 years 3 months ago
Speaking for the Linux Foundation Technical Advisory Board, Kees Cook has posted a brief statement on the controversy over patches submitted from the University of Minnesota.

The LF Technical Advisory Board is taking a look at the history of UMN's contributions and their associated research projects. At present, it seems the vast majority of patches have been in good faith, but we're continuing to review the work. Several public conversations have already started around our expectations of contributors.

Stay tuned for more.

corbet

Ubuntu 21.04 released

4 years 3 months ago
The Ubuntu 21.04 distribution release is available. "Today, Canonical released Ubuntu 21.04 with native Microsoft Active Directory integration, Wayland graphics by default, and a Flutter application development SDK. Separately, Canonical and Microsoft announced performance optimization and joint support for Microsoft SQL Server on Ubuntu."
corbet

[$] Toward signed BPF programs

4 years 3 months ago
The kernel's BPF virtual machine is versatile; it is possible to load BPF programs into the kernel to carry out a large (and growing) set of tasks. The growing body of BPF code can reasonably be thought of as kernel code in its own right. But, while the kernel can check signatures on loadable modules and prevent the loading of modules that are not properly signed, there is no such mechanism for BPF programs; any sufficiently privileged process can load any program that will pass the verifier. One might think that adding this checking for BPF would be straightforward, but that subsystem has some unique characteristics that make things more challenging than one might expect. There may be a solution in the works, though; fittingly, it works by loading yet another BPF program.
corbet

Security updates for Thursday

4 years 3 months ago
Security updates have been issued by Debian (thunderbird and wordpress), Fedora (curl, firefox, mediawiki, mingw-binutils, os-autoinst, and rpm-ostree), Oracle (java-1.8.0-openjdk and java-11-openjdk), SUSE (kernel, pcp, and tomcat6), and Ubuntu (linux, linux-aws, linux-gke-5.3, linux-hwe, linux-kvm, linux-lts-xenial, linux-oem-5.6, linux-raspi2-5.3, and linux-snapdragon).
corbet

[$] Intentionally buggy commits for fame—and papers

4 years 3 months ago
A buggy patch posted to the linux-kernel mailing list in early April was apparently the last straw for Greg Kroah-Hartman as it led to the planned reversion of a whole slew of commits with one thing in common: their origin at the University of Minnesota (UMN). The patch to the NFSv4 authorization mechanism was duly questioned by two NFS developers, but it is not an honest mistake; according to Kroah-Hartman, there has been an attack of sorts underway as part of some academic research at the university. In order to be sure that these intentional bugs, many with security implications, do not continue to haunt Linux, he is working on reverting commits that came from email addresses with the umn.edu domain.
jake

Security updates for Wednesday

4 years 3 months ago
Security updates have been issued by Debian (firefox-esr, php-pear, wordpress, and zabbix), Oracle (java-1.8.0-openjdk and java-11-openjdk), Red Hat (java-1.8.0-openjdk, java-11-openjdk, kernel, and kpatch-patch), Scientific Linux (java-1.8.0-openjdk and java-11-openjdk), Slackware (seamonkey), SUSE (apache-commons-io, ImageMagick, kvm, ruby2.5, and sudo), and Ubuntu (edk2, libcaca, ntp, ruby2.3, ruby2.5, and ruby2.7).
ris

[$] Rust heads into the kernel?

4 years 3 months ago
In a lengthy message to the linux-kernel mailing list, Miguel Ojeda "introduced" the Rust for Linux project. It was likely not the first time that most kernel developers had heard of the effort; there was an extensive discussion of the project at the 2020 Linux Plumbers Conference, for example. It has also been raised before on the list. Now, the project is looking for feedback from the kernel community about its plans, thus the RFC posting on April 14.
jake

In the trenches with Thomas Gleixner (Linux.com)

4 years 3 months ago
Linux.com has published an interview with Thomas Gleixner with a focus on the realtime preemption work. "The approach to funding these kinds of projects reminds me of the Mikado Game, which is popular in Europe, where the first player who picks up the stick and disturbs the pile often is the one who loses. That’s puzzling to me, especially as many companies build key products depending on these technologies and seem to take the availability and sustainability for granted up to the point where such a project fails, or people stop working on it due to lack of funding. Such companies should seriously consider supporting the funding of the Real-Time project."
corbet

Security updates for Tuesday

4 years 3 months ago
Security updates have been issued by Debian (xorg-server), Fedora (CImg, gmic, leptonica, mingw-binutils, mingw-glib2, mingw-leptonica, mingw-python3, nodejs, and seamonkey), openSUSE (irssi, kernel, nextcloud-desktop, python-django-registration, and thunderbird), Red Hat (389-ds:1.4, kernel, kernel-rt, perl, and pki-core:10.6), SUSE (kernel, sudo, and xen), and Ubuntu (clamav and openslp-dfsg).
ris

[$] Btrfs on zoned block devices

4 years 3 months ago
Zoned block devices have some unfamiliar characteristics that result from compromises made in the name of higher storage density. They are divided into zones, some or all of which do not support random access for write operations. Instead, these "sequential" zones can only be written in order, from the first block to the last. This constraint poses a new challenge for filesystems, which are normally designed with the assumption that storage blocks can be written in any order. It is thus not surprising that zoned-device support in mainstream filesystems in Linux has been slow in coming; that is changing, though, with the addition of support for zoned block devices to Btrfs in Linux 5.12.
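The sequential-write constraint can be made concrete with a toy model. The sketch below is an illustration only: struct seq_zone and its helpers are hypothetical names, not kernel or Btrfs interfaces. It assumes each sequential zone tracks a write pointer and rejects any write that does not start there. The reset operation is an addition not mentioned in the summary above; it reflects how such zones are normally made writable again from their first block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one sequential-write zone; not a kernel data structure. */
struct seq_zone {
    uint64_t start;      /* first block of the zone */
    uint64_t nr_blocks;  /* zone capacity in blocks */
    uint64_t wp;         /* write pointer: next block that may be written */
};

void zone_init(struct seq_zone *z, uint64_t start, uint64_t nr_blocks)
{
    assert(nr_blocks > 0);
    z->start = start;
    z->nr_blocks = nr_blocks;
    z->wp = start;
}

/*
 * Accept a write of `count` blocks at `block` only if it begins exactly
 * at the write pointer and fits in the zone; otherwise reject it.
 */
bool zone_write(struct seq_zone *z, uint64_t block, uint64_t count)
{
    uint64_t end = z->start + z->nr_blocks;

    if (block != z->wp || count == 0 || count > end - z->wp)
        return false;
    z->wp += count;
    return true;
}

/* Rewind the write pointer so the zone can be rewritten from its start. */
void zone_reset(struct seq_zone *z)
{
    z->wp = z->start;
}
```

In this model, an in-place overwrite of an earlier block is always rejected, which is why a filesystem that assumes it can write blocks in any order needs changes before it can use such a device.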

corbet

OpenSSH 8.6 released

4 years 3 months ago
OpenSSH 8.6 is now available. The "ssh-rsa" signature scheme, which uses the SHA-1 hash algorithm, will be disabled by default in the near future. "Note that the deactivation of "ssh-rsa" signatures does not necessarily require cessation of use for RSA keys. In the SSH protocol, keys may be capable of signing using multiple algorithms. In particular, "ssh-rsa" keys are capable of signing using "rsa-sha2-256" (RSA/SHA256), "rsa-sha2-512" (RSA/SHA512) and "ssh-rsa" (RSA/SHA1). Only the last of these is being turned off by default."
ris