Hírolvasó
Gawk 5.2.0 released
Security updates for Tuesday
OpenWrt 22.03.0 released
g2k22 Hackathon Report: Martijn van Duren on snmpd(8) improvements
We are delighted to have received a report on the recently-concluded g2k22 hackathon. Martijn van Duren (martijn@) writes:
Coming to Bad Liebenzell for the third year in a row, I knew what to expect, but the scenery still continues to amaze me. Driving through the Black Forest was a nice little escape before plunging back into the SNMP world.
One of the biggest misconceptions I've seen floating around, and one of my biggest irks with snmpd(8), was its privilege separation situation. While it is true that snmpd(8) has always had multiple processes, they were never used to any meaningful degree. The engine process (snmpe) handled everything SNMP-related: handling packets/connections, encoding and decoding the BER, handling authentication, finding the correct object, and retrieving the data from the proper source (usually the kernel). Because some metrics fell outside the scope of pledge, it also ran without the pledge seat belt. The engine does, however, run inside a /var/empty chroot, and this is where the other (parent) process comes into play. When a trap (notification) is received and covered by "trap handle", it's forwarded to the parent process, which then executes the "command".
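The chroot-plus-pledge arrangement described above is the classic OpenBSD privilege-drop pattern. Below is a minimal sketch of that pattern for a generic daemon child; it is illustrative only, not snmpd(8)'s actual code, and the user name and pledge promise string are placeholders.

/* Hypothetical privilege-drop sketch; not snmpd(8)'s actual code. */
#include <sys/types.h>
#include <err.h>
#include <pwd.h>
#include <unistd.h>

static void
drop_privs(const char *user)
{
	struct passwd *pw;

	if ((pw = getpwnam(user)) == NULL)
		errx(1, "unknown user %s", user);

	/* Lock the process into an empty directory. */
	if (chroot("/var/empty") == -1 || chdir("/") == -1)
		err(1, "chroot /var/empty");

	/* Shed root privileges. */
	if (setgroups(1, &pw->pw_gid) == -1 ||
	    setresgid(pw->pw_gid, pw->pw_gid, pw->pw_gid) == -1 ||
	    setresuid(pw->pw_uid, pw->pw_uid, pw->pw_uid) == -1)
		err(1, "cannot drop privileges");

	/*
	 * The pledge(2) "seat belt": from here on, the kernel kills the
	 * process if it uses system calls outside the promised set.
	 * The promise string here is a placeholder.
	 */
	if (pledge("stdio inet", NULL) == -1)
		err(1, "pledge");
}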
[$] Concurrent page-fault handling with per-VMA locks
Seven new stable kernels
Security updates for Monday
Kernel prepatch 6.0-rc4
Beyond the usual fixes, 6.0-rc4 includes one feature change: a hook to allow security modules to control access to the io_uring command pass-through mechanism. See this article for the background behind this late-arriving change.
Peter Eckersley RIP
[...] Toward the end of his life, Peter focused his career on ethics and safety of artificial intelligence, and he founded the AI Objectives Institute to examine the concrete parallels he saw between surprising and undesirable outcomes that can emerge within economies and those that can emerge in machine learning systems.
More about Eckersley can be found at his web site, on his Wikipedia page, and in a Hacker News discussion.
Arti 1.0.0 released
When we defined our set of milestones, we defined Arti 1.0.0 as "ready for production use": You should be able to use it in the real world, to get a similar degree of privacy, usability, and stability to what you would get with the C Tor client. The APIs should be (more or less) stable for embedders.
We believe we have achieved this. You can now use arti proxy to connect to the Tor network to anonymize your network connections.
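A hedged usage sketch follows; the SOCKS port shown is an assumption, so check arti's startup output or configuration for the port it actually listens on:

$ arti proxy
$ curl --socks5-hostname 127.0.0.1:9150 https://check.torproject.org/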
Linux Plumbers Conference: LPC 2022 Open for Last-Minute On-Site Registrations
Against all expectations, we have now worked through the entire waitlist, courtesy of some last-minute cancellations due to late-breaking corporate travel restrictions. We are therefore re-opening general registration. So if you can somehow arrange to be in Dublin on September 12-14 at this late date, please register.
Virtual registration never has closed, and is still open.
Either way, we look forward to seeing you there!
[$] What's in a (type) name?
Security updates for Friday
OpenBSD may soon gain further memory protections: immutable userland mappings
In the last few years, I have been improving the strictness of userland memory layout. An example is the recent addition of MAP_STACK and msyscall(). The first one marks pages that are stack, so that upon entry to the kernel we can check if the stack pointer is pointing into the stack range. If it isn't, the most obvious conclusion is that a ROP pivot has occurred, and we kill the process. The second one marks the region which contains syscall traps; if, upon entry to the kernel, the PC is not in that region, we know someone is trying to do system calls via an unapproved method.
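As a rough illustration of the MAP_STACK side of this, here is a hedged userland sketch of how a thread stack might be mapped so that the kernel's stack-pointer check can recognize it; the size is illustrative, and the kernel-side checking itself is not shown.

#include <sys/mman.h>
#include <err.h>
#include <stddef.h>

int
main(void)
{
	size_t len = 64 * 1024;	/* illustrative stack size */
	void *stk;

	/*
	 * MAP_STACK marks these pages as stack memory.  On each entry
	 * to the kernel the stack pointer must point into such a
	 * region; otherwise the process is killed, on the assumption
	 * that a ROP pivot has occurred.
	 */
	stk = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE | MAP_STACK, -1, 0);
	if (stk == MAP_FAILED)
		err(1, "mmap");

	/* ... hand stk to a new thread as its stack ... */
	return 0;
}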
ps(1) gains support for tree-like display of processes
Following a discussion on tech@, Job Snijders (job@) committed support to ps(1) for displaying the parent/child hierarchy of processes as an ASCII art tree:
CVSROOT:	/cvs
Module name:	src
Changes by:	job@cvs.openbsd.org	2022/09/01 15:15:54

Modified files:
	bin/ps : extern.h print.c ps.1 ps.c ps.h

Log message:
Add forest (-f) mode

In -f mode group & display parent/child process relationships using ASCII art.

Borrows heavily from Brian Somers' work on FreeBSD ps(1).

With input from deraadt@ and tb@
OK benno@ claudio@

rcctl(8) gains a "configtest" action
Antoine Jacoutot (ajacoutot@) has added a "configtest" action to rcctl(8):
CVSROOT:	/cvs
Module name:	src
Changes by:	ajacoutot@cvs.openbsd.org	2022/09/01 01:25:32

Modified files:
	etc/rc.d : rc.subr
	share/man/man8 : rc.d.8
	usr.sbin/rcctl : rcctl.sh

Log message:
Add a new action: "configtest", to check configuration syntax of the daemon.
A few adjustments will be done in the next days (like disabling this action
if there's no specific rc_configtest function defined).

e.g.
/etc/rc.d/sshd configtest
rcctl configtest sshd

idea from naddy@

This is a feature that sysadmin types have been wanting for quite a while. A consistent way to sanity check your config before loading in anger is certain to make OpenBSD users' lives better.
Portable OpenSSH commits now SSH-signed
Damien Miller (djm@) notes that all (new) commits to the portable OpenSSH repository are now signed using git's SSH signature support.
Further details are on the OpenSSH development mailing list:
[…] We are in the process of converting the portable OpenSSH repository to require signed commits, tags and pushes, using git's recent ssh signature support. So far it's gone very smoothly, and we hope to have it enforced for all commits soon. We maintain our own git repository for portable OpenSSH, that is automatically mirrored to github. We use "pre-receive" and "update" hooks to check for signed pushes and tags/commits respectively, using an in-repository allowed_signers file. […]

This is a most welcome process integrity improvement that hopefully will make the world trust our favorite SSH software even more.
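For those curious what SSH-signed commits look like on the client side, here is a hedged sketch using a sufficiently recent git; the key path and allowed_signers location are placeholders, not OpenSSH's actual setup:

$ git config gpg.format ssh
$ git config user.signingKey ~/.ssh/id_ed25519.pub
$ git config commit.gpgSign true
$ git config gpg.ssh.allowedSignersFile .git/allowed_signers
$ git verify-commit HEAD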
Paul E. McKenney: Confessions of a Recovering Proprietary Programmer, Part XIX: Concurrent Computational Models
TL;DR: In complete consonance with a depressingly large fraction of 21st-century discourse, they are talking right past each other.
On the Model of Computation: Point: We Must Extend Our Model of Computation to Account for Cost and Location

Dally makes a number of good points. First, he notes that past decades have seen a shift from computation being more expensive than memory references to memory references being more expensive than computation, and by a good many orders of magnitude. Second, he points out that in production, constant factors can be more important than asymptotic behavior. Third, he describes situations where hardware caches are not a silver bullet. Fourth, he calls out the necessity of appropriate computational models for performance programming, though to his credit, he does clearly state that not all programming is performance programming. Fifth, he calls out the global need for reduced energy expended on computation, motivated by projected increases in computation-driven energy demand. Sixth and finally, he advocates for a new computational model that builds on PRAM by (1) assigning different costs to computation and to memory references and (2) making memory-reference costs a function of a distance metric based on locations assigned to processing units and memories.
On the Model of Computation: Counterpoint: Parallel Programming Wall and Multicore Software Spiral: Denial Hence Crisis

Vishkin also makes some interesting points. First, he calls out the days of Moore's-Law frequency scaling, calling for the establishment of a similar “free lunch” scaling in terms of numbers of CPUs. He echoes Dally's point about not all programming being performance programming, but argues that in addition to a “memory wall” and an “energy wall”, there is also a “parallel programming wall” because programming multicore systems is “simply too difficult”. Second, Vishkin calls on hardware vendors to take more account of ease of use when creating computer systems, including systems with GPUs. Third, Vishkin argues that compilers should take up much of the burden of squeezing performance out of hardware. Fourth, he makes several arguments that concurrent systems should adapt to PRAM rather than adapting PRAM to existing hardware. Fifth and finally, he calls for funding for a “multicore software spiral” that he hopes will bring back free-lunch performance increases driven by Moore's Law, but based on increasing numbers of cores rather than increasing clock frequency.
Discussion

Figure 2.3 of Is Parallel Programming Hard, And, If So, What Can You Do About It? (AKA “perfbook”) shows the “Iron Triangle” of concurrent programming. Vishkin focuses primarily on the upper edge of this triangle, which is primarily about productivity. Dally focuses primarily on the lower-left edge of this triangle, which is primarily about performance. Except that Vishkin's points about the cost of performance programming are all too valid, which means that defraying the cost of performance programming in the work-a-day world requires very large numbers of users. This is often accomplished by making performance-oriented concurrent software quite general, thus amortizing its cost over large numbers of use cases and users. This puts much performance-oriented concurrent software at the lower tip of the iron triangle, where performance meets generality.
One could argue that Vishkin's proposed relegation of performance-oriented code to compilers is one way to break the iron triangle of concurrent programming, but this argument fails because compilers are themselves low-level software (thus at the lower tip of the triangle). Worse yet, there have been many attempts to put the full concurrent-programming burden on the compiler (or onto transactional memory), but rarely to good effect. Perhaps SQL is the exception that proves this rule, but this is not an example that Vishkin calls out.
Both Dally and Vishkin correctly point to the ubiquity of multicore systems, so it is only reasonable to look at how these systems are handled in practice. After all, it is only fair to check their apparent assumption of “badly”. And Section 2.3 of perfbook lists several straightforward approaches: (1) Using multiple instances of a sequential application, (2) Using the large quantity of existing parallel software, ranging from operating-system kernels to database servers (SQL again!) to numerical software to image-processing libraries to machine-learning libraries, to name but a few of the areas that Python developers could teach us about, and (3) Applying sequential performance optimizations such that single-threaded performance suffices. Those taking door 1 or door 2 will need only a few parallel performance programmers, and it seems to have proven eminently feasible to have those few use a sufficiently accurate computational model. The remaining users simply invoke the software produced by these parallel performance programmers, and need not worry much about concurrency. The cases where none of these three doors apply are more challenging, but surmounting the occasional challenge is part of the programming game, not just the parallel-programming game.
One additional historical trend is the sharply decreasing cost of concurrent systems, both in terms of purchase price and energy consumption. Where an entry-level late-1980s parallel system cost millions of dollars and consumed kilowatts, an entry-level 2022 system can be had for less than $100. This means that there is no longer an overwhelming economic imperative to extract every ounce of performance from most year-2022 concurrent systems. For example, much smartphone code attains the required performance while running single-threaded, which means that additional performance from parallelism could well be a waste of energy. Instead, the smartphone automatically powers down the unused hardware, thus providing only the performance that is actually needed, and does so in an energy-efficient manner. The occasional heavy-weight application might turn on all the CPUs, but only such applications need the ministrations of parallel performance programmers and complex computational models.
Thus, Dally and Vishkin are talking past each other, describing different types of software that are produced by different types of developers, who can each use whatever computational model is appropriate to the task at hand.
Which raises the question of whether there might be a one-size-fits-all computational model. I have my doubts. In my 45 years of supporting myself and (later) my family by developing software, I have seen many models, frameworks, languages, and libraries come and go. In contrast, to the best of my knowledge, the speed of light and the sizes of the various types of atoms have remained constant. Don't get me wrong, I do hope that the trend towards vertical stacking of electronics within CPU chips will continue, and I also hope that this trend will result in smaller systems that are correspondingly less constrained by speed-of-light and size-of-atoms limitations. However, the laws of physics appear to be exceedingly stubborn things, and we would therefore be well-advised to work together. Dally's attempts to dictate current hardware conveniences to software developers are about as likely to succeed as are Vishkin's attempts to dictate current software conveniences to hardware developers. Both of these approaches are increasingly likely to fail should the trend towards special-purpose hardware continue full force. Software developers cannot always ignore the laws of physics, and by the same token, hardware developers cannot always ignore the large quantity of existing software and the large number of existing software developers. To be sure, in order to continue to break new ground, both software and hardware developers will need to continue to learn new tricks, but many of these tricks will require coordinated hardware/software effort.
Sadly, there is no hint in either Dally's or Vishkin's article of any attempt to learn what the many successful developers of concurrent software actually do. Thus, it is entirely possible that neither Dally's nor Vishkin's preferred concurrent computational model is applicable to actual parallel programming practice. In fact, in my experience, developers often use the actual hardware and software as the model, driving their development and optimization efforts from actual measurements and from intuitions built up from those measurements. Some might look down on this approach as being both non-portable and non-mathematical, but portability is often achieved simply by testing on a variety of hardware, courtesy of the relatively low prices of modern computer systems. As for mathematics, those of us who appreciate mathematics would do well to admit that it is often much more efficient to let the objective universe carry out the mathematical calculations in its normal implicit and ineffable manner, especially given the complexity of present day hardware and software.
Circling back to the cynics, there are no doubt some cynics that would summarize this blog post as “please download and read my book”. Except that in contrast to the hypothesized cynical summarization of Dally's and Vishkin's articles, you can download and read my book free of charge. Which is by design, given that the whole point is to promulgate information. Cue cynics calling out which of the three of us is the greater fool! ;-)