News reader

A glitch in the merge window

1 year 7 months ago
On January 13, Linus Torvalds let it be known that he had lost power due to the bad weather in the US Pacific Northwest. As of this writing, he has not yet resurfaced, so the 6.8 merge window has ground to a halt.

There's apparently about 100k people without power, and I doubt our neighborhood is the priority, so I expect to be without power for some time still. I hope I'm wrong, but a few years ago it took more than a week to restore power due to all the downed trees. It's hopefully nowhere near that, but..

corbet

Security updates for Tuesday

1 year 7 months ago
Security updates have been issued by Gentoo (KTextEditor, libspf2, libuv, and Nettle), Mageia (hplip), Oracle (container-tools:4.0, gnutls, idm:DL1, squid, squid34, and virt:ol, virt-devel:rhel), Red Hat (.NET 6.0, krb5, python3, rsync, and sqlite), SUSE (chromium, perl-Spreadsheet-ParseXLSX, postgresql, postgresql15, postgresql16, and rubygem-actionpack-5_1), and Ubuntu (binutils, libspf2, libssh2, mysql-5.7, w3m, webkit2gtk, and xerces-c).
corbet

James Bottomley: Debugging Android Early Boot Failures

1 year 7 months ago

Back in my blog post about Securing the Google SIP Stack, I did say I’d look at re-enabling SIP in Android-12, so with a view to doing that I tried building and booting LineageOS 19.1, but it crashed really early in the boot sequence (after the boot splash but before the boot animation started). It turns out that information on debugging the android early boot sequence is a bit scarce, so I thought I should write a post about how I did it just in case it helps someone else who’s struggling with a similar early boot problem.

How I usually Build and Boot Android

My builds are standard LineageOS with my patches to fix SIP and not much else. However, I do replace the debug keys with my signing keys and I also have an AVB key installed in the phone’s third party keyslot with which I sign the vbmeta for boot. This actually means that my phone is effectively locked but with a user supplied key (Yellow as google puts it).

My phone is now a Pixel 3 (I had to say goodbye to the old Nexus One thanks to the US 3G turn off) and I do have a slightly broken Pixel 3 I play with for experimental patches, which is where I was trying to install Android-12.

Signing Seems to be the Problem

Just to verify my phone could actually boot a stock LineageOS (it could) I had to unlock it, and this led to the discovery that once unlocked, it would also boot my custom rom, so whatever was failing in early boot seemed to be connected with the device being locked.

I also discovered an interesting bug in the recovery rom fastboot: If you’re booting locked with your own keys, it will still let you perform all the usually forbidden fastboot commands (the one I was using was set_active). It turns out to be because of a bug in AOSP which treats yellow devices as unlocked in fastboot. Somewhat handy for debugging, but not so hot for security …

And so to Debugging Early Boot

The big problem with Android is there’s no way to get the console messages for early boot. Even if you enable adb early, it doesn’t get started until quite far into the boot animation (which was way after the crash I was tripping over). However, android does have a pstore (previously ramoops) driver that can give you access to the previously crashed boot’s kernel messages (early init, fortunately, mostly logs to the kernel message log).

Forcing init to crash on failure

Ordinarily an init failure prints a message and reboots (to the bootloader), which doesn’t excite pstore into saving the kernel message log. Fortunately there is a boot option (androidboot.init_fatal_panic) which can be set in the boot options (or on the kernel command line for a pixel-3, which can only boot the 4.9 kernel). If you build your own android, it’s fairly easy to add things to the android command line (which is in boot.img): all you need to do is extract BOOT/cmdline from the intermediate zip file, add any boot options you need and place it back in the zip file before you sign it.
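The command-line edit described above can be sketched in Python. This is only a minimal sketch, assuming BOOT/cmdline holds the command line as plain text inside the intermediate zip, as the post describes. The helper names and zip file names are hypothetical, and the "=true" value in the usage note is an assumption, since the post names only the option itself.

```python
import zipfile


def add_boot_option(cmdline: str, option: str) -> str:
    """Return cmdline with option appended, unless it is already present."""
    words = cmdline.split()
    if option not in words:
        words.append(option)
    return " ".join(words)


def patch_cmdline(src_zip: str, dst_zip: str, option: str,
                  member: str = "BOOT/cmdline") -> None:
    """Copy src_zip to dst_zip, rewriting only the command-line member.

    zipfile cannot replace a member in place, so every entry is copied
    across and the command line is rewritten on the way.
    """
    with zipfile.ZipFile(src_zip) as src:
        if member not in src.namelist():
            raise KeyError(f"{member} not found in {src_zip}")
        with zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                data = src.read(info.filename)
                if info.filename == member:
                    text = add_boot_option(data.decode(), option)
                    data = (text + "\n").encode()
                dst.writestr(info, data)
```

A call such as patch_cmdline("target_files.zip", "patched.zip", "androidboot.init_fatal_panic=true") would write a patched copy, which is then what gets signed.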

Unfortunately, this expedient didn’t work (no console logs appear in pstore). I did check that init was correctly panic’ing on failure by inducing an init failure in recovery mode and observing the panic (recovery mode allows you to run adb). But this induced panic also didn’t show up in pstore, meaning there’s actually some problem with pstore and early panics.

Security is the problem (as usual)

The actual problem turned out to be security (as usual): The pixel-3 does encrypted boot panic logs. The way this seems to work (at least in my reading of the google additional pstore patches) is that the bootloader itself encrypts the pstore ram area with a key on the /data partition, which means it only becomes visible after the device is unlocked. Unfortunately, if you trigger a panic before the device is unlocked (by echoing ‘c’ to /proc/sysrq-trigger) the panic message is lost, so pstore itself is useless for debugging early boot. There seems to be some communication of the keys by the vendor proprietary ramoops binary making it very difficult to figure out how it’s being done.

Why the early panic message is lost is a bit mysterious, but unfortunately pstore on the pixel-3 has several proprietary components around the encrypted message handling that make it hard to debug. I suspect if you don’t set up the pstore encryption keys, the bootloader erases the pstore ram area instead of encrypting it, but I can’t prove that.

Although it might be possible to fix the pstore drivers to preserve the ramoops from before device unlock, the participation of the proprietary bootloader in preserving the memory doesn’t make that look like a promising avenue to explore.

Anatomy of the Pixel-3 Boot Sequence

The Pixel-3 device boots through recovery. What this means is that the initial ramdisk (from boot.img) init is what boots both the recovery and normal boot paths. The only difference is that for recovery (and fastboot), the device stays in the ramdisk and for normal boot it mounts the /system partition and pivots to it. What makes this happen or not is the boot flag androidboot.force_normal_boot=1 which is added by the bootloader. Pretty much all the binary content and init rc files in the ramdisk are for recovery and its allied menus.
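The decision described above, recovery versus normal boot depending on androidboot.force_normal_boot=1, can be illustrated with a small Python sketch. The real init is C++ and does considerably more; the function names here are hypothetical, and the sketch assumes the option arrives on the kernel command line, as the post says it does for the Pixel-3.

```python
def parse_boot_options(cmdline: str) -> dict:
    """Collect androidboot.* options from a kernel command line string."""
    options = {}
    for word in cmdline.split():
        if not word.startswith("androidboot."):
            continue
        # "key=value" splits at the first "="; a bare key gets an empty value.
        key, _, value = word.partition("=")
        options[key] = value
    return options


def is_normal_boot(cmdline: str) -> bool:
    """True when the bootloader asked for the normal (non-recovery) path."""
    options = parse_boot_options(cmdline)
    return options.get("androidboot.force_normal_boot") == "1"
```

On a running device the input would be the contents of /proc/cmdline; without the flag, the sketch reports the recovery path, matching the behaviour the post describes.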

The two boot paths are pretty radically different, because the normal boot first pivots to a first stage before going on to a second. However, in the manner of containers, it might be possible to boot recovery first, start a dmesg logger and then re-exec init through the normal path.

Forcing Re-Exec

The idea is to signal init to re-exec itself for the normal path. Of course, there have to be a few changes to do this: An item has to be added to the recovery menu to signal init and init itself has to be modified to do the re-exec on the signal (note you can’t just kick off an init with a new command line because init must be pid 1 for booting). Once this is done, there are problems with selinux (it won’t actually allow init to re-exec) and some mount moves. The selinux problem is fixable by switching it from enforcing to permissive (boot option androidboot.selinux=permissive) and the mount moves (which are forbidden if you’re running binaries from the mount points being moved) can instead become bind mounts. The whole patch becomes 31 insertions across 7 files in android_system_core.
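The re-exec-on-signal idea can be sketched in Python, purely as an illustration of the pattern; it is not the 31-line patch to android_system_core, which is C++ and is not shown in the post. All names below are hypothetical, and SIGUSR1 is used as the default signal. The key property is that exec replaces the process image while keeping the same pid, which is why a re-executed init remains pid 1.

```python
import os
import signal

# Set by the signal handler, acted on later from the main loop.
reexec_requested = False


def request_reexec(signum, frame):
    """Signal handler: only record the request; do the real work later."""
    global reexec_requested
    reexec_requested = True


def install_handler(signum=signal.SIGUSR1):
    """Register the handler (must be called from the main thread)."""
    signal.signal(signum, request_reexec)


def maybe_reexec(argv, exec_fn=os.execv):
    """Replace the current process image if a re-exec was requested.

    Returns False when nothing was pending. With the default exec_fn
    (os.execv) a successful call never returns, because the process
    image has been replaced; the pid stays the same.
    """
    global reexec_requested
    if not reexec_requested:
        return False
    reexec_requested = False
    exec_fn(argv[0], argv)
    return True
```

A main loop would call install_handler() once and then maybe_reexec(...) on each iteration. The exec_fn parameter exists only so the logic can be exercised without actually replacing the running process.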

The signal I chose was SIGUSR1, which isn’t usually used by anything in the bootloader, and adding a menu item to recovery to send this signal to init was another trivial patch. So finally we have a system from which I can start adb to trace the kernel log (adb shell dmesg -w) and then signal to init to re-exec. Surprisingly this worked and produced as the last message fragment:

[ 190.966881] init: [libfs_mgr]Created logical partition system_a on device /dev/block/dm-0
[ 190.967697] init: [libfs_mgr]Created logical partition vendor_a on device /dev/block/dm-1
[ 190.968367] init: [libfs_mgr]Created logical partition product_a on device /dev/block/dm-2
[ 190.969024] init: [libfs_mgr]Created logical partition system_ext_a on device /dev/block/dm-3
[ 190.969067] init: DSU not detected, proceeding with normal boot
[ 190.982957] init: [libfs_avb]Invalid hash size:
[ 190.982967] init: [libfs_avb]Failed to verify vbmeta digest
[ 190.982972] init: [libfs_avb]vbmeta digest error isn't allowed
[ 190.982980] init: Failed to open AvbHandle: No such file or directory
[ 190.982987] init: Failed to setup verity for '/system': No such file or directory
[ 190.982993] init: Failed to mount /system: No such file or directory
[ 190.983030] init: Failed to mount required partitions early …
[ 190.983483] init: InitFatalReboot: signal 6
[ 190.984849] init: #00 pc 0000000000123b38 /system/bin/init
[ 190.984857] init: #01 pc 00000000000bc9a8 /system/bin/init
[ 190.984864] init: #02 pc 000000000001595c /system/lib64/libbase.so
[ 190.984869] init: #03 pc 0000000000014f8c /system/lib64/libbase.so
[ 190.984874] init: #04 pc 00000000000e6984 /system/bin/init
[ 190.984878] init: #05 pc 00000000000aa144 /system/bin/init
[ 190.984883] init: #06 pc 00000000000487dc /system/lib64/libc.so
[ 190.984889] init: Reboot ending, jumping to kernel

Which indicates exactly where the problem is.
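When the captured log is longer than this fragment, a small filter can help find the first failing message. The following Python sketch assumes the "[ seconds.micros] tag: message" line shape seen in the output of adb shell dmesg -w; the keyword list and function name are hypothetical choices for illustration, not part of any Android tool.

```python
import re

# "[ 190.982957] init: message" -> timestamp, tag, message
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(\w+):\s+(.*)$")

# Substrings treated as signs of trouble; an assumed, incomplete list.
FAILURE_WORDS = ("Failed", "Invalid", "error", "Fatal")


def find_failures(lines, tag="init"):
    """Return (timestamp, message) pairs for tag lines that look like failures."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group(2) != tag:
            continue
        message = match.group(3)
        if any(word in message for word in FAILURE_WORDS):
            hits.append((float(match.group(1)), message))
    return hits
```

Fed the fragment from the post, the earliest hit is the libfs_avb "Invalid hash size:" line, which is the message that leads to the vbmeta digest check.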

Fixing the problem

Once the messages are identified, the problem turns out to be in system/core ec10d3cf6 “libfs_avb: verifying vbmeta digest early”, which is inherited from AOSP and which even says in its commit message “the device will not boot if: 1. The image is signed with FLAGS_VERIFICATION_DISABLED is set 2. The device state is locked” which is basically my boot state, so thanks for that one google. Reverting this commit can be done cleanly and now the signed image boots without a problem.

I note that I could also simply add hashtree verification to my boot, but LineageOS is based on the eng target, which has FLAGS_VERIFICATION_DISABLED built into the main build Makefile. It might be possible to change it, but not easily I’m guessing … although I might try fixing it this way at some point, since it would make my phones much more secure.

Conclusion

Debugging android early boot is still a terribly hard problem. Probably someone with more patience for disassembling proprietary binaries could take apart pixel-3 vendor ramoops and figure out if it’s possible to get a pstore oops log out of early boot (which would be the easiest way to debug problems). But failing that the simple hack to re-exec init worked enough to show me where the problem was (of course, if init had continued longer it would likely have run into other issues caused by the way I hacked it).

Effortless OpenBSD Audio and Desktop Screen Recording Guide

1 year 7 months ago

Rafael Sadowski (rsadowski@) has added a new post to his Shut up and hack series, titled Effortless OpenBSD Audio and Desktop Screen Recording Guide, where he takes the reader through the steps needed to configure an OpenBSD system for audio and video recording. The post even includes a YouTube video where he demonstrates recording while he is putting the final touches on the blog post.

You can take in the blog post here: Effortless OpenBSD Audio and Desktop Screen Recording Guide.

OpenSUSE Leap 16 is coming

1 year 7 months ago
The openSUSE project has confirmed that there will be a successor to openSUSE Leap 15, but is not sharing a lot of details at this point.

The transition to Leap 16 is not just a numerical step-up but symbolizes a significant path forward in technology and user experiences. The future of openSUSE Leap is based on the innovative concept of SUSE’s Adaptable Linux Platform.

The Adaptable Linux Platform powers the next-generation openSUSE Leap, Leap Micro, and SUSE solutions. It makes distributions more adaptable and suitable for cloud-native workloads while also being capable of handling a rapid pace of innovation.

corbet

Stawinski: How We Executed a Critical Supply Chain Attack on PyTorch

1 year 7 months ago
John Stawinski IV describes, in detail, how he and a partner were able to compromise the security of the heavily used PyTorch project.

Our exploit path resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to AWS, potentially add code to the main repository branch, backdoor PyTorch dependencies – the list goes on. In short, it was bad. Quite bad.

As we’ve seen before with SolarWinds, Ledger, and others, supply chain attacks like this are killer from an attacker’s perspective. With this level of access, any respectable nation-state would have several paths to a PyTorch supply chain compromise.

corbet

[$] Rust and C filesystem APIs

1 year 7 months ago
As the Rust-for-Linux project advances, the kernel is gradually accumulating abstraction layers that enable Rust code to interface with the existing C code. As the discussion around the set of filesystem abstractions posted by Wedson Almeida Filho in December shows, though, there is some tension between two approaches to the design of those abstractions. The approach favored by most of the kernel's C programmers looks set to win out, but this is a discussion that is likely to return as the use of Rust in the kernel grows.
corbet

Security updates for Monday

1 year 7 months ago
Security updates have been issued by CentOS (bind, cups, curl, firefox, ipa, iperf3, java-1.8.0-openjdk, java-11-openjdk, kernel, libssh2, linux-firmware, open-vm-tools, openssh, postgresql, python, python3, squid, thunderbird, tigervnc, and xorg-x11-server), Fedora (chromium, python-flask-security-too, and tkimg), Gentoo (libgit2, Opera, QPDF, and zlib), Mageia (chromium-browser-stable, gnutls, openssh, packages, and vlc), Oracle (.NET 6.0, fence-agents, frr, ipa, kernel, nss, pixman, and tomcat), and SUSE (gstreamer-plugins-bad).
jake