Hírolvasó

Matthew Garrett: Digital forgeries are hard

1 month 1 week ago
Closing arguments in the trial between various people and Craig Wright over whether he's Satoshi Nakamoto are wrapping up today, amongst a bewildering array of presented evidence. But one utterly astonishing aspect of this lawsuit is that expert witnesses for both sides agreed that much of the digital evidence provided by Craig Wright was unreliable in one way or another, generally including indications that it wasn't produced at the point in time it claimed to be. And it's fascinating reading through the subtle (and, in some cases, not so subtle) ways that that's revealed.

One of the pieces of evidence entered is screenshots of data from Mind Your Own Business, a business management product that's been around for some time. Craig Wright relied on screenshots of various entries from this product to support his claims around having controlled a meaningful number of bitcoin before he was publicly linked to being Satoshi. If these were authentic then they'd be strong evidence linking him to the mining of coins before Bitcoin's public availability. Unfortunately the screenshots themselves weren't contemporary - the metadata shows them being created in 2020. This wouldn't fundamentally be a problem (it's entirely reasonable to create new screenshots of old material), as long as it's possible to establish that the material shown in the screenshots was created at that point. Sadly, well.

One part of the disclosed information was an email that contained a zip file that contained a raw database in the format used by MYOB. Importing that into the tool allowed an audit record to be extracted - this record showed that the relevant entries had been added to the database in 2020, shortly before the screenshots were created. This was, obviously, not strong evidence that Craig had held Bitcoin in 2009. This evidence was reported, and was responded to with a couple of additional databases that had an audit trail that was consistent with the dates in the records in question. Well, partially. The audit record included session data, showing an administrator logging into the database in 2011 and then, uh, logging out in 2023, which is rather more consistent with someone changing their system clock to 2011 to create an entry, and switching it back to present day before logging out. In addition, the audit log included fields that didn't exist in versions of the product released before 2016, strongly suggesting that the entries dated 2009-2011 were created in software released after 2016. And even worse, the order of insertions into the database didn't line up with calendar time - an entry dated before another entry may appear in the database afterwards, indicating that it was created later. But even more obvious? The database schema used for these old entries corresponded to a version of the software released in 2023.
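The insertion-order check is conceptually simple. Here's a rough sketch of the idea in Python against an invented SQLite table - the real MYOB database is not SQLite and these table and column names are hypothetical - rows are stored in insertion order, so entries whose claimed dates keep running backwards relative to that order were created out of calendar sequence:

    import sqlite3

    # Hypothetical schema for illustration only: "journal" and "entry_date"
    # are invented names, and the real MYOB data is not a SQLite file.
    conn = sqlite3.connect("ledger.db")
    rows = conn.execute(
        "SELECT rowid, entry_date FROM journal ORDER BY rowid"
    ).fetchall()

    previous_date = None
    for rowid, entry_date in rows:
        # rowid reflects insertion order, so an entry whose claimed date
        # precedes the previous row's date was written out of sequence.
        if previous_date is not None and entry_date < previous_date:
            print(f"row {rowid}: dated {entry_date}, inserted after an entry dated {previous_date}")
        previous_date = entry_date

An occasional out-of-order row can be legitimate (books do get corrected after the fact); it's a long run of backdated insertions, combined with the other anomalies, that tells the story.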

This is all consistent with the idea that these records were created after the fact and backdated to 2009-2011, and that after this evidence was made available further evidence was created and backdated to obfuscate that. In an unusual turn of events, during the trial Craig Wright introduced further evidence in the form of a chain of emails to his former lawyers that indicated he had provided them with login details to his MYOB instance in 2019 - before the metadata associated with the screenshots. The implication isn't entirely clear, but it suggests that either they had an opportunity to examine this data before the metadata suggests it was created, or that they faked the data? So, well, the obvious thing happened, and his former lawyers were asked whether they received these emails. The chain consisted of three emails, two of which they confirmed they'd received. And they received a third email in the chain, but it was different to the one entered in evidence. And, uh, weirdly, they'd received a copy of the email that was submitted - but they'd received it a few days earlier. In 2024.

And again, the forensic evidence is helpful here! It turns out that the email client used associates a timestamp with any attachments, which in this case included an image in the email footer - and the mysterious time travelling email had a timestamp in 2024, not 2019. This was created by the client, so was consistent with the email having been sent in 2024, not being sent in 2019 and somehow getting stuck somewhere before delivery. The date header indicates 2019, as do encoded timestamps in the MIME headers - consistent with the mail being sent by a computer with the clock set to 2019.
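None of this requires exotic tooling; the relevant headers sit in the raw message. As a minimal sketch (not the expert's actual methodology), Python's standard email module can pull out the Date header and any per-part metadata, such as the creation dates some clients add to Content-Disposition; "message.eml" is a placeholder path:

    from email import policy
    from email.parser import BytesParser
    from email.utils import parsedate_to_datetime

    # Parse a saved copy of the message and compare the claimed Date
    # header against timestamps carried elsewhere in the structure.
    with open("message.eml", "rb") as f:
        msg = BytesParser(policy=policy.default).parse(f)

    print("Date header:", parsedate_to_datetime(msg["Date"]))

    for part in msg.walk():
        # Some sending clients add creation-date/modification-date
        # parameters to attachments' Content-Disposition headers.
        disposition = part.get("Content-Disposition")
        if disposition:
            print(part.get_filename(), "->", disposition)

When a header the client generates automatically disagrees with the Date line by five years, it's the automatic one you believe.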

But there's a very weird difference between the copy of the email that was submitted in evidence and the copy that was located afterwards! The former included a header inserted by gmail with a 2019 timestamp, while the latter had a 2024 timestamp. Is there a way to determine which of these could be the truth? It turns out there is! The format of that header changed in 2022, and the version in the email is the new version. The version with the 2019 timestamp is anachronistic - the format simply doesn't match the header that gmail would have introduced in 2019, suggesting that an email sent in 2022 or later was modified to include a timestamp of 2019.

This is by no means the only indication that Craig Wright's evidence may be misleading (there's the whole argument that the Bitcoin white paper was written in LaTeX when general consensus is that it's written in OpenOffice, given that's what the metadata claims), but it's a lovely example of a more general issue.

Our technology chains are complicated. So many moving parts end up influencing the content of the data we generate, and those parts develop over time. It's fantastically difficult to generate an artifact now that precisely corresponds to how it would look in the past, even if we go to the effort of installing an old OS on an old PC and setting the clock appropriately (are you sure you're going to be able to mimic an entirely period appropriate patch level?). Even the version of the font you use in a document may indicate it's anachronistic. I'm pretty good at computers and I no longer have any belief I could fake an old document.
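Document metadata is a good example of how much of that chain leaks into the artifact. As a small sketch using the third-party pypdf library (with a placeholder filename), here's how little effort it takes to see which tool claims to have produced a PDF and when - the sort of record behind the LaTeX-versus-OpenOffice argument above:

    from pypdf import PdfReader  # third-party: pip install pypdf

    # Read the document information dictionary; the Producer field and
    # creation date record which software wrote the file, and when.
    reader = PdfReader("paper.pdf")
    info = reader.metadata
    print("Producer:     ", info.producer if info else None)
    print("Creation date:", info.creation_date if info else None)

Metadata like this can be edited too, of course - the point is that it's only one of many layers that all have to be forged consistently, down to embedded font versions.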

(References: this Dropbox, under "Expert reports", "Patrick Madden". Initial MYOB data is in "Appendix PM7", further analysis is in "Appendix PM42", email analysis is "Sixth Expert Report of Mr Patrick Madden")


Daniel Vetter: Upstream, Why & How

1 month 1 week ago

In a different epoch, before the pandemic, I gave a presentation about upstream first at the Siemens Linux Community Event 2018, where I tried to explain the fundamentals of open source using microeconomics. Unfortunately that talk didn’t work out too well with an audience that wasn’t well-versed in upstream and open source concepts, largely because it was just too much material crammed into too little time.

Last year I got the opportunity to try again for an Intel-internal event series, and this time I split the material into two parts. I think that worked a lot better. For obvious reasons I cannot publish the recordings, but I can publish the slides.

The first part, “Upstream, Why?”, covers a few concepts from microeconomics 101 and then applies them to upstream open source. The key concept is that open source achieves an efficient software market in the microeconomic sense by driving margins and prices to zero. The only way to make money in such a market is to either have more-or-less unstable barriers to entry that prevent the efficient market from forming and destroying all monetary value, or to sell a complementary product.

The second part, “Upstream, How?”, then looks at what this all means for the different stakeholders involved:

  • Individual engineers, who have skills and create a product with zero economic value, and might still be stupid enough to try to build a career on that.

  • Upstream communities, often with a formal structure as a foundation, and what exactly their goals should be to build a thriving upstream open source project that can actually pay some bills, generate some revenue somewhere else and get engineers paid. Because without that you’re not going to have much of a project with a long term future.

  • Engineering organizations, what exactly their incentives and goals should be, and the fundamental conflicts of interest this causes. Specifically on this I’ve only seen bad solutions and ugly solutions, but not yet a really good one. A relevant pre-pandemic talk of mine on this topic is also “Upstream Graphics: Too Little, Too Late”.

  • And finally, the overall business and, more importantly, what kind of business strategy is needed to really thrive with an open source upstream first approach: You need to clearly understand which software market’s economic value you want to destroy by driving margins and prices to zero, and which complementary product you’re selling to still earn money.

At least judging by the feedback I’ve received internally, taking more time and going a bit more in-depth on the various concepts worked much better than the keynote presentation I did at Siemens, hence I decided to publish at least the slides.

[$] Questions about machine-learning models for Fedora

1 month 1 week ago

Kaitlyn Abdo of Fedora's AI/ML SIG opened an issue with the Fedora Engineering Steering Committee (FESCo) recently that carried a few tricky questions about packaging machine-learning (ML) models for Fedora. Specifically, the SIG is looking for guidance on whether pre-trained weights for PyTorch constitute code or content. And, if the models are released under a license approved by the Open Source Initiative (OSI), does it matter what data the models were trained on? The issue was quickly tossed over to Fedora's legal mailing list and sparked an interesting discussion about how to handle these items, and a temporary path forward.

jzb

Security updates for Wednesday

1 month 1 week ago
Security updates have been issued by Fedora (edk2, freeipa, kernel, and liblas), Oracle (kernel), Red Hat (docker, edk2, kernel, kernel-rt, and kpatch-patch), SUSE (axis, fontforge, gnutls, java-1_8_0-openjdk, kernel, python3, sudo, and zabbix), and Ubuntu (dotnet7, dotnet8, libgoogle-gson-java, openssl, and ovn).
jzb

[$] A new filesystem for pidfds

1 month 1 week ago
The pidfd abstraction is a Linux-specific way of referring to processes that avoids the race conditions inherent in Unix process ID numbers. Since a pidfd is a file descriptor, it needs a filesystem to implement the usual operations performed on files. As the use of pidfds has grown, they have stressed the limits of the simple filesystem that was created for them. Christian Brauner has created a new filesystem for pidfds that seems likely to debut in the 6.9 kernel, but it ran into a little bump along the way, demonstrating that things you cannot see can still hurt you.
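For readers who haven't used the interface, here is a brief sketch of what a pidfd looks like from user space; Python exposes the relevant system calls as os.pidfd_open() and signal.pidfd_send_signal() on Linux with Python 3.9 or later:

    import os
    import signal
    import subprocess

    # Start a child and obtain a pidfd for it. The descriptor refers to
    # this specific process, so it cannot be confused with a recycled PID.
    child = subprocess.Popen(["sleep", "30"])
    pidfd = os.pidfd_open(child.pid)

    # Signal the process via the pidfd rather than its numeric PID.
    signal.pidfd_send_signal(pidfd, signal.SIGTERM)

    child.wait()
    os.close(pidfd)

Because a pidfd is an ordinary file descriptor, each one needs an inode behind it - which is exactly the job of the filesystem the article describes.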
corbet

Today's hardware vulnerability: register file data sampling

1 month 1 week ago
The mainline kernel has just received a set of commits addressing the "register file data sampling" hardware vulnerability.

RFDS may allow a malicious actor to infer data values previously used in floating point registers, vector registers, or integer registers. RFDS does not provide the ability to choose which data is inferred

Only Atom cores are affected, but those cores can be found inside a number of processors. See this documentation commit for more information.
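To see whether a given kernel knows about the issue, the mitigation work also adds a new entry under sysfs. A quick sketch, assuming the entry is named reg_file_data_sampling as in the merged patches (it only exists on kernels carrying these commits):

    from pathlib import Path

    # The kernel reports its view of each hardware vulnerability under
    # /sys/devices/system/cpu/vulnerabilities/.
    status = Path("/sys/devices/system/cpu/vulnerabilities/reg_file_data_sampling")
    if status.exists():
        print(status.read_text().strip())
    else:
        print("this kernel does not report RFDS status")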

corbet

Herb Sutter on increasing safety in C++

1 month 1 week ago

Herb Sutter, chair of the ISO C++ standards committee, writes about the current problems with writing secure C++, and his personal opinion on next steps to address this while maintaining backward compatibility.

If there were 90-98% fewer C++ type/bounds/initialization/lifetime vulnerabilities we wouldn't be having this discussion. All languages have CVEs, C++ just has more (and C still more); so far in 2024, Rust has 6 CVEs, and C and C++ combined have 61 CVEs. So zero isn't the goal; something like a 90% reduction is necessary, and a 98% reduction is sufficient, to achieve security parity with the levels of language safety provided by MSLs [memory-safe languages]… and has the strong benefit that I believe it can be achieved with perfect backward link compatibility (i.e., without changing C++'s object model, and its lifetime model which does not depend on universal tracing garbage collection and is not limited to tree-based data structures) which is essential to our being able to adopt the improvements in existing C++ projects as easily as we can adopt other new editions of C++. — After that, we can pursue additional improvements to other buckets, such as thread safety and overflow safety.
daroc

[$] Insecurity and Python pickles

1 month 1 week ago

Serialization is the process of transforming Python objects into a sequence of bytes which can be used to recreate a copy of the object later — or on another machine. pickle is Python's native serialization module. It can store complex Python objects, making it an appealing prospect for moving data without having to write custom serialization code. For example, pickle is an integral component of several file formats used for machine learning. However, using pickle to deserialize untrusted files is a major security risk, because doing so can invoke arbitrary Python functions. Consequently, the machine-learning community is working to address the security issues caused by widespread use of pickle.
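The standard demonstration of the problem is short. A sketch of the well-known __reduce__ trick (here just running a harmless echo) shows why loading a pickle amounts to running the sender's code:

    import pickle

    # pickle serializes this object as "call os.system with this string",
    # so merely loading the result executes the command.
    class Malicious:
        def __reduce__(self):
            import os
            return (os.system, ("echo arbitrary code ran during unpickling",))

    payload = pickle.dumps(Malicious())

    # The receiving side does not need the Malicious class at all;
    # unpickling the bytes is enough to run the command above.
    pickle.loads(payload)

Any file format built on pickle inherits this behavior, which is the risk the article describes.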

daroc

Security updates for Tuesday

1 month 1 week ago
Security updates have been issued by Debian (qemu), Mageia (libtiff and thunderbird), Red Hat (kernel, kpatch-patch, postgresql, and rhc-worker-script), SUSE (compat-openssl098, openssl, openssl1, python-Django, python-Django1, and wpa_supplicant), and Ubuntu (accountsservice, libxml2, linux-bluefield, linux-raspi-5.4, linux-xilinx-zynqmp, linux-oem-6.1, openvswitch, postgresql-9.5, and ruby-rack).
corbet