It was ethical to exploit the Lightning bug – Bitcoin Magazine
This is an opinion piece by Shinobi, a self-taught Bitcoin educator and tech-savvy Bitcoin podcast host.
For the second time in about a month, btcd/LND have had a bug exploited that caused them to diverge in consensus from Bitcoin Core. Once again Burak was the developer who triggered the vulnerability (this time clearly on purpose), and once again it was a problem with code that parses Bitcoin transactions above the consensus layer. As I discussed in my piece on the previous bug that Burak triggered, before Taproot there were limits on how large the script and witness data in a transaction could be. With the activation of Taproot, these limits were removed, leaving only the block size limit itself to constrain these parts of individual transactions. The problem with the previous bug was that although the consensus code in btcd was properly upgraded to reflect this change, the code that handles peer-to-peer transmission, including parsing data before sending or when receiving, was not. So the code processing blocks and transactions before they were handed off for consensus validation rejected the data and never passed it to the consensus validation logic, and the block in question was never validated.
A very similar thing happened this time. Another limit in the peer-to-peer part of the codebase was incorrectly restricting witness data, capping it at 1/8 of the block size rather than the full block size. Burak crafted a transaction with witness data just a single weight unit over this stricter limit and once again stalled btcd and LND nodes at that block height. This was a non-standard transaction, meaning that while it is perfectly valid according to consensus rules, it is not valid according to default mempool policy, so nodes will not relay it across the network. It is entirely possible to get it mined in a block, but the only way to do so is to hand it directly to a miner, which Burak did with the help of F2Pool.
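To make the failure mode concrete, here is a minimal Python sketch of a parsing layer whose cap is stricter than consensus. It is not btcd’s actual code: the function names and the "stalled" result are illustrative assumptions, and the only real-world figures are the 4,000,000 weight unit consensus cap on blocks and the 1/8 ratio described above. It also simplifies by treating witness data as if it could fill an entire block.

```python
MAX_BLOCK_WEIGHT = 4_000_000                 # consensus cap on a block, in weight units
WIRE_WITNESS_LIMIT = MAX_BLOCK_WEIGHT // 8   # hypothetical stricter cap in the parsing layer


class ParseError(Exception):
    """Raised by the (hypothetical) wire layer before consensus code ever runs."""


def wire_parse(witness_weight: int) -> int:
    # The parsing layer applies its own limit while deserializing.
    if witness_weight > WIRE_WITNESS_LIMIT:
        raise ParseError("witness data exceeds wire-layer limit")
    return witness_weight


def consensus_valid(witness_weight: int) -> bool:
    # Simplified consensus rule: only the block weight cap applies.
    return witness_weight <= MAX_BLOCK_WEIGHT


def process_block(witness_weight: int) -> str:
    try:
        parsed = wire_parse(witness_weight)
    except ParseError:
        # The block never reaches consensus validation, so the node
        # cannot move past this height.
        return "stalled"
    return "accepted" if consensus_valid(parsed) else "rejected"
```

In this model a transaction exactly at the wire limit is accepted, while one a single weight unit over it stalls the node even though `consensus_valid` would have returned `True` for it.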
This really drives home the point that any code whose purpose is to parse and validate Bitcoin data needs to be thoroughly audited to ensure it matches what Bitcoin Core will do. It doesn’t matter whether that code is the consensus engine of a node implementation or just a piece of code that passes transactions around for a Lightning node. This second bug was literally right above the one from last month in the codebase. Not only that, it wasn’t even found by anyone at Lightning Labs. AJ Towns reported it on October 11, two days after the original bug was triggered by Burak’s 998-of-999 multisig transaction. The report was publicly visible on GitHub for 10 hours before it was deleted. A fix was then written but not released, with the intention of quietly patching the issue in the next release of LND.
Now, this is pretty standard procedure for a serious vulnerability, especially in a project like Bitcoin Core, where such a vulnerability could cause serious damage to the base layer network and protocol. But in this specific case it posed a serious risk to LND users’ funds, and given that it sat literally right next to the previous bug, which carried the same risk, the chances that it would be found and exploited were very high, as Burak demonstrated. This raises the question of whether the silent-patch approach is the right one for vulnerabilities like this, which can leave users open to theft of funds (because a stalled node cannot detect a counterparty broadcasting an old channel state and penalize them for it).
As I touched on in my piece on the last bug, if a malicious actor had found these flaws before a well-intentioned developer did, they could have tactically opened new channels to vulnerable nodes, routed the entire contents of those channels back to themselves, and then exploited the flaw. At that point they would have those funds under their control and could also close the channel with the original state, literally doubling their money. Ironically, by actively exploiting this bug, Burak actually protected LND users from such an attack.
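The arithmetic of that attack can be sketched in a few lines of Python. This is a toy model, not Lightning’s actual channel logic: it ignores fees and channel reserves, and the function and parameter names are assumptions made up for illustration.

```python
def attacker_final_balance(capacity_sats: int, victim_can_see_chain: bool) -> int:
    """Toy model of the attack described above (ignores fees and reserves).

    The attacker funds a channel with `capacity_sats`, routes all of it
    through the victim back to another node they control, then broadcasts
    the original channel state in which the full capacity was still theirs.
    """
    # Funds the attacker already received on the far side of the route.
    routed_back = capacity_sats

    if victim_can_see_chain:
        # A healthy node spots the revoked state and claims the whole
        # channel with a penalty transaction, so the attacker recovers nothing.
        recovered_from_old_state = 0
    else:
        # A stalled node never sees the broadcast, the timelock expires,
        # and the attacker sweeps the channel a second time.
        recovered_from_old_state = capacity_sats

    return routed_back + recovered_from_old_state
```

Starting with 1,000,000 sats, the attacker in this model ends with 1,000,000 against a node that is following the chain and 2,000,000 against one that is stalled.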
Once the bug was exploited, users were open to such attacks from existing peers with whom they already had open channels, but they could no longer be targeted specifically with new channels. Their nodes were stalled and would never recognize or process payments through channels that someone tried to open after the block that stalled them. So while it didn’t completely remove the risk of users being exploited, it did limit that risk to people they already had a channel with. Burak’s action mitigated it. Personally, I think this kind of response to the bug made sense: it limited the damage, made users aware of the risk and led to it being quickly patched.
LND was not the only thing affected, either. Liquid’s pegging process was also broken, requiring updates to the federation’s functionaries to fix it. Older versions of Rust Bitcoin were also affected, causing the stall to hit some block explorers and electrs instances (an implementation of the backend server for Electrum Wallet). Now, with the exception of Liquid’s peg, where a stall eventually exposes funds to the emergency recovery keys that Blockstream holds once a timelock expires (and realistically, in the movie plot where Blockstream stole those funds, everyone knows exactly who to go after), these other problems never put anyone’s funds in danger at any point. Moreover, Rust Bitcoin had actually fixed this specific bug in recent versions, which apparently did not lead to any communication with maintainers of other codebases to highlight the potential for such problems. It was only the active exploitation of the bug live on the network that revealed that the problem existed in multiple codebases.
This brings up some big issues when it comes to vulnerabilities like this in Layer 2 software on Bitcoin. First, how seriously are these codebases audited for security flaws, and how is that prioritized against integrating new features? I find it very telling that security is not always the priority, given that this second bug was not even found by the maintainers of the codebase where it was present, even though it was literally right next to the first bug discovered last month. After a major bug that put users’ funds at risk, no internal audit of that codebase was done? It took someone outside the project to discover it? That does not demonstrate a priority on securing users’ funds over building new features to attract more users.
Second, the fact that this issue was already fixed in Rust Bitcoin shows a lack of communication between maintainers of different codebases about bugs like this. That is quite understandable: working on completely different codebases, someone who finds a bug in one does not immediately think, “I should contact other teams writing similar software in completely different programming languages to warn them about the potential for such a bug.” You don’t find a bug in Windows and immediately think to report it to the Linux kernel maintainers. However, Bitcoin, as a protocol for distributed consensus over a global network, is a completely different beast; maybe Bitcoin developers should start thinking along those lines when it comes to vulnerabilities in Bitcoin software, especially when it comes to parsing and interpreting consensus-related data.
Finally, perhaps in the case of protocols like Lightning, which rely on observing the blockchain at all times to be able to react to old channel states to maintain security, independent parsing and verification of data should be kept to an absolute minimum – if not removed entirely and delegated to Bitcoin Core or data directly derived from it. Core Lightning is built this way, connecting to an instance of Bitcoin Core and completely relying on it for validating blocks and transactions. If LND worked the same way, none of these failures in btcd would have affected LND users in a way that put their money at risk.
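As a rough illustration of that design, the sketch below watches a channel’s funding output using only data that Bitcoin Core has already parsed and validated. It assumes Bitcoin Core’s `getbestblockhash` and `getblock` RPCs (the latter with verbosity 2, which returns decoded transactions). The `rpc` callable, the `find_spend` helper and the fake data are hypothetical stand-ins, not code from Core Lightning or LND.

```python
from typing import Any, Callable, Optional

# Hypothetical JSON-RPC caller with the shape rpc(method, *params).
Rpc = Callable[..., Any]


def find_spend(rpc: Rpc, funding_txid: str, funding_vout: int) -> Optional[str]:
    """Return the txid in the tip block that spends the funding output, if any."""
    tip_hash = rpc("getbestblockhash")
    block = rpc("getblock", tip_hash, 2)  # verbosity 2: decoded transactions
    for tx in block["tx"]:
        for txin in tx["vin"]:
            # Coinbase inputs carry no "txid" key, so .get() skips them.
            if txin.get("txid") == funding_txid and txin.get("vout") == funding_vout:
                return tx["txid"]
    return None


def fake_rpc(method: str, *params: Any) -> Any:
    """Stand-in for a real Bitcoin Core connection, with made-up data."""
    if method == "getbestblockhash":
        return "tip"
    if method == "getblock":
        return {"tx": [
            {"txid": "cb", "vin": [{"coinbase": "00"}]},
            {"txid": "close", "vin": [{"txid": "fund", "vout": 0}]},
        ]}
    raise ValueError(f"unexpected method: {method}")
```

With the fake data, `find_spend(fake_rpc, "fund", 0)` returns `"close"`. In a design like this the Lightning node never decodes raw blocks itself, so there is no second parser that can disagree with Bitcoin Core.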
Regardless of how things are handled—either outsourcing validation entirely or simply minimizing internal validation and approaching it with much more care—this incident shows that something needs to change when approaching the question of how Layer 2 software handles interaction with consensus-related data. Once again, everyone is very lucky that this wasn’t exploited by a malicious actor, but instead by a developer proving a point. That said, Bitcoin cannot rely on luck or hope that malicious actors do not exist.
Developers and users should focus on improving processes to prevent incidents like this from happening again, rather than tossing blame around like a hot potato.
This is a guest post by Shinobi. Opinions expressed are entirely their own and do not necessarily reflect the opinions of BTC Inc or Bitcoin Magazine.