-
Notifications
You must be signed in to change notification settings - Fork 38.9k
coins: don't mutate main cache when connecting block #34165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34165. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation talks about block connection but the implementation was added to block disconnect. I don't yet know if we should apply it there or not, but I find it very likely that we should do it for block connection.
A lot of tests were added for testing the new cache in isolation and they all passed while it was applied to the wrong place as far as I can tell.
We should add a new integration test that actually checks that invalid blocks don't pollute the underlying cache:
diff --git a/src/test/validation_chainstate_tests.cpp b/src/test/validation_chainstate_tests.cpp
--- a/src/test/validation_chainstate_tests.cpp (revision 20f0b8304d089d641060f693d3284e69524fbcbe)
+++ b/src/test/validation_chainstate_tests.cpp (date 1767360120004)
@@ -3,10 +3,12 @@
// file COPYING or http://www.opensource.org/licenses/mit-license.php.
//
#include <chainparams.h>
+#include <consensus/amount.h>
#include <consensus/validation.h>
#include <node/kernel_notifications.h>
#include <random.h>
#include <rpc/blockchain.h>
+#include <script/script.h>
#include <sync.h>
#include <test/util/chainstate.h>
#include <test/util/coins.h>
@@ -63,6 +65,30 @@
}
}
+BOOST_FIXTURE_TEST_CASE(connect_tip_does_not_cache_inputs_on_failed_connect, TestChain100Setup)
+{
+ Chainstate& chainstate = Assert(m_node.chainman)->ActiveChainstate();
+
+ COutPoint outpoint;
+ {
+ LOCK(cs_main);
+ outpoint = AddTestCoin(m_rng, chainstate.CoinsTip());
+ chainstate.CoinsTip().Flush(/*will_reuse_cache=*/false);
+ }
+
+ CMutableTransaction tx;
+ tx.vin.emplace_back(outpoint);
+ tx.vout.emplace_back(MAX_MONEY, CScript{} << OP_TRUE);
+
+ const auto tip{WITH_LOCK(cs_main, return chainstate.m_chain.Tip()->GetBlockHash())};
+ CBlock block = CreateBlock({tx}, CScript{} << OP_TRUE, chainstate);
+ Assert(m_node.chainman)->ProcessNewBlock(std::make_shared<CBlock>(block), true, true, nullptr);
+
+ LOCK(cs_main);
+ BOOST_CHECK_EQUAL(tip, chainstate.m_chain.Tip()->GetBlockHash()); // block rejected
+ BOOST_CHECK(!chainstate.CoinsTip().HaveCoinInCache(outpoint)); // input not cached
+}
+
//! Test UpdateTip behavior for both active and background chainstates.
//!
//! When run on the background chainstate, UpdateTip should do a subset
diff --git a/src/validation.cpp b/src/validation.cpp
--- a/src/validation.cpp (revision 20f0b8304d089d641060f693d3284e69524fbcbe)
+++ b/src/validation.cpp (date 1767360154724)
@@ -3102,7 +3102,7 @@
LogDebug(BCLog::BENCH, " - Load block from disk: %.2fms\n",
Ticks<MillisecondsDouble>(time_2 - time_1));
{
- CCoinsViewCache view(&CoinsTip());
+ CoinsViewCacheNonMutating view(&CoinsTip());
bool rv = ConnectBlock(*block_to_connect, state, pindexNew, view);
if (m_chainman.m_options.signals) {
m_chainman.m_options.signals->BlockChecked(block_to_connect, state);Concept ACK (edited to remove the nack).
9739ce6 to
7cd938b
Compare
|
Thank you @l0rinc, I have taken your integration test. I had unit tests for this behavior but the integration test is a great idea. |
The summary comment parser treated "Concept ACK, but approach NACK." as a `Concept NACK` because it only matched "Approach NACK" with the exact capitalization. Make the "Approach NACK" pattern case-insensitive and add a regression test. Ref: bitcoin/bitcoin#34165 (review)
410a014 to
7a6e616
Compare
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
7a6e616 to
9a37b19
Compare
l0rinc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll run a reindex-chainstate on these to see how they behave - as long as they don't slow down anything, I would be fine with them.
9a37b19 to
7fab35c
Compare
The summary comment parser treated "Concept ACK, but approach NACK." as a `Concept NACK` because it only matched "Approach NACK" with the exact capitalization. Make the "Approach/Concept N?ACK" pattern capitalization-insensitive and add regression tests. Ref: bitcoin/bitcoin#34165 (review)
src/validation.cpp
Outdated
| Ticks<MillisecondsDouble>(time_2 - time_1)); | ||
| { | ||
| CCoinsViewCache view(&CoinsTip()); | ||
| CoinsViewCacheNonMutating view(&CoinsTip()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here or in a follow-up we could add this to TestBlockValidity as well, where it would be even more relevant since it’s invoked with without PoW checks (so your PR arguments apply here even more)
7fab35c to
2e67c5b
Compare
|
Ran a ACK 2e67c5b |
willcl-ark
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice carve-out from #31132.
Left a comment about whether we want to add a CoinsViewErrorCatcher wrapper for PeekCoin, but otherwise looks good to me.
2e67c5b to
3a2cfbe
Compare
|
Thank you for your review @willcl-ark. Addressed #34165 (comment) and #34165 (comment), and added
|
sipa
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 70e93a1, with a small suggestion
sedited
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 70e93a1
willcl-ark
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 70e93a1. Behavior change seems very good and I did not see any problems with the implementation. But it does seem like it could be simplified. Like I don't understand why this wouldn't just add a bool use_peek option to CCoinsViewCache instead of introducing a new CoinsViewOverlay class and GetCoinFromBase override. Also I don't understand why CoinsViewOverlay seems to be tested in some different and more complicated ways than CCoinsViewCache. (Details in comments below.)
I could be missing some context, and this PR does have a lot of acks so may not be worth changing. But I suspect various simplifications and cleanups could be possible here, and maybe done in followups.
| TRACEPOINT_SEMAPHORE(utxocache, uncache); | ||
|
|
||
| std::optional<Coin> CCoinsView::GetCoin(const COutPoint& outpoint) const { return std::nullopt; } | ||
| std::optional<Coin> CCoinsView::PeekCoin(const COutPoint& outpoint) const { return GetCoin(outpoint); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "coins: add PeekCoin()" (c40f6d7)
Two issues I think would be good to address in this commit:
PeekCoinis documented as not modifying the cache, but the default implementation callsGetCoincan modify it.GetCoinandHaveCoinshould probably have documentation saying now that they may modify the cache, to contrast withPeekCoin
The problem with default PeekCoin implementation calling GetCoin could be minimally addressed by just not providing any PeekCoin implementation, which would be in keeping with #34124. IMO, a nicer solution would be to make PeekCoin, GetCoin, and HaveCoin all non-virtual methods, with simpler and more clearly distinguished hooks for subclasses to implement:
virtual bool LookupCoin(const COutPoint& outpoint, Coin* coin = nullptr) const = 0;
virtual bool MutableLookupCoin(const COutPoint& outpoint, Coin* coin = nullptr) = 0;642fa06 is a full implementation to show what this would look like.
More ideally all of this should just be enforced with const. Only methods that don't mutate the cache should be const and others should be non-const. However, this would be a bigger change since const is currently being used to guard against BatchWrite being called. To fix that it would make sense to drop BatchWrite from the coins "view" interface and add it to a separate "store" interface.
So this is kind of a rabbithole. But I think it's important to at least document these methods better and to prevent the default PeekCoin implmentation from being able to modify the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this PR does have a lot of acks so may not be worth changing
I don't mind re-reviewing, your suggestions are always very useful.
642fa06 is a full implementation to show what this would look like.
This kinda' undoes #30849 and has a lot of renames, but I like the new comments.
More ideally all of this should just be enforced with const.
However, this would be a bigger change
👍 - we should do that in a follow-up!
would make sense to drop BatchWrite from the coins "view" interface
Yes, we will have to segregate that interface, it has multiple issues, but first we need to gradually clean up usage to fully understand the narrowest interfaces, see #34280
at least document these methods better and to prevent the default PeekCoin implementation
Agree, I also won't mind if we do that in follow-up since we will be touching all of these in other cleanup PRs as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a simpler solution for now is to just make the default PeekCoin return std::nullopt like GetCoin, and override PeekCoin in CCoinsViewDB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made the default PeekCoin return std::nullopt, so subclasses have to explicitly opt in. This should pair well with #34124 as well.
Also took some of your comments from 642fa06 and added you as a co-author.
I think the further improvements re: LookupCoin and constness are interesting but could be better explored in a more focused follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reverted this change. I think it's safe to forward to GetCoin here at the base. This would only be an issue if there was a stack like CCoinsViewCache -> CCoinsView -> CCoinsViewCache. In production code, there is only really CoinsViewOverlay -> CCoinsViewCache -> CCoinsViewDB. It doesn't matter for the db to have PeekCoin because it doesn't cache anything. I think this is ok to merge as is, and further improvements will follow in #34124.
| Reset(); | ||
| for (auto it{cursor.Begin()}; it != cursor.End(); it = cursor.NextAndMaybeErase(*it)) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "fuzz: add target for CoinsViewOverlay" (babb80d)
What's the implication of clearing the cache, undoing this write and all previous writes here? Does it make the test coverage less meaningful when this happens?
The fact that reseting this cache is ok me question what the point is of calling TestCoinsView with 3 layers of views (CCoinsView + MutationGuardCoinsViewCache + CoinsViewOverlay) when the existing fuzz tests are fine with just two layers (CCoinsView + CCoinsViewCache) and (CCoinsViewDB + CCoinsViewCache).
Why not follow the previous pattern and test (CCoinsView + CoinsViewOverlay) and (CCoinsViewDB + CoinsViewOverlay), just overriding the CoinsViewOverlay::BatchWrite method to do the
assert(ComputeCacheCoinsSnapshot() == m_expected_snapshot) check? This would seem more consistent and avoid the need for extra complexity and reduced reliability of the test this Reset() call seems to add.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the implication of clearing the cache, undoing this write and all previous writes here? Does it make the test coverage less meaningful when this happens?
If this exception is thrown it puts the two caches in an invalid state. In production code we do not catch this error and crash. In other parts of this test we set good_data = false; when we want the fuzzer to not go further, but here we would already be in an invalid state so we need to reset to continue.
what the point is of calling TestCoinsView with 3 layers of views
CoinsViewOverlay is designed to not mutate the parent CCoinsViewCache. So, we need a CoinsViewOverlay backed via a CCoinsViewCache in this case. However, a CCoinsViewCache cannot exist by itself without a backing view, so we need to add a third CCoinsView to it.
|
Those seem like very useful review comments, @ryanofsky. I'm happy to re-review if any of them were to be addressed in this PR. |
70e93a1 to
6ca865f
Compare
|
Thank you for your review @ryanofsky! I have taken some of your suggestions and left clarifying comments for others.
|
src/txdb.cpp
Outdated
|
|
||
| std::optional<Coin> CCoinsViewDB::PeekCoin(const COutPoint& outpoint) const | ||
| { | ||
| return GetCoin(outpoint); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't it be vice-versa for safety, i.e. GetCoin delegating to PeekCoin so that we narrow the contract instead of expanding it? Same for coinscache_sim.cpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely right! We can also solve the issue pointed out in #34165 (comment) by making CCoinsView::GetCoin delegate to PeekCoin instead of vice-versa. This makes the base case non-mutating and avoids a footgun of having to define both methods. Otherwise one path would return a Coin the other nullopt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is only safe if PeekCoin is a pure virtual method so subclasses can't miss overriding it.
6ca865f to
1dea8f5
Compare
1dea8f5 to
890215c
Compare
890215c to
5b524f3
Compare
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review 5b524f3. Thanks for the updates and explanations. Since last review, GetCoin was changed to delegate to PeekCoin instead of the reverse, which resolves my original concern about PeekCoin implementations unintentionally modifying the cache, but raises other concerns (see below). Other than that, some comments were just moved and added, some code moved between commits, and GetCoinFromBase was renamed (thanks!).
If my concerns about GetCoin calling PeekCoin aren't real or aren't important, would be happy to re-ack. Otherwise it might be better to go back to the previous approach, or just make these methods pure-virtual and accept a little boilerplate in the subclasses.
At a very high level, I will say the approach of making sure connect block code does not modify the cache by passing a different coinsview at runtime does not seem ideal to me. More ideal would be for the coins view to have a const method for looking up without modifying the cache, and a non-const method for looking up with caching, and the connect block code would be changed directly to use non-caching lookups, and the compiler would enforce this because it is passed a const view. Still, this runtime implementation seems like an improvement over status quo, and I'm probably not seeing the bigger picture here since these changes are used in follow up PRs
Introduce a helper to look up a Coin through a stack of CCoinsViewCache layers without populating parent caches. This is useful for ephemeral views (e.g. during ConnectBlock) that want to avoid polluting CoinsTip() when validating invalid blocks. Co-authored-by: l0rinc <[email protected]> Co-authored-by: Pieter Wuille <[email protected]> Co-authored-by: Ryan Ofsky <[email protected]>
Introduce `CoinsViewOverlay`, a `CCoinsViewCache` subclass that reads coins without mutating the underlying cache via `FetchCoin()`. Use `PeekCoin()` to look up a Coin through a stack of `CCoinsViewCache` layers without populating parent caches. This prevents the main cache from caching inputs pulled from disk for a block that has not yet been fully validated. Once `Flush()` is called on the view, these inputs will be added as spent to `coinsCache` in the main cache via `BatchWrite()`. This is the foundation for async input fetching, where worker threads must not mutate shared state. Co-authored-by: l0rinc <[email protected]>
Use `CoinsViewOverlay` when connecting blocks in `ConnectTip`. Add a new integration test to verify that using CoinsViewOverlay does not mutate the main cache during validation for an invalid block. Co-authored-by: l0rinc <[email protected]>
Refactor TestCoinsView() to accept the cache as a parameter instead of creating it internally. This prepares for adding a CoinsViewOverlay fuzz target that needs to pass in a different cache type. This is a non-functional change. Co-authored-by: l0rinc <[email protected]>
Refactor TestCoinsView() to move code that directly modifies backend_coins_view to the end of the function. This prepares for a CoinsViewOverlay fuzz target that asserts the backend_coins_view is not mutated by any methods before BatchWrite is called. Co-authored-by: l0rinc <[email protected]>
Co-authored-by: l0rinc <[email protected]>
5b524f3 to
cae6d89
Compare
|
Thanks again for your review @ryanofsky. I reverted the change to
|
This is a slightly modified version of the first few commits of #31132, which can be merged as an independent change. It has a small benefit on its own, but will help in moving the parent PR forward.
When accessing coins via the
CCoinsViewCache, methods likeGetCoincan callFetchCoinwhich actually mutatecacheCoinsinternally to cache entries when they are pulled from the backing db. This is generally a performance improvement for single threaded access patterns, but it precludes us from accessing entries in aCCoinsViewCachefrom multiple threads without a lock.Another aspect is that when we use the resettable
CCoinsViewCacheview backed by the main cache for use inConnectBlock(), we will insert entries into the main cache even if the block is determined to be invalid. This is not the biggest concern, since an invalid block requires proof-of-work. But, an attacker could craft multiple invalid blocks to fill the main cache. This would make usFlushthe cache more often than necessary. Obviously this would be very expensive to do on mainnet.Introduce
CoinsViewOverlay, aCCoinsViewCachesubclass that reads coins without mutating the underlying cache viaFetchCoin().Add
PeekCoin()to look up a Coin through a stack ofCCoinsViewCachelayers without populating parent caches. This prevents the main cache from caching inputs pulled from disk for a block that has not yet been fully validated. OnceFlush()is called on the view, these inputs will be added as spent tocoinsCachein the main cache viaBatchWrite().This is the foundation for async input fetching, where worker threads must not mutate shared state.