Skip to content

Conversation

@andrewtoth
Copy link
Contributor

@andrewtoth andrewtoth commented Dec 28, 2025

This is a slightly modified version of the first few commits of #31132, which can be merged as an independent change. It has a small benefit on its own, but will help in moving the parent PR forward.

When accessing coins via the CCoinsViewCache, methods like GetCoin can call FetchCoin which actually mutate cacheCoins internally to cache entries when they are pulled from the backing db. This is generally a performance improvement for single threaded access patterns, but it precludes us from accessing entries in a CCoinsViewCache from multiple threads without a lock.

Another aspect is that when we use the resettable CCoinsViewCache view backed by the main cache for use in ConnectBlock(), we will insert entries into the main cache even if the block is determined to be invalid. This is not the biggest concern, since an invalid block requires proof-of-work. But, an attacker could craft multiple invalid blocks to fill the main cache. This would make us Flush the cache more often than necessary. Obviously this would be very expensive to do on mainnet.

Introduce CoinsViewOverlay, a CCoinsViewCache subclass that reads coins without mutating the underlying cache via FetchCoin().

Add PeekCoin() to look up a Coin through a stack of CCoinsViewCache layers without populating parent caches. This prevents the main cache from caching inputs pulled from disk for a block that has not yet been fully validated. Once Flush() is called on the view, these inputs will be added as spent to coinsCache in the main cache via BatchWrite().

This is the foundation for async input fetching, where worker threads must not mutate shared state.

@DrahtBot
Copy link
Contributor

DrahtBot commented Dec 28, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34165.

Reviews

See the guideline for information on the review process.

Type Reviewers
Stale ACK l0rinc, sipa, sedited, willcl-ark, ryanofsky

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #34132 (coins: drop error catcher, centralize fatal read handling by l0rinc)
  • #34124 (refactor: make CCoinsView a pure virtual interface by l0rinc)
  • #33512 (coins: use number of dirty cache entries in flush warnings/logs by l0rinc)
  • #31132 (validation: fetch block inputs on parallel threads 3x faster IBD by andrewtoth)
  • #30342 (kernel, logging: Pass Logger instances to kernel objects by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Contributor

@l0rinc l0rinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation talks about block connection but the implementation was added to block disconnect. I don't yet know if we should apply it there or not, but I find it very likely that we should do it for block connection.
A lot of tests were added for testing the new cache in isolation and they all passed while it was applied to the wrong place as far as I can tell.

We should add a new integration test that actually checks that invalid blocks don't pollute the underlying cache:

diff --git a/src/test/validation_chainstate_tests.cpp b/src/test/validation_chainstate_tests.cpp
--- a/src/test/validation_chainstate_tests.cpp	(revision 20f0b8304d089d641060f693d3284e69524fbcbe)
+++ b/src/test/validation_chainstate_tests.cpp	(date 1767360120004)
@@ -3,10 +3,12 @@
 // file COPYING or http://www.opensource.org/licenses/mit-license.php.
 //
 #include <chainparams.h>
+#include <consensus/amount.h>
 #include <consensus/validation.h>
 #include <node/kernel_notifications.h>
 #include <random.h>
 #include <rpc/blockchain.h>
+#include <script/script.h>
 #include <sync.h>
 #include <test/util/chainstate.h>
 #include <test/util/coins.h>
@@ -63,6 +65,30 @@
     }
 }
 
+BOOST_FIXTURE_TEST_CASE(connect_tip_does_not_cache_inputs_on_failed_connect, TestChain100Setup)
+{
+    Chainstate& chainstate = Assert(m_node.chainman)->ActiveChainstate();
+
+    COutPoint outpoint;
+    {
+        LOCK(cs_main);
+        outpoint = AddTestCoin(m_rng, chainstate.CoinsTip());
+        chainstate.CoinsTip().Flush(/*will_reuse_cache=*/false);
+    }
+
+    CMutableTransaction tx;
+    tx.vin.emplace_back(outpoint);
+    tx.vout.emplace_back(MAX_MONEY, CScript{} << OP_TRUE);
+
+    const auto tip{WITH_LOCK(cs_main, return chainstate.m_chain.Tip()->GetBlockHash())};
+    CBlock block = CreateBlock({tx}, CScript{} << OP_TRUE, chainstate);
+    Assert(m_node.chainman)->ProcessNewBlock(std::make_shared<CBlock>(block), true, true, nullptr);
+
+    LOCK(cs_main);
+    BOOST_CHECK_EQUAL(tip, chainstate.m_chain.Tip()->GetBlockHash()); // block rejected
+    BOOST_CHECK(!chainstate.CoinsTip().HaveCoinInCache(outpoint));    // input not cached
+}
+
 //! Test UpdateTip behavior for both active and background chainstates.
 //!
 //! When run on the background chainstate, UpdateTip should do a subset
diff --git a/src/validation.cpp b/src/validation.cpp
--- a/src/validation.cpp	(revision 20f0b8304d089d641060f693d3284e69524fbcbe)
+++ b/src/validation.cpp	(date 1767360154724)
@@ -3102,7 +3102,7 @@
     LogDebug(BCLog::BENCH, "  - Load block from disk: %.2fms\n",
              Ticks<MillisecondsDouble>(time_2 - time_1));
     {
-        CCoinsViewCache view(&CoinsTip());
+        CoinsViewCacheNonMutating view(&CoinsTip());
         bool rv = ConnectBlock(*block_to_connect, state, pindexNew, view);
         if (m_chainman.m_options.signals) {
             m_chainman.m_options.signals->BlockChecked(block_to_connect, state);

Concept ACK (edited to remove the nack).

@andrewtoth
Copy link
Contributor Author

Thank you @l0rinc, I have taken your integration test. I had unit tests for this behavior but the integration test is a great idea.

l0rinc added a commit to l0rinc/DrahtBot that referenced this pull request Jan 2, 2026
The summary comment parser treated "Concept ACK, but approach NACK." as a `Concept NACK` because it only matched "Approach NACK" with the exact capitalization.

Make the "Approach NACK" pattern case-insensitive and add a regression test.

Ref: bitcoin/bitcoin#34165 (review)
@andrewtoth andrewtoth force-pushed the dont-mutate-cache branch 2 times, most recently from 410a014 to 7a6e616 Compare January 3, 2026 19:21
@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 3, 2026

🚧 At least one of the CI tasks failed.
Task macOS native: https://github.com/bitcoin/bitcoin/actions/runs/20681569759/job/59376499348
LLM reason (✨ experimental): Blockchain tests aborted due to a DisconnectTip failure during tip/disconnect handling (BLOCK_FAILED_VALID), causing the CI run to fail.

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

Copy link
Contributor

@l0rinc l0rinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll run a reindex-chainstate on these to see how they behave - as long as they don't slow down anything, I would be fine with them.

l0rinc added a commit to l0rinc/DrahtBot that referenced this pull request Jan 4, 2026
The summary comment parser treated "Concept ACK, but approach NACK." as a `Concept NACK` because it only matched "Approach NACK" with the exact capitalization.

Make the "Approach/Concept N?ACK" pattern capitalization-insensitive and add regression tests.

Ref: bitcoin/bitcoin#34165 (review)
Ticks<MillisecondsDouble>(time_2 - time_1));
{
CCoinsViewCache view(&CoinsTip());
CoinsViewCacheNonMutating view(&CoinsTip());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here or in a follow-up we could add this to TestBlockValidity as well, where it would be even more relevant since it’s invoked with without PoW checks (so your PR arguments apply here even more)

@l0rinc
Copy link
Contributor

l0rinc commented Jan 11, 2026

Ran a reindex-chainstate, no measurable change compared to master: 👍

ACK 2e67c5b

Copy link
Member

@willcl-ark willcl-ark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice carve-out from #31132.

Left a comment about whether we want to add a CoinsViewErrorCatcher wrapper for PeekCoin, but otherwise looks good to me.

@andrewtoth
Copy link
Contributor Author

Thank you for your review @willcl-ark. Addressed #34165 (comment) and #34165 (comment), and added noexcept to PeekCoin.

git diff 2e67c5b604d2cf2e56db6b5d2906391d22113941..3a2cfbea1957f1eed6d97b6f8924f13100e11b1c

Copy link
Member

@sipa sipa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 70e93a1, with a small suggestion

Copy link
Contributor

@sedited sedited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 70e93a1

Copy link
Member

@willcl-ark willcl-ark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tACK 70e93a1

Had this branch reindex 50k blocks and then left it to run overnight for a few newly-seen blocks without issue. The code changes generally look good to me, and the rationale (as a building block of #31132) makes sense.

I did not run any of the fuzz tests.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 70e93a1. Behavior change seems very good and I did not see any problems with the implementation. But it does seem like it could be simplified. Like I don't understand why this wouldn't just add a bool use_peek option to CCoinsViewCache instead of introducing a new CoinsViewOverlay class and GetCoinFromBase override. Also I don't understand why CoinsViewOverlay seems to be tested in some different and more complicated ways than CCoinsViewCache. (Details in comments below.)

I could be missing some context, and this PR does have a lot of acks so may not be worth changing. But I suspect various simplifications and cleanups could be possible here, and maybe done in followups.

TRACEPOINT_SEMAPHORE(utxocache, uncache);

std::optional<Coin> CCoinsView::GetCoin(const COutPoint& outpoint) const { return std::nullopt; }
std::optional<Coin> CCoinsView::PeekCoin(const COutPoint& outpoint) const { return GetCoin(outpoint); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "coins: add PeekCoin()" (c40f6d7)

Two issues I think would be good to address in this commit:

  • PeekCoin is documented as not modifying the cache, but the default implementation calls GetCoin can modify it.
  • GetCoin and HaveCoin should probably have documentation saying now that they may modify the cache, to contrast with PeekCoin

The problem with default PeekCoin implementation calling GetCoin could be minimally addressed by just not providing any PeekCoin implementation, which would be in keeping with #34124. IMO, a nicer solution would be to make PeekCoin, GetCoin, and HaveCoin all non-virtual methods, with simpler and more clearly distinguished hooks for subclasses to implement:

    virtual bool LookupCoin(const COutPoint& outpoint, Coin* coin = nullptr) const = 0;
    virtual bool MutableLookupCoin(const COutPoint& outpoint, Coin* coin = nullptr) = 0;

642fa06 is a full implementation to show what this would look like.

More ideally all of this should just be enforced with const. Only methods that don't mutate the cache should be const and others should be non-const. However, this would be a bigger change since const is currently being used to guard against BatchWrite being called. To fix that it would make sense to drop BatchWrite from the coins "view" interface and add it to a separate "store" interface.

So this is kind of a rabbithole. But I think it's important to at least document these methods better and to prevent the default PeekCoin implmentation from being able to modify the cache.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this PR does have a lot of acks so may not be worth changing

I don't mind re-reviewing, your suggestions are always very useful.

642fa06 is a full implementation to show what this would look like.

This kinda' undoes #30849 and has a lot of renames, but I like the new comments.

More ideally all of this should just be enforced with const.

However, this would be a bigger change

👍 - we should do that in a follow-up!

would make sense to drop BatchWrite from the coins "view" interface

Yes, we will have to segregate that interface, it has multiple issues, but first we need to gradually clean up usage to fully understand the narrowest interfaces, see #34280

at least document these methods better and to prevent the default PeekCoin implementation

Agree, I also won't mind if we do that in follow-up since we will be touching all of these in other cleanup PRs as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a simpler solution for now is to just make the default PeekCoin return std::nullopt like GetCoin, and override PeekCoin in CCoinsViewDB.

Copy link
Contributor Author

@andrewtoth andrewtoth Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the default PeekCoin return std::nullopt, so subclasses have to explicitly opt in. This should pair well with #34124 as well.

Also took some of your comments from 642fa06 and added you as a co-author.

I think the further improvements re: LookupCoin and constness are interesting but could be better explored in a more focused follow-up.

Copy link
Contributor Author

@andrewtoth andrewtoth Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted this change. I think it's safe to forward to GetCoin here at the base. This would only be an issue if there was a stack like CCoinsViewCache -> CCoinsView -> CCoinsViewCache. In production code, there is only really CoinsViewOverlay -> CCoinsViewCache -> CCoinsViewDB. It doesn't matter for the db to have PeekCoin because it doesn't cache anything. I think this is ok to merge as is, and further improvements will follow in #34124.

Comment on lines +81 to +82
Reset();
for (auto it{cursor.Begin()}; it != cursor.End(); it = cursor.NextAndMaybeErase(*it)) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "fuzz: add target for CoinsViewOverlay" (babb80d)

What's the implication of clearing the cache, undoing this write and all previous writes here? Does it make the test coverage less meaningful when this happens?

The fact that reseting this cache is ok me question what the point is of calling TestCoinsView with 3 layers of views (CCoinsView + MutationGuardCoinsViewCache + CoinsViewOverlay) when the existing fuzz tests are fine with just two layers (CCoinsView + CCoinsViewCache) and (CCoinsViewDB + CCoinsViewCache).

Why not follow the previous pattern and test (CCoinsView + CoinsViewOverlay) and (CCoinsViewDB + CoinsViewOverlay), just overriding the CoinsViewOverlay::BatchWrite method to do the
assert(ComputeCacheCoinsSnapshot() == m_expected_snapshot) check? This would seem more consistent and avoid the need for extra complexity and reduced reliability of the test this Reset() call seems to add.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the implication of clearing the cache, undoing this write and all previous writes here? Does it make the test coverage less meaningful when this happens?

If this exception is thrown it puts the two caches in an invalid state. In production code we do not catch this error and crash. In other parts of this test we set good_data = false; when we want the fuzzer to not go further, but here we would already be in an invalid state so we need to reset to continue.

what the point is of calling TestCoinsView with 3 layers of views

CoinsViewOverlay is designed to not mutate the parent CCoinsViewCache. So, we need a CoinsViewOverlay backed via a CCoinsViewCache in this case. However, a CCoinsViewCache cannot exist by itself without a backing view, so we need to add a third CCoinsView to it.

@sipa
Copy link
Member

sipa commented Feb 10, 2026

Those seem like very useful review comments, @ryanofsky. I'm happy to re-review if any of them were to be addressed in this PR.

@andrewtoth
Copy link
Contributor Author

Thank you for your review @ryanofsky! I have taken some of your suggestions and left clarifying comments for others.

git diff 70e93a10b7fad61483bc9044b483460a7faaa1b8..6ca865fb2bd82b1034c6946d27640999f8135c89

src/txdb.cpp Outdated

std::optional<Coin> CCoinsViewDB::PeekCoin(const COutPoint& outpoint) const
{
return GetCoin(outpoint);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't it be vice-versa for safety, i.e. GetCoin delegating to PeekCoin so that we narrow the contract instead of expanding it? Same for coinscache_sim.cpp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right! We can also solve the issue pointed out in #34165 (comment) by making CCoinsView::GetCoin delegate to PeekCoin instead of vice-versa. This makes the base case non-mutating and avoids a footgun of having to define both methods. Otherwise one path would return a Coin the other nullopt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is only safe if PeekCoin is a pure virtual method so subclasses can't miss overriding it.

@andrewtoth
Copy link
Contributor Author

Thanks @l0rinc and @maflcko, updated to use PeekCoin by default and delegate to it if GetCoin is not overridden.

git diff 6ca865fb2bd82b1034c6946d27640999f8135c89..5b524f311ed74a751040494735b9a47f5c16f0b2

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review 5b524f3. Thanks for the updates and explanations. Since last review, GetCoin was changed to delegate to PeekCoin instead of the reverse, which resolves my original concern about PeekCoin implementations unintentionally modifying the cache, but raises other concerns (see below). Other than that, some comments were just moved and added, some code moved between commits, and GetCoinFromBase was renamed (thanks!).

If my concerns about GetCoin calling PeekCoin aren't real or aren't important, would be happy to re-ack. Otherwise it might be better to go back to the previous approach, or just make these methods pure-virtual and accept a little boilerplate in the subclasses.

At a very high level, I will say the approach of making sure connect block code does not modify the cache by passing a different coinsview at runtime does not seem ideal to me. More ideal would be for the coins view to have a const method for looking up without modifying the cache, and a non-const method for looking up with caching, and the connect block code would be changed directly to use non-caching lookups, and the compiler would enforce this because it is passed a const view. Still, this runtime implementation seems like an improvement over status quo, and I'm probably not seeing the bigger picture here since these changes are used in follow up PRs

andrewtoth and others added 6 commits February 12, 2026 21:31
Introduce a helper to look up a Coin through a stack of CCoinsViewCache layers without populating parent caches.

This is useful for ephemeral views (e.g. during ConnectBlock) that want to avoid polluting CoinsTip() when validating invalid blocks.

Co-authored-by: l0rinc <[email protected]>
Co-authored-by: Pieter Wuille <[email protected]>
Co-authored-by: Ryan Ofsky <[email protected]>
Introduce `CoinsViewOverlay`, a `CCoinsViewCache` subclass that reads
coins without mutating the underlying cache via `FetchCoin()`.

Use `PeekCoin()` to look up a Coin through a stack of `CCoinsViewCache` layers without populating parent caches. This prevents the main cache from caching inputs pulled from disk for a block that has not yet been fully validated. Once `Flush()` is called on the view, these inputs will be added as spent to `coinsCache` in the main cache via `BatchWrite()`.

This is the foundation for async input fetching, where worker threads must not
mutate shared state.

Co-authored-by: l0rinc <[email protected]>
Use `CoinsViewOverlay` when connecting blocks in `ConnectTip`.

Add a new integration test to verify that using
CoinsViewOverlay does not mutate the main cache
during validation for an invalid block.

Co-authored-by: l0rinc <[email protected]>
Refactor TestCoinsView() to accept the cache as a parameter instead of
creating it internally. This prepares for adding a CoinsViewOverlay
fuzz target that needs to pass in a different cache type.

This is a non-functional change.

Co-authored-by: l0rinc <[email protected]>
Refactor TestCoinsView() to move code that directly modifies
backend_coins_view to the end of the function.
This prepares for a CoinsViewOverlay fuzz target that asserts
the backend_coins_view is not mutated by any methods before
BatchWrite is called.

Co-authored-by: l0rinc <[email protected]>
@andrewtoth
Copy link
Contributor Author

Thanks again for your review @ryanofsky. I reverted the change to PeekCoin. From the previously well ACK'd 70e93a1, the changes are:

  • rename GetCoinFromBase -> FetchCoinFromBase
  • remove the last commit, which touched an unrelated part of the fuzz coins_view target.
  • added coverage for PeekCoin in coinscache_sim.
  • Added and improved more comments throughout.

git diff 70e93a10b7fad61483bc9044b483460a7faaa1b8..cae6d895f8a8cf5f57e05519536fda5d62b10841

@ircinco-mh ircinco-mh mentioned this pull request Feb 13, 2026
Closed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants