Skip to content

Reconstruct OutboundPayments from ChannelMonitors#1

Open
valentinewallace wants to merge 4 commits intomainfrom
2025-10-reconstruct-mgr-outbound-pmts
Open

Reconstruct OutboundPayments from ChannelMonitors#1
valentinewallace wants to merge 4 commits intomainfrom
2025-10-reconstruct-mgr-outbound-pmts

Conversation

@valentinewallace
Copy link
Copy Markdown
Owner

We have an overarching goal of getting rid of ChannelManager persistence and
rebuilding the whole ChannelManager from existing ChannelMonitors, due to
issues when the two structs are out-of-sync on restart. The main issue that can
arise is channel force closure.

Here we start this process by rebuilding ChannelManager::outbound_payments from
the ChannelMonitors.

This serialization was in place for compatibility with LDK 0.0.101, which
should no longer be necessary.
This makes it easier to see what exactly is failing upon test failure.
As part of debugging this test while implementing rebuilding
ChannelManager::outbound_payments from Channel{Monitor}s, I added some extra
checks in this test. They seemed useful for improving readability since this
test follows 3 interleaved payments, so thought it was worth committing the
changes.
We have an overarching goal of getting rid of ChannelManager persistence and
rebuilding the whole ChannelManager from existing ChannelMonitors, due to
issues when the two structs are out-of-sync on restart. The main issue that can
arise is channel force closure.

Here we start this process by rebuilding ChannelManager::outbound_payments from
the ChannelMonitors.
@valentinewallace
Copy link
Copy Markdown
Owner Author

@joostjager not sure how to request you as a reviewer on my fork so tagging you manually for now


write_tlv_fields!(writer, {
(1, pending_outbound_payments_no_retry, required),
// TLV 1 used to be used for pending_outbound_payments_no_retry
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message:

which should no longer be necessary.

Explain 'should'?


#[test]
fn test_dup_htlc_onchain_doesnt_fail_on_reload() {
fn test_dup_htlc_onchain_doesnt_fail_on_reload_0() {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunate that rust doesn't have sub-tests. The numbering isn't ideal, but trying to come up with something descriptive probably isn't going to work either...

) where
F: Fn(usize, &ChannelMonitorUpdate) -> bool,
{
if let Some(chain_monitor) = node.chain_monitor() {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let chain_monitor = node.chain_monitor().expect("Missing chain monitor"); ?

&nodes[0],
chan_id,
2,
|upd_idx, upd: &ChannelMonitorUpdate| {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it simpler to just return the updates, and do this matching afterwards instead of using the closure?

(3, pending_outbound_payments, option),
// We now rebuild `OutboundPayments` from the `ChannelMonitor`s as part of getting rid of
// `ChannelManager` persistence.
(3, _pending_outbound_payments, option),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be useful to still read this and do a debug comparison with the reconstructed data? To maybe catch things that aren't covered in tests.

);
}
}
for (channel_id, monitor) in args.channel_monitors.iter() {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context question: the two loops through args.channel_monitors can't be combined because in the first pass the pending_outbounds need to be reconstructed completely?

// `pending_intercepted_htlcs`, we were apparently not persisted after
// the monitor was when forwarding the payment.
decode_update_add_htlcs.retain(|src_outb_alias, update_add_htlcs| {
for (htlc_source, (htlc, preimage_opt)) in monitor.get_all_current_outbound_htlcs()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the is_channel_closed check, I think also more HTLCSource::PreviousHopData htlcs are processed, but this isn't necessary for payment reconstruction?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps add a few lines of comment describing the purposes of the two loops.

bolt12_invoice,
session_priv,
path,
is_channel_closed,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First the claim was only executed for closed channels, but now for all htlcs. Is that the intention?

fee_paid_msat,
bolt12_invoice: bolt12_invoice,
}, ev_completion_action.take()));
let duplicate_sent_event = pending_events.iter().find(|(ev, _)| match ev {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment how duplicate sent event can happen?

}
// Because we reload outbound payments from the existing `ChannelMonitor`s, and no HTLC exists
// in those monitors after restart, we will currently forget about the payment and fail to
// generate a `PaymentFailed` event.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did we say about this? Wasn't this undesirable?

@joostjager
Copy link
Copy Markdown

Is it really not possible to aim for merging this PR to main?

@valentinewallace
Copy link
Copy Markdown
Owner Author

Is it really not possible to aim for merging this PR to main?

Feel free to respond in the discord thread with Matt!

@joostjager
Copy link
Copy Markdown

Summarizing that thread here:

@TheBlueMatt's point is that reconstructing outbound payments only from the monitors isn’t a strict improvement to LDK. It’s more of a step toward a larger goal. That eventual improvement would likely require a separate store for unfinished payments, so making this change now (especially without a cfg flag) might be fine as an intermediate step but not a clear win on its own.

With a cfg flag, we wouldn't need to worry about functional regression, and could still avoid merging to a dev branch. For this PR, it is a question of how easy the new and old code can be switched using a cfg flag without too much copy paste? I have the same question for lightningdevkit#4151.

valentinewallace pushed a commit that referenced this pull request Mar 19, 2026
When the counterparty initiates an RBF and we have no new contribution
queued via QuiescentAction, we must re-use our prior contribution so
that our splice is not lost. Track contributions in a new field on
PendingFunding so the last entry can be re-used in this scenario.

Each entry stores the feerate-adjusted version because that reflects
what was actually negotiated and allows correct feerate re-adjustment
on subsequent RBFs. Only explicitly provided contributions (from a
QuiescentAction) append to the vec. Re-used contributions are replaced
in-place with the version adjusted for the new feerate so they remain
accurate for further RBF rounds, without growing the vec.

Add test_splice_rbf_acceptor_recontributes to verify that when the
counterparty initiates an RBF and we have no new QuiescentAction
queued, our prior contribution is automatically re-used so the splice
is preserved.

Add test_splice_rbf_recontributes_feerate_too_high to verify that when
the counterparty RBFs at a feerate too high for our prior contribution
to cover, the RBF is rejected rather than proceeding without our
contribution.

Add test for sequential RBF splice attempts

Add test_splice_rbf_sequential that exercises three consecutive RBF
rounds on the same splice (initial → RBF #1 → RBF lightningdevkit#2) to verify:
- Each round requires the 25/24 feerate increase (253 → 264 → 275)
- DiscardFunding events reference the correct funding txid from each
  replaced candidate
- The final RBF splice can be mined and splice_locked successfully

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants