Conversation
Force-pushed from 11b570e to 07f18c9
Force-pushed from 07f18c9 to b061470
@dcorral could you please rebase this PR?
Force-pushed from b061470 to 16bf7fc
@zoedberg done!
zoedberg left a comment:
Thanks! I left some change requests. Also, instead of serializing structs to JSON strings and then converting the strings to bytes, I would use bincode to serialize the structs directly to bytes.
Force-pushed from 5d04288 to 8024e97
```rust
pub(crate) ldk_data_dir: PathBuf,

/// KVStore for RGB data persistence
pub(crate) rgb_kv_store: Arc<dyn KVStoreSync + Send + Sync>,
```
Why did you choose to use KVStoreSync instead of KVStore (its async version)? I would use the latter if possible, since RLN runs in an async environment
I used KVStoreSync because all the call sites are in synchronous code paths: color_commitment, color_htlc, color_closing, update_rgb_channel_id, handle_funding, etc. are all sync functions inside channel.rs and channelmanager.rs, which are entirely sync.
Using the async KVStore would require making all of these async, which would mean changing a big chunk of LDK's internals. Do you think it's worth doing that, or is KVStoreSync ok for now?
What about doing what ldk-node does, i.e. implementing an inner store and then implementing both KVStore and KVStoreSync for the wrapper store? That way we could use KVStoreSync in LDK (where we don't have an async environment) and KVStore in RLN (where we do have an async environment).
Looking into it, KVStoreSyncWrapper already exists and it wraps any KVStoreSync into KVStore by boxing the sync result into a future. So the rust-lightning side can stay as-is with KVStoreSync and on the RLN side we just wrap the sea-orm store with KVStoreSyncWrapper to expose the async KVStore trait where needed.
But wouldn't this mean we are actually doing sync calls to the DB? What I would like to achieve is to actually do async calls. To me what SqliteStore and SqliteStoreInner in ldk-node are doing is a bit different, but I could be wrong.
You're right, KVStoreSyncWrapper wraps a sync result in Box::pin(async move { res }), so it is still sync under the hood.
Since the rgb_kv_store calls are all sync, I'll keep the bound as Arc<dyn KVStoreSync + Send + Sync> in this PR; then on the RLN side I could implement both KVStoreSync and KVStore on SeaOrmKvStore directly (like ldk-node), drop KVStoreSyncWrapper, and .await sea-orm futures directly in the async impl.
Let me know if I'm missing something, this is quite a topic!
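The "one store, both traits" approach being proposed can be sketched with std-only stand-ins. Note the hedges: the real `KVStoreSync`/`KVStore` traits in rust-lightning take namespace arguments and return `Result`s, which are omitted here, and `MemStore` is an illustrative in-memory stand-in for `SeaOrmKvStore`:

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::Mutex;

// Simplified stand-ins for the sync and async store traits discussed above.
trait StoreSync {
    fn read(&self, key: &str) -> Option<Vec<u8>>;
    fn write(&self, key: &str, value: Vec<u8>);
}

trait StoreAsync {
    fn read<'a>(&'a self, key: &'a str)
        -> Pin<Box<dyn Future<Output = Option<Vec<u8>>> + 'a>>;
}

// One concrete store implementing BOTH traits, like ldk-node's SqliteStore.
#[derive(Default)]
struct MemStore {
    map: Mutex<HashMap<String, Vec<u8>>>,
}

impl StoreSync for MemStore {
    fn read(&self, key: &str) -> Option<Vec<u8>> {
        self.map.lock().unwrap().get(key).cloned()
    }
    fn write(&self, key: &str, value: Vec<u8>) {
        self.map.lock().unwrap().insert(key.to_owned(), value);
    }
}

impl StoreAsync for MemStore {
    fn read<'a>(&'a self, key: &'a str)
        -> Pin<Box<dyn Future<Output = Option<Vec<u8>>> + 'a>> {
        // In a real SeaOrmKvStore this body would `.await` sea-orm futures
        // natively; here the in-memory lookup is already done, so the
        // returned future is immediately ready.
        Box::pin(std::future::ready(StoreSync::read(self, key)))
    }
}
```

This way LDK code paths can keep the sync trait bound while RLN calls the async trait on the same object, with no wrapper type in between.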
This seems the way to go!
There are some partially related implementation details that I think should be discussed though:
- `dyn KVStoreSync` forces dynamic dispatch on every read/write/remove/list at runtime. I would instead add a `KV: KVStoreSync + Send + Sync + 'static` generic parameter and store `Arc<KV>`; this would give monomorphization (static dispatch at compile time) and be consistent with the existing pattern.
- currently each `KVStoreSync` call does: detect runtime -> `thread::scope` -> spawn OS thread -> `DB_RUNTIME.block_on(future)` -> join. That adds thread lifecycle overhead for each DB call. Moreover, `DB_RUNTIME` is a current-thread runtime, which serializes every DB access through a single thread. With the current design, under high load, concurrent calls to the DB would block each other. So even if you raise `max_connections`, the dedicated runtime would process futures one at a time. You should instead use `tokio::task::block_in_place` + `Handle::current().block_on()` and eliminate the dedicated `DB_RUNTIME`.
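The first point, a generic parameter instead of `dyn`, can be sketched with std-only stand-ins (the trait and struct names below are simplified illustrations, not the real LDK signatures):

```rust
use std::sync::Arc;

// Simplified stand-in for the KVStoreSync trait.
trait StoreSync {
    fn read(&self, key: &str) -> Option<Vec<u8>>;
}

struct MemStore;
impl StoreSync for MemStore {
    fn read(&self, key: &str) -> Option<Vec<u8>> {
        Some(key.as_bytes().to_vec())
    }
}

// Instead of `rgb_kv_store: Arc<dyn StoreSync + Send + Sync>` (dynamic
// dispatch through a vtable on every call), a generic parameter lets the
// compiler monomorphize the struct and statically dispatch each call.
struct ChannelCtx<KV: StoreSync + Send + Sync + 'static> {
    rgb_kv_store: Arc<KV>,
}

impl<KV: StoreSync + Send + Sync + 'static> ChannelCtx<KV> {
    fn read_rgb_info(&self, key: &str) -> Option<Vec<u8>> {
        // Resolved at compile time to MemStore::read for ChannelCtx<MemStore>.
        self.rgb_kv_store.read(key)
    }
}
```

The trade-off is the usual one: the generic propagates through every struct and constructor that holds the store, which is exactly the existing pattern for LDK's other generic parameters.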
agreed on both points 🙏
Conclusions:

- for this PR: I'll replace `Arc<dyn KVStoreSync + Send + Sync>` with a generic `KV: KVStoreSync + Send + Sync + 'static` on `ChannelContext`, `ChannelManager`, `OutboundPayments`, and all constructors/deserialization. Static dispatch + consistent with existing LDK patterns.
- on the RLN side: I'll implement both `KVStoreSync` and `KVStore` directly on `SeaOrmKvStore` like ldk-node's `SqliteStore`. The `KVStore` impl will `.await` sea-orm futures natively (real async, no wrapper). The `KVStoreSync` impl will use `block_in_place` + `Handle::current().block_on()`, dropping the dedicated `DB_RUNTIME` and the per-call `thread::scope` overhead. `KVStoreSyncWrapper` goes away.

Let me know if you have any corrections before I jump to the implementation, thanks!
I think you can proceed. Just a couple of notes:
> on `ChannelContext`, `ChannelManager`, `OutboundPayments`, and all constructors/deserialization

I haven't checked that you added it only where actually necessary, but generally speaking let's add the generic instead of using `dyn`.

> I'll implement both `KVStoreSync` and `KVStore` directly on `SeaOrmKvStore` like ldk-node's `SqliteStore`

Yes. I'm not sure why we also need `RlnDatabase` though; seems we could keep only that or `SeaOrmKvStore`.
```rust
let rgb_info = kv_store.read_rgb_channel_info(&temp_chan_id, false).expect("rename ok");
kv_store.write_rgb_channel_info(&chan_id, &rgb_info, false);
kv_store.remove_rgb_channel_info(&temp_chan_id, false).expect("rename ok");
```
Now that we have the DB, I think we can just make this an update; am I missing something?
the KVStoreSync trait doesn't have a rename/update-key method though, so we'd need to either add one to the trait or handle it at the DB level in the RLN KVStore implementation. What's your suggestion?
Right. Then let's keep the read, write and remove, but let's put them inside an `update_rgb_channel_info` method. Also, have you considered a way to avoid write-concurrency issues? It would be nice to be able to use DB transactions, but I'm not sure that's easy to achieve.
To get real atomicity we'd need to add a `transaction()` method to `KVStoreSync` that runs all operations within a single DB transaction. This requires refactoring `SeaOrmKvStore` to be generic over sea-orm's `ConnectionTrait` (since `DatabaseTransaction` and `DatabaseConnection` both implement it but are different types), and the fs-based `FilesystemStore` in lightning-persister would also need to implement it somehow, which is non-trivial.
The main risk here is crash recovery rather than write concurrency, since LDK already holds exclusive access per channel during these calls and this rename only happens once per channel at funding time.
I'm not sure what you intend here. Why do you say it "requires refactoring SeaOrmKvStore to be generic over sea-orm's ConnectionTrait"? I don't see this requirement, but maybe I'm over-simplifying this.
About write concurrency, my concern is broader; I was not talking about this method in particular.
sea-orm's db.transaction(|txn| { ... }) closure API accepts any code that uses the transaction handle, so there's no need to make SeaOrmKvStore generic over ConnectionTrait. I can just use the closure inside specific methods.
That said, looking at this more carefully: update_rgb_channel_info is a default method on the RgbKvStoreExt extension trait with a blanket impl<T: KVStoreSync> RgbKvStoreExt for T. I can't override that default for SeaOrmKvStore specifically, so db.transaction() at the impl level wouldn't wrap the three calls that happen via the trait. Real atomicity for this path still requires either a trait change or just accepting the risk. Tbh since the rename only happens once per channel at funding and peer_state_mutex serializes access, I'd propose leaving atomicity out of scope for this PR unless you feel strongly.
On the broader concurrency concern: I traced every RGB KVStore write in LDK and they all happen under `peer_state_mutex`, which serializes per-channel access. The two pub functions that RLN calls directly (`write_rgb_payment_info_file`, which we should rename btw, and `update_rgb_channel_amount`) do bypass LDK's locks though. I'd add per-key write ordering in `SeaOrmKvStore` using the same pattern as `FilesystemStore` and ldk-node's `SqliteStore`: a per-key `Mutex` inside a `HashMap<String, Arc<Mutex>>` plus a monotonic `AtomicU64` version counter assigned at call time, with stale writes skipped. That handles concurrency at the implementation level regardless of caller context.
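That per-key ordering pattern can be sketched with a std-only, in-memory stand-in (hypothetical `VersionedStore`; the real `FilesystemStore`/`SqliteStore` implementations apply the same idea against their actual backing storage):

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

// Per-key write ordering with a monotonic version counter, following the
// FilesystemStore / SqliteStore pattern described above (simplified:
// in-memory backing map, no pruning of the lock map).
struct VersionedStore {
    // key -> (version of the last applied write, value)
    data: Mutex<HashMap<String, (u64, Vec<u8>)>>,
    // per-key write locks so writes to the same key are ordered
    locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
    // monotonic version source, assigned at call time
    counter: AtomicU64,
}

impl VersionedStore {
    fn new() -> Self {
        Self {
            data: Mutex::new(HashMap::new()),
            locks: Mutex::new(HashMap::new()),
            counter: AtomicU64::new(0),
        }
    }

    fn write(&self, key: &str, value: Vec<u8>) {
        // The version is taken before acquiring the per-key lock, so a write
        // that gets delayed behind a newer one can be detected as stale.
        let version = self.counter.fetch_add(1, Ordering::SeqCst);
        let lock = {
            let mut locks = self.locks.lock().unwrap();
            Arc::clone(locks.entry(key.to_owned()).or_default())
        };
        let _guard = lock.lock().unwrap();
        let mut data = self.data.lock().unwrap();
        match data.get(key) {
            // A newer write already landed: skip the stale one.
            Some((applied, _)) if *applied > version => {}
            _ => {
                data.insert(key.to_owned(), (version, value));
            }
        }
    }

    fn read(&self, key: &str) -> Option<Vec<u8>> {
        self.data.lock().unwrap().get(key).map(|(_, v)| v.clone())
    }
}
```

The version check is what makes the pattern safe for callers outside LDK's locks: even if two writes race to the same key, the one that was issued last always wins.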
There's also another way to get DB transactions that doesn't require closures, which is calling `connection.begin()`.
Re-reading this thread I noticed I caused some confusion, I'm sorry. So, I think there are 2 topics that need to be discussed: write concurrency and write atomicity. Both need to be discussed in general, considering both LDK and RLN.
My proposal to add DB transactions is not specific to the update_rgb_channel_info method, it's a more general proposal that applies to all points where LDK and RLN perform write queries. I think guaranteeing write atomicity (potentially including rollbacks) is essential because in case of errors we don't want to end up with a partial state where some records are updated and others aren't, leading to data corruption or inconsistent business rules.
update_rgb_channel_info and the RgbKvStoreExt have been introduced by this PR, so I don't see why we can't consider changes there.
We can of course consider addressing atomicity at a later time, but I think it's worth considering now, to be sure that introducing it won't significantly change the current design.
Adding a per-key write ordering and a version counter seems something we need to solve write concurrency, thanks for adding this to the discussion.
Force-pushed from 514cb56 to 5ac1202
This PR adds database persistence for RGB-specific data via the `KVStoreSync` trait, replacing direct filesystem operations with KVStore calls through dependency injection.

### Changes

#### New KVStore Functions (`rgb_utils/mod.rs`)

Added namespace constants and persistence functions.

New functions:

- `read_rgb_transfer_info` / `write_rgb_transfer_info`
- `read_rgb_channel_info` / `write_rgb_channel_info` / `remove_rgb_channel_info`
- `read_rgb_payment_info` / `write_rgb_payment_info`
- `read_rgb_consignment` / `write_rgb_consignment` / `remove_rgb_consignment`
- `get_rgb_channel_info_pending`
- `is_payment_rgb`
- `filter_first_hops`

#### Dependency Injection

KVStore is now passed through struct fields as `Arc<dyn KVStoreSync + Send + Sync>`:

- `ChannelManager.rgb_kv_store`
- `ChannelManagerReadArgs.rgb_kv_store`
- `ChannelContext.rgb_kv_store`
- `OutboundPayments.rgb_kv_store`
- `KeysManager.rgb_kv_store`
- `InMemorySigner.rgb_kv_store`

#### Updated RGB Functions

All RGB functions now receive KVStore directly instead of filesystem paths:

- `color_commitment` - Reads pending channel info and payment info from KVStore, writes transfer info
- `color_htlc` - Reads transfer info from KVStore, writes updated transfer info
- `color_closing` - Reads pending channel info from KVStore, writes transfer info
- `is_channel_rgb` - Checks KVStore for channel RGB data
- `get_rgb_channel_info` - Reads from KVStore
- `rename_rgb_files` - Renames channel info and consignment entries in KVStore
- `handle_funding` - Writes consignment and channel info to KVStore
- `update_rgb_channel_amount` - Reads/writes channel info via KVStore

#### Proxy Key Pattern for HTLC Payments

Payment info uses a proxy key scheme for HTLC tracking:

- `{payment_hash}` (hex)
- `{channel_id}_{payment_hash}`
- `{payment_hash}_pending`
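The three key shapes can be written as small helpers (hypothetical function names, assuming the inputs are already hex-encoded strings; the PR's actual key-building code may differ):

```rust
// Hypothetical helpers mirroring the proxy-key scheme listed above.

// Plain payment-info key: the payment hash itself, hex-encoded.
fn payment_info_key(payment_hash_hex: &str) -> String {
    payment_hash_hex.to_string()
}

// Per-channel proxy key for HTLC tracking.
fn htlc_proxy_key(channel_id_hex: &str, payment_hash_hex: &str) -> String {
    format!("{}_{}", channel_id_hex, payment_hash_hex)
}

// Key for payment info recorded before the channel is known.
fn pending_payment_key(payment_hash_hex: &str) -> String {
    format!("{}_pending", payment_hash_hex)
}
```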
#### Design

- `rgb` primary namespace with typed secondary namespaces
- `RgbInfo`, `RgbPaymentInfo`, and `TransferInfo` serialized to KVStore

#### Breaking Changes

The following public functions now require a `kv_store` parameter instead of filesystem paths:

- `is_channel_rgb` - `kv_store: &K` where `K: KVStoreSync`
- `update_rgb_channel_amount` - `kv_store: &K` where `K: KVStoreSync`
- `write_rgb_payment_info_file` - `kv_store: &K` where `K: KVStoreSync` (also removed `ldk_data_dir`)
- `color_htlc` - `kv_store: &dyn KVStoreSync`
- `color_closing` - `kv_store: &dyn KVStoreSync`
- `handle_funding` - `kv_store: &K` where `K: KVStoreSync`

Struct constructors (`ChannelManager`, `KeysManager`, `InMemorySigner`, `OutboundPayments`) now require an `rgb_kv_store` parameter.