Skip to content

feat(encryption) [7/N] Read encrypted manifest list#2453

Open
xanderbailey wants to merge 17 commits into
apache:mainfrom
xanderbailey:xb/encryption-7-table-wiring
Open

feat(encryption) [7/N] Read encrypted manifest list#2453
xanderbailey wants to merge 17 commits into
apache:mainfrom
xanderbailey:xb/encryption-7-table-wiring

Conversation

@xanderbailey
Copy link
Copy Markdown
Contributor

@xanderbailey xanderbailey commented May 14, 2026

Which issue does this PR close?

What changes are included in this PR?

Wires encryption support into the Table type so that encrypted manifest lists can be read. Adds an EncryptionManager accessor on Table and ensures it is supplied wherever manifest lists are loaded from snapshots.

Also threads the EncryptionManager through purge_table / drop_table_data so that dropping an encrypted table can read its manifest lists and clean up referenced files correctly.

A design decision was made whilst reviewing the RFC to keep the EM out of FileIo and pass it around separately which turns out to be much cleaner.

Note that we aren't exposing any sort of configuration for the KMS when building the tables in the catalogs. That will come later. That PR is deliberately coming later to minimise the public surface area whilst this feature is under development.

Also note that StaticTable doesn't have the KMS wired in yet since we'd need to merge the KmsFactory to make that code path valid. Similar story is true for wiring this through the catalogs.

Are these changes tested?

Yes

io: &FileIO,
metadata: &TableMetadata,
metadata_location: Option<&str>,
encryption_manager: Option<&EncryptionManager>,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public api change here

Comment thread crates/iceberg/src/spec/snapshot.rs Outdated
&self,
file_io: &FileIO,
table_metadata: &TableMetadata,
encryption_manager: Option<&EncryptionManager>,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public api change here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had some discussion before, and I think we should add sth like ManifestListLoader to avoid such api changes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The api should be as following:

impl Table {
pub fn manifest_list_loader(&self, s: &Snapshot) -> Result<ManifestListLoader> {
   ...
}
}

With this approach, we could hide all table related constructs like Runtime, Cache, FileIO to avoid api changes in future.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 it would be nice to encapsulate these public apis now rather than scramble when Iceberg-Rust eventually preps for its 1.0 release. @blackmwk do you feel strongly about adding a manifest_list_loader method in this PR or a future one?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the loader in this PR, let me know what you think!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to move it into a standalone pr.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will move the manifest list reader to it's own PR now

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

///
/// Returns `Ok(None)` if the property is not set. Returns an error if the
/// property is set but no [`KeyManagementClient`] was provided.
fn maybe_configure_encryption(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't exactly sure if this should live here or if it's a concern of the EM?

metadata_location: Option<String>,
metadata: Option<TableMetadataRef>,
identifier: Option<TableIdent>,
kms_client: Option<Arc<dyn KeyManagementClient>>,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on the encryption support @xanderbailey, this is really great.

I think TableMetadata::table_properties has all the information we need to construct either a KeyManagementClient or a EncryptionManager. Do you think we could build them as part of the builder instead of passing KeyManagementClient in the API?

Copy link
Copy Markdown
Contributor Author

@xanderbailey xanderbailey May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look! In Java's case the kms is configured at the catalog level (CatalogProperties.ENCRYPTION_KMS_*) and I think we should do the same, the catalog will take some KmsFactory and construct the KMS for the table? WDYT?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, thanks @xanderbailey.

I think it is closer to the encryption() method in Java's HiveTableOperations.java

pub const PROPERTY_ENCRYPTION_KEY_ID: &str = "encryption.key-id";

/// Property key for the encryption data encryption key (DEK) length in bytes.
pub const PROPERTY_ENCRYPTION_DATA_KEY_LENGTH: &str = "encryption.data-key-length";
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding it. I think I can get rid of the additional argument for data key length In introduced in #2466

@xanderbailey xanderbailey force-pushed the xb/encryption-7-table-wiring branch from d2d19e7 to 3bb548e Compare May 19, 2026 11:20
Copy link
Copy Markdown
Collaborator

@mbutrovich mbutrovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xanderbailey! First initial pass, trying to refer back to RFC and spec. I think this looks right:

  • The "EM lives on Table, passed explicitly, FileIO stays encryption-unaware" shape is exactly the decision in RFC #2183 thread.
  • TableBuilder::maybe_configure_encryption is the function the RFC literally names (RFC §"Catalog Integration", Step 2) with the same three-step logic.
  • v3-only enforcement matches the format spec (Appendix E) and RFC §"Format Version Requirement".
  • Putting kms_client(...) directly on TableBuilder (rather than the RFC's catalog-level KmsClientFactory) is acknowledged as interim in the PR description.
  • decrypt_manifest_list_key_metadata(key_id: &str) -> StandardKeyMetadata is the signature negotiated in #2383.
  • Snapshot key-id is optional in v3 per spec.md §Snapshots, so (None, _) reading via an EM-equipped table is spec-valid.

Comment thread crates/iceberg/src/spec/snapshot.rs Outdated
let manifest_list_content = match (&self.encryption_key_id, encryption_manager) {
(Some(_), None) => {
return Err(Error::new(
ErrorKind::FeatureUnsupported,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do ErrorKind::PreconditionFailed or ErrorKind::Unexpected here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Precondition seems correct

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread crates/iceberg/src/spec/snapshot.rs Outdated
let input = file_io.new_input(&self.manifest_list)?;
EncryptedInputFile::new(input, key_metadata).read().await?
}
(None, _) => file_io.new_input(&self.manifest_list)?.read().await?,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's non-obvious that there could be an intentionally unused EncryptionManager here, since a table could have unencrypted snapshots in its history. Maybe worth a one-line comment so someone doesn't change this to (None, None).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be nice small place to improve on what Java does, even if its in a follow up comment only PR.

Comment thread crates/iceberg/src/spec/snapshot.rs Outdated

use bytes::Bytes;

use crate::encryption::kms::{KeyManagementClient, MemoryKeyManagementClient};
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we hoist these up near 539?

Comment thread crates/iceberg/src/table.rs Outdated

let kms_client = kms_client.ok_or_else(|| {
Error::new(
ErrorKind::FeatureUnsupported,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same ErrorKind feedback as before. Whatever we pick should be consistent.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// Data files within manifests are only deleted if the `gc.enabled` table
/// property is `true` (the default), to avoid corrupting other tables that
/// may share the same data files.
pub async fn drop_table_data(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a test for this code path, even just to make sure future signature changes don't break reading manifest lists?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to add more tests here when we had all the decryption stuff in, else I'd have to awkwardly construct an encrypted manifest list with unencrypted manifest files which seemed a little wonky to me. WDYT?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can defer as long as end up with a test for the codepath. I know it's hard when the features comes in pieces like this.

Comment thread crates/iceberg/src/spec/snapshot.rs Outdated
&self,
file_io: &FileIO,
table_metadata: &TableMetadata,
encryption_manager: Option<&EncryptionManager>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had some discussion before, and I think we should add sth like ManifestListLoader to avoid such api changes.

Comment thread crates/iceberg/src/spec/snapshot.rs Outdated
&self,
file_io: &FileIO,
table_metadata: &TableMetadata,
encryption_manager: Option<&EncryptionManager>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The api should be as following:

impl Table {
pub fn manifest_list_loader(&self, s: &Snapshot) -> Result<ManifestListLoader> {
   ...
}
}

With this approach, we could hide all table related constructs like Runtime, Cache, FileIO to avoid api changes in future.

));
}

let kms_client = kms_client.ok_or_else(|| {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? Why user must provide a kms_client for table property?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a table_key_id is present (encryption.key-id) then the the user must provide a KMS, the property is telling us that the table is encrypted. If we weren't to fail loudly here I think users would end up with some cryptic deserialize error when they try and read an encrypted manifest list.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable to me.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should log a warning if encryption key is not set, but kms client is provided.

@xanderbailey
Copy link
Copy Markdown
Contributor Author

Thanks for the review @blackmwk. I've introduced the ManifestListLoader a3f165e and made load_manifest_list pub crate. Note that the original load_manifest_list is still used directly by the object cache and in the catalog utils, I think both of these make sense to leave?

@xanderbailey xanderbailey force-pushed the xb/encryption-7-table-wiring branch 2 times, most recently from 296ccbf to f3d4a9b Compare May 21, 2026 09:33
@xanderbailey
Copy link
Copy Markdown
Contributor Author

Thanks so much for your reviews so far @blackmwk, would you mind taking another quick look here when you've got a minute?

Comment thread crates/iceberg/src/spec/snapshot.rs Outdated
&self,
file_io: &FileIO,
table_metadata: &TableMetadata,
encryption_manager: Option<&EncryptionManager>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 it would be nice to encapsulate these public apis now rather than scramble when Iceberg-Rust eventually preps for its 1.0 release. @blackmwk do you feel strongly about adding a manifest_list_loader method in this PR or a future one?

Comment thread crates/iceberg/src/spec/snapshot.rs Outdated
let input = file_io.new_input(&self.manifest_list)?;
EncryptedInputFile::new(input, key_metadata).read().await?
}
(None, _) => file_io.new_input(&self.manifest_list)?.read().await?,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be nice small place to improve on what Java does, even if its in a follow up comment only PR.

pub cdc_norm_level: i32,
/// The master key id used to encrypt this table's manifest list and data
/// files. `None` if `encryption.key-id` is not set.
pub encryption_key_id: Option<String>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code style nit pick: we could utilize Rust's match and type system if we had a TableProperties without this and a separate struct for EncryptedTableProperties. The encrypted struct would look like

struct EncryptedTableProperties {
    /// The master key id used to encrypt this table's manifest list and data
    /// files. `None` if `encryption.key-id` is not set.
    pub encryption_key_id: Option<String>,
    ...TableProperties,
}

Then when you need to check to use encrypted vs non-encrypted code paths you would just utilize match and check the type of the table's properties.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if ...TableProperties is valid rust is it? Do you mean copy all of the fields out and have an enum for table properties? Seems like adding a new struct here would just be added indirection, WDYT?

Comment thread crates/iceberg/src/table.rs Outdated
}

/// Loads a [`ManifestList`] for a snapshot.
pub struct ManifestListLoader<'a> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved to manifest_list module.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I made a mistake, we already have a ManifestListWriter, maybe we should call this ManifestListReader?

Comment thread crates/iceberg/src/table.rs Outdated
/// Loads and returns the [`ManifestList`] for this snapshot.
pub async fn load(&self) -> Result<ManifestList> {
self.snapshot
.load_manifest_list(self.file_io, self.table_metadata, self.encryption_manager)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should move the logic of loading manifest list into this method, and deprecate the load_manifest_list in Snapshot.

Comment thread crates/iceberg/src/table.rs Outdated

// Encryption is a v3 feature: `encryption-keys` table metadata and the
// snapshot `key-id` field are introduced in format version 3.
if metadata.format_version() < FormatVersion::V3 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is odd to me. I think the correct one should be like:

if format_version < v3 {
   return Ok(None)
}

// Now we are v3
if encryption_key_id is none {
  return None
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to make that change, I was trying to test the invariants more explicitly, i.e if we end up with a table with encryption_key_id and you're not V3 then we should shout loudly. But I'm happy with your suggestion

));
}

let kms_client = kms_client.ok_or_else(|| {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable to me.

));
}

let kms_client = kms_client.ok_or_else(|| {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should log a warning if encryption key is not set, but kms client is provided.

Comment thread crates/iceberg/src/spec/snapshot.rs Outdated
&self,
file_io: &FileIO,
table_metadata: &TableMetadata,
encryption_manager: Option<&EncryptionManager>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to move it into a standalone pr.

@xanderbailey
Copy link
Copy Markdown
Contributor Author

Have moved ManifestListReader out to #2541 which we can merged before this.

@xanderbailey xanderbailey force-pushed the xb/encryption-7-table-wiring branch from f3d4a9b to 2626710 Compare May 30, 2026 12:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants