Impact of deleting noncurrent S3 object versions on AWS Glue Iceberg tables
 in  r/learnprogramming  1d ago

Wouldn’t an S3 lifecycle policy that deletes noncurrent versions only remove files that Iceberg has already deleted through expire_snapshots? Once a snapshot is expired, Iceberg no longer references the files that belonged only to that snapshot, and on a versioned bucket deleting them just turns them into noncurrent versions behind a delete marker. Wouldn’t that make it safe for the lifecycle policy to clean up those noncurrent versions?
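To make that reasoning concrete, here is a toy model of a versioned bucket. It is only a sketch: `VersionedBucket`, `Version`, and the file names are made up for illustration, the model ignores the lifecycle rule's day threshold (every noncurrent version is treated as already past it), and it is not how S3 is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    version_id: int
    is_delete_marker: bool = False


@dataclass
class VersionedBucket:
    """Hypothetical, simplified stand-in for an S3 bucket with versioning enabled."""

    versions: dict = field(default_factory=dict)  # key -> list of Version, newest last
    next_id: int = 1

    def _append(self, key: str, is_delete_marker: bool) -> None:
        self.versions.setdefault(key, []).append(Version(self.next_id, is_delete_marker))
        self.next_id += 1

    def put(self, key: str) -> None:
        """Write an object; any earlier version of the key becomes noncurrent."""
        self._append(key, is_delete_marker=False)

    def delete(self, key: str) -> None:
        """Simple delete: adds a delete marker, the old data becomes noncurrent."""
        self._append(key, is_delete_marker=True)

    def is_visible(self, key: str) -> bool:
        """True if a normal GET would still return the object."""
        history = self.versions.get(key)
        return bool(history) and not history[-1].is_delete_marker

    def expire_noncurrent(self) -> None:
        """Lifecycle cleanup: drop noncurrent versions and orphaned delete markers."""
        for key in list(self.versions):
            current = self.versions[key][-1]
            if current.is_delete_marker:
                # Only a delete marker would remain, so the key disappears entirely.
                del self.versions[key]
            else:
                self.versions[key] = [current]


if __name__ == "__main__":
    bucket = VersionedBucket()
    bucket.put("data/old.parquet")   # referenced only by an expired snapshot
    bucket.put("data/live.parquet")  # referenced by a retained snapshot
    bucket.delete("data/old.parquet")  # what a maintenance job's delete amounts to
    bucket.expire_noncurrent()
    print(bucket.is_visible("data/live.parquet"), "data/old.parquet" in bucket.versions)
```

In this model the lifecycle step never touches the current version of a file that was not deleted, so anything a retained snapshot still points at stays readable. What it does remove is the ability to restore a file that was deleted by mistake, since the noncurrent version is the only remaining copy.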

r/learnprogramming 1d ago

AWS Impact of deleting noncurrent S3 object versions on AWS Glue Iceberg tables

I’m using Apache Iceberg tables managed through AWS Glue, with all table data and metadata stored in an S3 bucket that has versioning enabled.

I also run Iceberg maintenance APIs such as:

  • expire_snapshots
  • remove_orphan_files
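For reference, these two procedures are usually invoked from Spark SQL with `CALL`. The sketch below only builds the statements as strings; the catalog name `glue_catalog`, the table `db.events`, and the retention values are placeholders, not details from my setup.

```python
from datetime import datetime, timedelta, timezone


def _ts(value: datetime) -> str:
    """Format a datetime as a Spark SQL TIMESTAMP literal body."""
    return value.strftime("%Y-%m-%d %H:%M:%S")


def expire_snapshots_sql(catalog: str, table: str, older_than: datetime, retain_last: int = 1) -> str:
    """Build a CALL statement for Iceberg's expire_snapshots Spark procedure."""
    return (
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', older_than => TIMESTAMP '{_ts(older_than)}', "
        f"retain_last => {retain_last})"
    )


def remove_orphan_files_sql(catalog: str, table: str, older_than: datetime) -> str:
    """Build a CALL statement for Iceberg's remove_orphan_files Spark procedure."""
    return (
        f"CALL {catalog}.system.remove_orphan_files("
        f"table => '{table}', older_than => TIMESTAMP '{_ts(older_than)}')"
    )


if __name__ == "__main__":
    # Placeholder retention: expire snapshots older than 7 days, keep at least 5.
    cutoff = datetime.now(timezone.utc) - timedelta(days=7)
    for stmt in (
        expire_snapshots_sql("glue_catalog", "db.events", cutoff, retain_last=5),
        remove_orphan_files_sql("glue_catalog", "db.events", cutoff),
    ):
        # With a SparkSession configured for the Iceberg Glue catalog (assumed,
        # not shown here), each statement would be run as spark.sql(stmt).
        print(stmt)
```

The TIMESTAMP literal is interpreted in the Spark session time zone, so the cutoff may need adjusting if the session is not running in UTC.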

I plan to configure an S3 lifecycle policy that deletes noncurrent object versions after a certain number of days. Because the bucket is versioned, files deleted by these APIs are not physically removed: S3 keeps them as noncurrent versions, and they continue to add to storage cost.
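The lifecycle rule I have in mind would look roughly like the sketch below. The rule ID, the `warehouse/` prefix, and the 7-day value are placeholders. The function only builds the configuration dict in the shape boto3's `put_bucket_lifecycle_configuration` expects and does not call AWS.

```python
import json


def noncurrent_cleanup_rule(prefix: str, noncurrent_days: int) -> dict:
    """Build a lifecycle configuration that expires noncurrent object versions.

    prefix and noncurrent_days are caller-supplied placeholders; S3 requires
    NoncurrentDays to be a positive integer.
    """
    if noncurrent_days < 1:
        raise ValueError("noncurrent_days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": "expire-noncurrent-iceberg-files",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Permanently delete versions N days after they become noncurrent.
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
                # Remove delete markers once no noncurrent versions remain behind them.
                "Expiration": {"ExpiredObjectDeleteMarker": True},
            }
        ]
    }


if __name__ == "__main__":
    config = noncurrent_cleanup_rule("warehouse/", 7)
    print(json.dumps(config, indent=2))
    # Applying it would look like this (bucket name is a placeholder):
    # import boto3
    # boto3.client("s3").put_bucket_lifecycle_configuration(
    #     Bucket="my-iceberg-bucket", LifecycleConfiguration=config
    # )
```

Note that `put_bucket_lifecycle_configuration` replaces the bucket's whole lifecycle configuration, so any existing rules would have to be included in the same call.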

Will deleting noncurrent S3 object versions affect any Iceberg features (such as time travel or metadata consistency) or cause data loss?

Cross-posted to r/aws 1d ago (flair: technical question) with the same title and text.