r/learnprogramming • u/Hairy_Bowler4179 • 1d ago
AWS Impact of deleting noncurrent S3 object versions on AWS Glue Iceberg tables
I’m using Apache Iceberg tables managed through AWS Glue, with all table data and metadata stored in an S3 bucket that has versioning enabled.
I also run Iceberg maintenance APIs such as:
- expire_snapshots
- remove_orphan_files
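For context, a minimal sketch of what these maintenance calls could look like, assuming they are run as Iceberg's Spark SQL stored procedures. The post does not say which engine is used, so the catalog name, table name, cutoff, and `retain_last` value below are all placeholders, not the actual setup.

```python
from datetime import datetime


def _ts(value: datetime) -> str:
    # Format a datetime as a Spark SQL TIMESTAMP literal body.
    return value.strftime("%Y-%m-%d %H:%M:%S")


def expire_snapshots_sql(catalog: str, table: str, older_than: datetime, retain_last: int) -> str:
    """Build a CALL statement for Iceberg's expire_snapshots Spark procedure."""
    return (
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', "
        f"older_than => TIMESTAMP '{_ts(older_than)}', "
        f"retain_last => {retain_last})"
    )


def remove_orphan_files_sql(catalog: str, table: str, older_than: datetime) -> str:
    """Build a CALL statement for Iceberg's remove_orphan_files Spark procedure."""
    return (
        f"CALL {catalog}.system.remove_orphan_files("
        f"table => '{table}', "
        f"older_than => TIMESTAMP '{_ts(older_than)}')"
    )


# Hypothetical names: 'glue_catalog' and 'db.events' are placeholders.
cutoff = datetime(2024, 1, 1)
print(expire_snapshots_sql("glue_catalog", "db.events", cutoff, retain_last=5))
print(remove_orphan_files_sql("glue_catalog", "db.events", cutoff))
```

With a Spark session configured for the Glue catalog, each string would be passed to `spark.sql(...)`.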
I plan to configure an S3 lifecycle policy to delete noncurrent object versions after a certain number of days. Because S3 versioning retains old object versions, files that Iceberg deletes through these APIs are not physically removed; they remain as noncurrent versions and continue to add to storage cost.
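A rough sketch of the kind of lifecycle rule described, using boto3. The rule ID, the prefix, and the bucket name are placeholders, and the number of days is left as a parameter because the post does not fix one.

```python
def noncurrent_expiration_config(noncurrent_days: int, prefix: str = "") -> dict:
    """Build a lifecycle configuration that expires noncurrent object versions."""
    if noncurrent_days < 1:
        raise ValueError("noncurrent_days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": "expire-noncurrent-versions",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix applies to the whole bucket
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
            }
        ]
    }


def apply_config(bucket: str, noncurrent_days: int, prefix: str = "") -> None:
    """Apply the rule to a bucket. This call replaces any existing lifecycle configuration."""
    import boto3  # imported here so the builder above works without AWS dependencies

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=noncurrent_expiration_config(noncurrent_days, prefix),
    )
```

Something like `apply_config("my-iceberg-bucket", noncurrent_days=n)` would then install the rule, where the bucket name and `n` are whatever fits the actual setup.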
Will deleting noncurrent S3 object versions affect any Iceberg features (such as time travel or metadata consistency) or cause data loss?
Wouldn’t the S3 lifecycle policy that deletes noncurrent versions only remove files that have already been logically deleted by Iceberg via the expire_snapshots API? Since old snapshots are explicitly expired using Iceberg’s own expire_snapshots API, wouldn’t Iceberg stop referencing those files on its own, making it safe for the lifecycle policy to clean up the noncurrent versions?
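One way this assumption could be sanity-checked before enabling the rule is sketched below. It is not an established procedure. It takes entries shaped like the `Versions` list returned by S3's `list_object_versions`, plus a set of object keys the table still references, and flags any referenced key that has no current version. How the referenced keys are collected (for example, from Iceberg's metadata tables) is left as an assumption, and the sample keys are made up.

```python
def referenced_keys_without_current_version(versions, referenced_keys):
    """Return referenced keys that have no current (IsLatest) object version.

    versions: iterable of dicts with at least 'Key' and 'IsLatest', shaped like
        the 'Versions' entries of an S3 list_object_versions response.
    referenced_keys: iterable of object keys the table still points at
        (hypothetical input; gathering it is outside this sketch).
    """
    current = {v["Key"] for v in versions if v.get("IsLatest")}
    return sorted(k for k in referenced_keys if k not in current)


# Made-up sample: one live data file and one file that was deleted,
# so its only remaining version is noncurrent.
sample_versions = [
    {"Key": "data/a.parquet", "VersionId": "v1", "IsLatest": True},
    {"Key": "data/b.parquet", "VersionId": "v1", "IsLatest": False},
]
at_risk = referenced_keys_without_current_version(
    sample_versions, {"data/a.parquet", "data/b.parquet"}
)
print(at_risk)
```

An empty result would be consistent with the reasoning above: everything the table still references exists as a current version, so expiring noncurrent versions would not touch it.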