Apache Iceberg Snapshots and Time Travel
Every time you write to an Apache Iceberg table, the write produces a new snapshot: a complete, immutable record of the table's state at that moment. The previous snapshot stays in the metadata unchanged. Time travel, rollback, reproducible ML training runs, and safe concurrent reads all follow directly from this one design choice.
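To make the snapshot model concrete, here is a minimal in-memory sketch in Python. It is a toy, not Iceberg's metadata format or API: the names `ToyTable`, `Snapshot`, `append`, `files_as_of_version`, and `files_as_of_timestamp` are hypothetical and exist only for illustration. It assumes append-only writes and integer millisecond timestamps.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a snapshot cannot be modified after creation
class Snapshot:
    snapshot_id: int
    parent_id: Optional[int]   # the snapshot this one was built on
    timestamp_ms: int
    data_files: frozenset      # every data file visible in this snapshot


class ToyTable:
    """Illustrative only; real Iceberg metadata is far richer than this."""

    def __init__(self):
        self.snapshots = {}     # snapshot_id -> Snapshot
        self.current_id = None  # pointer to the current snapshot

    def append(self, new_files, timestamp_ms):
        # A write never edits an existing snapshot; it adds a new one.
        parent = self.snapshots.get(self.current_id)
        inherited = parent.data_files if parent else frozenset()
        snapshot_id = len(self.snapshots) + 1
        self.snapshots[snapshot_id] = Snapshot(
            snapshot_id, self.current_id, timestamp_ms,
            inherited | frozenset(new_files),
        )
        self.current_id = snapshot_id
        return snapshot_id

    def files_as_of_version(self, snapshot_id):
        # Time travel by ID: read the file set of that exact snapshot.
        return self.snapshots[snapshot_id].data_files

    def files_as_of_timestamp(self, timestamp_ms):
        # Time travel by time: latest snapshot committed at or before it.
        candidates = [s for s in self.snapshots.values()
                      if s.timestamp_ms <= timestamp_ms]
        if not candidates:
            raise ValueError("no snapshot at or before that time")
        return max(candidates, key=lambda s: s.timestamp_ms).data_files


table = ToyTable()
first = table.append(["a.parquet"], timestamp_ms=1_000)
second = table.append(["b.parquet"], timestamp_ms=2_000)
old_view = table.files_as_of_version(first)  # still only a.parquet
```

Because the second write leaves the first snapshot untouched, a reader holding `first` sees a stable view of the table regardless of later commits.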
How Snapshots Chain Together
Querying a Past Snapshot
-- Spark: query by snapshot ID
SELECT * FROM analytics.orders VERSION AS OF 8027658604211071520;
-- Spark: query by timestamp
SELECT * FROM analytics.orders TIMESTAMP AS OF '2026-03-10 08:00:00';
-- Trino
SELECT * FROM iceberg.analytics.orders
FOR VERSION AS OF 8027658604211071520;
# PyIceberg
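from pyiceberg.catalog import load_catalog
catalog = load_catalog("default")  # assumes a catalog named "default" is configured
table = catalog.load_table("analytics.orders")
scan = table.scan(snapshot_id=8027658604211071520)
Rolling Back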
-- Roll back to a specific snapshot
CALL system.rollback_to_snapshot('analytics.orders', 8027658604211071520);
-- Roll back to a timestamp
CALL system.rollback_to_timestamp('analytics.orders',
  TIMESTAMP '2026-03-09 23:59:00');
After a rollback, newer snapshots remain in metadata until explicitly expired. You can recover from a bad rollback by rolling forward again.
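The reason a rollback is reversible can be sketched in a few lines of Python. This is a simplified model, not Iceberg's implementation: `ToyHistory` and its methods are hypothetical names, and the sketch treats a rollback as nothing more than moving a "current" pointer while the list of known snapshots stays intact.

```python
class ToyHistory:
    """Illustrative model: a rollback only moves the current pointer."""

    def __init__(self):
        self.snapshot_ids = []  # every snapshot still tracked in metadata
        self.current = None

    def commit(self, snapshot_id):
        self.snapshot_ids.append(snapshot_id)
        self.current = snapshot_id

    def set_current(self, snapshot_id):
        # Works for rolling back or forward, as long as the target
        # snapshot has not been expired.
        if snapshot_id not in self.snapshot_ids:
            raise ValueError(f"snapshot {snapshot_id} is expired or unknown")
        self.current = snapshot_id

    def expire(self, snapshot_id):
        if snapshot_id == self.current:
            raise ValueError("cannot expire the current snapshot")
        self.snapshot_ids.remove(snapshot_id)


history = ToyHistory()
for snapshot_id in (1, 2, 3):
    history.commit(snapshot_id)

history.set_current(2)  # roll back: snapshot 3 is still tracked
history.set_current(3)  # roll forward again: nothing was lost
```

Once snapshot 3 is expired, the same `set_current(3)` call fails in this model, which is why it is worth confirming a rollback before running snapshot expiration.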
Branches and Tags
Named snapshot references let you point to specific snapshots with human-readable names. Branches are mutable: each one advances as new commits land on it. Tags are fixed: each one keeps pointing at the same snapshot, which makes it a stable marker for a specific point in history.
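The difference between the two reference types can be sketched as a pair of dictionaries. This is a toy model under simplifying assumptions, not Iceberg's reference handling: `ToyRefs` and its methods are hypothetical, snapshots are bare integers, and a fast-forward is allowed only when the target branch is a descendant of the branch being moved.

```python
class ToyRefs:
    """Illustrative model of branches (moving) and tags (fixed)."""

    def __init__(self, main_snapshot):
        self.branches = {"main": main_snapshot}
        self.tags = {}
        self.parents = {main_snapshot: None}  # snapshot -> parent snapshot

    def create_branch(self, name, from_branch="main"):
        self.branches[name] = self.branches[from_branch]

    def create_tag(self, name, snapshot_id):
        if name in self.tags:
            raise ValueError(f"tag {name} already exists")
        self.tags[name] = snapshot_id

    def commit(self, branch, snapshot_id):
        # A commit advances only the branch it was written to.
        self.parents[snapshot_id] = self.branches[branch]
        self.branches[branch] = snapshot_id

    def _is_ancestor(self, ancestor, descendant):
        node = descendant
        while node is not None:
            if node == ancestor:
                return True
            node = self.parents[node]
        return False

    def fast_forward(self, branch, to):
        target = self.branches[to]
        if not self._is_ancestor(self.branches[branch], target):
            raise ValueError("branches have diverged; cannot fast-forward")
        self.branches[branch] = target


refs = ToyRefs(main_snapshot=1)
refs.create_branch("staging")
refs.create_tag("ml_training", 1)
refs.commit("staging", 2)            # main still points at snapshot 1
refs.fast_forward("main", "staging")  # main now points at snapshot 2
```

Throughout the sequence the tag keeps pointing at snapshot 1, while `main` moves only at the explicit fast-forward step.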
-- Create a staging branch and write to it
ALTER TABLE analytics.orders CREATE BRANCH staging;
INSERT INTO analytics.orders.branch_staging SELECT * FROM new_orders;
-- Validate, then fast-forward main
CALL system.fast_forward('analytics.orders', 'main', 'staging');
-- Tag a snapshot for ML reproducibility
ALTER TABLE analytics.user_features
  CREATE TAG ml_training_2026_q1
  AS OF VERSION 8027658604211071520;
Write-Audit-Publish Pattern
Snapshot Retention
Every retained snapshot keeps the data files it references alive, so those files cannot be cleaned up while the snapshot exists. Expire snapshots regularly to control storage costs, and use named tags to preserve specific snapshots beyond the general retention window.
CALL system.expire_snapshots(
  table => 'analytics.orders',
  older_than => TIMESTAMP '2026-05-08 00:00:00',
  retain_last => 5
);
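The interaction between a cutoff time, a minimum number of retained snapshots, and tags can be sketched as a small planning function. This is a deliberately simplified model and not Iceberg's actual expiration algorithm: `plan_expiry` is a hypothetical helper, snapshots are `(snapshot_id, timestamp_ms)` pairs on a single linear history, and a tag is assumed to protect only the one snapshot it points at.

```python
def plan_expiry(snapshots, older_than_ms, retain_last=1, tagged_ids=frozenset()):
    """Return the IDs this simplified model would expire, oldest first.

    snapshots: iterable of (snapshot_id, timestamp_ms) pairs.
    """
    if retain_last < 1:
        raise ValueError("retain_last must be at least 1")

    ordered = sorted(snapshots, key=lambda s: s[1])
    newest = {snapshot_id for snapshot_id, _ in ordered[-retain_last:]}

    expired = []
    for snapshot_id, timestamp_ms in ordered:
        if timestamp_ms >= older_than_ms:
            continue  # inside the retention window
        if snapshot_id in newest or snapshot_id in tagged_ids:
            continue  # protected by retain_last or by a tag
        expired.append(snapshot_id)
    return expired


history = [(1, 100), (2, 200), (3, 300), (4, 400), (5, 500)]
to_expire = plan_expiry(history, older_than_ms=450, retain_last=2, tagged_ids={1})
# to_expire == [2, 3]: snapshot 1 is tagged, 4 is kept by retain_last,
# and 5 is newer than the cutoff.
```

In this model a data file becomes removable only once no surviving snapshot references it, which is why an old tag can keep storage in use long after the general retention window has passed.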