VACUUM table_name [RETAIN num HOURS] [DRY RUN]
Parameters
- table_name – Identifies an existing Delta table. The name must not include a temporal specification.
- RETAIN num HOURS – The retention threshold.
- DRY RUN – Return a list of up to 1000 files to be deleted.
List the files that would be deleted:
VACUUM table_name RETAIN 720 HOURS DRY RUN
Delete the files:
VACUUM table_name RETAIN 720 HOURS
Retention period | RETAIN value |
1 day | 24 HOURS |
1 week | 168 HOURS |
2 weeks | 336 HOURS |
30 days | 720 HOURS |
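The conversion in the table above is simply days multiplied by 24. A minimal Python sketch of it, where `retain_hours` and `vacuum_statement` are hypothetical helper names introduced here for illustration (they are not part of any Databricks API) and the output is just the SQL text shown earlier:

```python
def retain_hours(days: int) -> int:
    """Convert a retention period in whole days to the hour count used by RETAIN."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return days * 24


def vacuum_statement(table_name: str, days: int, dry_run: bool = False) -> str:
    """Build the VACUUM statement text (hypothetical helper, string only)."""
    stmt = f"VACUUM {table_name} RETAIN {retain_hours(days)} HOURS"
    # DRY RUN only lists the files; it does not delete anything.
    return stmt + " DRY RUN" if dry_run else stmt


if __name__ == "__main__":
    for days in (1, 7, 14, 30):
        print(days, "day(s) ->", retain_hours(days), "HOURS")
    print(vacuum_statement("table_name", 30, dry_run=True))
```

Generating the DRY RUN form first and reviewing the file list before running the real statement mirrors the two-step order used above.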
Configure data retention for time travel
To time travel to a previous version, you must retain both the log and the data files for that version.
The data files backing a Delta table are never deleted automatically; data files are deleted only when you run VACUUM. VACUUM does not delete Delta log files; log files are automatically cleaned up after checkpoints are written.
By default you can time travel to a Delta table up to 30 days old unless you have:
- Run VACUUM on your Delta table.
- Changed the data or log file retention periods using the following table properties:
  - delta.logRetentionDuration = "interval <interval>": controls how long the history for a table is kept. The default is interval 30 days. Each time a checkpoint is written, Databricks automatically cleans up log entries older than the retention interval. If you set this config to a large enough value, many log entries are retained. This should not impact performance as operations against the log are constant time. Operations on history are parallel but will become more expensive as the log size increases.
  - delta.deletedFileRetentionDuration = "interval <interval>": controls how long ago a file must have been deleted before being a candidate for VACUUM. The default is interval 7 days.
To access 30 days of historical data even if you run VACUUM on the Delta table, set delta.deletedFileRetentionDuration = "interval 30 days". This setting may cause your storage costs to go up.
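Because time travel needs both the log entry and the data files for a version, the window you can rely on is bounded by the shorter of the two retention periods once VACUUM is in use. The following Python sketch models only that reasoning. `time_travel_window` is a hypothetical name, and the model assumes VACUUM is run with its default threshold (no explicit RETAIN); it is a simplification, not a description of Delta Lake internals.

```python
from datetime import timedelta

# Defaults as stated in the text above.
DEFAULT_LOG_RETENTION = timedelta(days=30)
DEFAULT_DELETED_FILE_RETENTION = timedelta(days=7)


def time_travel_window(
    log_retention: timedelta = DEFAULT_LOG_RETENTION,
    deleted_file_retention: timedelta = DEFAULT_DELETED_FILE_RETENTION,
    vacuum_is_run: bool = True,
) -> timedelta:
    """Estimate how far back time travel can be relied on (simplified model)."""
    if not vacuum_is_run:
        # Data files are only removed by VACUUM, so the log is the only limit.
        return log_retention
    # Both the log entry and the data files must still exist.
    return min(log_retention, deleted_file_retention)


if __name__ == "__main__":
    print(time_travel_window())                                          # defaults, VACUUM run
    print(time_travel_window(deleted_file_retention=timedelta(days=30)))  # property raised to 30 days
    print(time_travel_window(vacuum_is_run=False))                        # VACUUM never run
```

Under these assumptions, raising delta.deletedFileRetentionDuration to 30 days brings the window back in line with the default log retention.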
If you run VACUUM on its own (without RETAIN), data files that are past the default 7-day threshold are deleted.
To keep data files for longer than 7 days without having to specify RETAIN num HOURS on every run, SET delta.deletedFileRetentionDuration on the table first, and then run VACUUM.
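The order matters here: the table property is set first, and only then is VACUUM run without a RETAIN clause. A small Python sketch that produces the two statements in that order; `retention_then_vacuum` is a hypothetical helper that only builds SQL text, and the table name is a placeholder:

```python
from typing import List


def retention_then_vacuum(table_name: str, days: int) -> List[str]:
    """Return the SQL statements in the order they should be executed."""
    if days <= 0:
        raise ValueError("days must be positive")
    return [
        # 1. Persist the retention period on the table itself.
        f"ALTER TABLE {table_name} SET TBLPROPERTIES "
        f"('delta.deletedFileRetentionDuration' = 'interval {days} days')",
        # 2. Run VACUUM with no RETAIN clause so the table property applies.
        f"VACUUM {table_name}",
    ]


if __name__ == "__main__":
    for statement in retention_then_vacuum("dbx.tab1", 30):
        print(statement + ";")
```

In a notebook, each string could then be passed to `spark.sql(...)`, assuming a Spark session with Delta Lake support is available.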
SET and UNSET TBLPROPERTIES
ALTER TABLE table_name
  { RENAME TO clause |
    ADD COLUMN clause |
    ALTER COLUMN clause |
    DROP COLUMN clause |
    RENAME COLUMN clause |
    ADD CONSTRAINT clause |
    DROP CONSTRAINT clause |
    ADD PARTITION clause |
    DROP PARTITION clause |
    RENAME PARTITION clause |
    RECOVER PARTITIONS clause |
    SET TBLPROPERTIES clause |
    UNSET TBLPROPERTIES clause |
    SET SERDE clause |
    SET LOCATION clause |
    SET OWNER TO clause }
Example
ALTER TABLE dbx.tab1 SET TBLPROPERTIES ('delta.logRetentionDuration' = 'interval 30 days');
ALTER TABLE dbx.tab1 SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = 'interval 30 days');
ALTER TABLE dbx.tab1 UNSET TBLPROPERTIES ('delta.deletedFileRetentionDuration');