Optimizes the layout of Delta Lake data. Optionally optimize a subset of data or colocate data by column. If you do not specify colocation, bin-packing optimization is performed.
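To give a feel for what bin-packing means here, the following is a minimal sketch of the general idea: many small files are grouped so that each group approaches a target size. It is not Delta Lake's actual algorithm. The function name `bin_pack`, the `target_mb` parameter, and the file sizes are all hypothetical, and the first-fit-decreasing heuristic is chosen only because it is short.

```python
from typing import List


def bin_pack(file_sizes_mb: List[int], target_mb: int) -> List[List[int]]:
    """Group file sizes into bins whose totals stay at or below target_mb.

    First-fit decreasing: visit sizes from largest to smallest and place each
    one into the first bin that still has room, opening a new bin when none
    does. A file larger than target_mb ends up alone in its own bin.
    """
    bins: List[List[int]] = []
    for size in sorted(file_sizes_mb, reverse=True):
        for current in bins:
            if sum(current) + size <= target_mb:
                current.append(size)
                break
        else:
            # No existing bin had room for this file.
            bins.append([size])
    return bins


# Hypothetical sizes (in MB) of six small files in one table.
small_files = [10, 200, 35, 120, 60, 75]
compacted = bin_pack(small_files, target_mb=256)
print(compacted)
```

With these made-up inputs, the six small files end up in two groups, each of which could be rewritten as a single larger file.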
Syntax
OPTIMIZE table_name [WHERE predicate] [ZORDER BY (col_name1 [, ...] ) ]
Parameters
- table_name – Identifies an existing Delta table. The name must not include a temporal specification.
- WHERE – Optimize the subset of rows matching the given partition predicate. Only filters involving partition key attributes are supported.
- ZORDER BY – Colocate column information in the same set of files. Co-locality is used by Delta Lake data-skipping algorithms to dramatically reduce the amount of data that needs to be read. You can specify multiple columns for ZORDER BY as a comma-separated list. However, the effectiveness of the locality drops with each additional column.
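The colocation behind ZORDER BY can be illustrated with a generic Z-order (Morton) key, which interleaves the bits of several column values into one sortable integer. The sketch below shows that general technique only; how Delta Lake computes its ordering internally is not described here, and the helper name `z_value` and the `bits` parameter are assumptions made for illustration.

```python
from typing import Sequence


def z_value(coords: Sequence[int], bits: int = 4) -> int:
    """Interleave the low `bits` bits of each coordinate into one integer."""
    key = 0
    dims = len(coords)
    for bit in range(bits):
        for dim, value in enumerate(coords):
            # Bit `bit` of dimension `dim` lands at position bit * dims + dim.
            key |= ((value >> bit) & 1) << (bit * dims + dim)
    return key


# Sort a 4 x 4 grid of (x, y) pairs by their interleaved key.
points = [(x, y) for x in range(4) for y in range(4)]
ordered = sorted(points, key=z_value)
print(ordered[:4])
```

Rows that are close in both columns receive nearby keys, so they tend to land in the same files after sorting. One intuition for the reduced effectiveness with more columns: the interleaved key shares its high-order bits among all listed columns, so each additional column leaves every other column with less influence on the final ordering.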
Examples
OPTIMIZE delta.`/data/events`
OPTIMIZE events
OPTIMIZE events WHERE date >= '2022-11-18'
OPTIMIZE events WHERE date >= current_timestamp() - INTERVAL 1 day ZORDER BY (eventType)