Databricks SQL – Delta Commands

SELECT on Delta Tables

So far, the SQL commands we have used are generic to most flavors of SQL. In the next few queries, we are going to look at commands that are specific to using SELECT on Delta tables.

Delta tables keep a log of changes that we can view by running the command below.

After running DESCRIBE HISTORY, we can see that we are on version number 0 and we can see a timestamp of when this change was made.

USE demo;
DESCRIBE HISTORY customers;

SELECT on Delta Tables — Updating the Table

We are going to make a change to the table.

The code uses an UPDATE statement to make a change to the table. We will be discussing UPDATE later on. For now, we just need to understand that a change was made to the table. We also reran our DESCRIBE HISTORY command, and note that we have a new version in the log, with a new timestamp.

USE demo;
UPDATE customers SET loyalty_segment = 10 WHERE loyalty_segment = 0;
DESCRIBE HISTORY customers;

SELECT on Delta Tables — VERSION AS OF

We can now use a special predicate for use with Delta tables: VERSION AS OF

By using VERSION AS OF, we can SELECT from specific versions of the table. This feature of Delta tables is called “Time Travel,” and is very powerful.

We can also use TIMESTAMP AS OF to SELECT based on a table’s state on a specific date, and you can find more information in the documentation.

USE demo;
SELECT loyalty_segment FROM customers VERSION AS OF 1;

MERGE INTO

Certainly, there are times when we want to insert new data but ensure we don’t re-insert matched data. This is where we use MERGE INTO. MERGE INTO will merge two tables together, but you specify in which column to look for matched data and what to do when a match is found. Let’s run the code and examine the command in more detail.

USE demo;
MERGE INTO suppliers
    USING source_suppliers
    ON suppliers.SUPPLIER_ID = source_suppliers.SUPPLIER_ID
    WHEN NOT MATCHED THEN INSERT *;
SELECT count(*) FROM suppliers;