- Databases and tables
- Views
- Reading from a Table/View
- Temporary Views
1. Databases and tables
An Azure Databricks database (schema) is a collection of tables. An Azure Databricks table is a collection of structured data. You can cache, filter, and perform any operations supported by Apache Spark DataFrames on Azure Databricks tables. You can query tables with Spark APIs and Spark SQL.
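For a quick illustration, the same table can be queried either through the DataFrame API or through Spark SQL. This is only a sketch: the names demo_db.demo_table and the column amount are assumed here, not taken from an actual workspace.

```python
# Sketch: querying one table through the DataFrame API and through Spark SQL.
# demo_db.demo_table and the column `amount` are assumed names for illustration.

# DataFrame API: read the table, then filter and cache it like any DataFrame
df = spark.read.table("demo_db.demo_table")
over_100 = df.filter(df.amount > 100).cache()
print(over_100.count())

# Spark SQL: the equivalent query expressed in SQL
spark.sql("SELECT * FROM demo_db.demo_table WHERE amount > 100").show(5)
```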
2. Views
A view constructs a virtual table, with no physical data, based on the result set of a SQL query. ALTER VIEW and DROP VIEW only change metadata.
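As a sketch of how this looks in practice, a view over the table used later in this article could be created, renamed, and dropped with spark.sql(); the view names here are assumed for illustration.

```python
# Sketch: create, rename, and drop a view with spark.sql().
# The view names are assumed; jack_db.jack_table1 and user_id come from the
# examples later in this article.
spark.sql("""
    CREATE OR REPLACE VIEW jack_db.jack_user_ids AS
    SELECT user_id FROM jack_db.jack_table1
""")

# ALTER VIEW and DROP VIEW only change metadata in the metastore;
# the underlying table data is never rewritten.
spark.sql("ALTER VIEW jack_db.jack_user_ids RENAME TO jack_db.jack_user_ids_v2")
spark.sql("DROP VIEW IF EXISTS jack_db.jack_user_ids_v2")
```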
3. Reading from a Table/View
tempDF = spark.read.table("jack_db.jack_table1")
tempDF.printSchema()
display(tempDF)
%sql SELECT * FROM jack_db.jack_table1 LIMIT 5
Review: Reading from Tables
- No job is executed: the schema is stored in the table definition on Databricks (see the sketch below).
- The file itself is stored on DBFS. If we used JDBC, reading the table would open a connection to the database and read the data in; if we used an object store (like the one backing DBFS), the data would be read from that source.
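To make the first point concrete, here is a minimal sketch (the table name is assumed): the schema comes from the table definition in the metastore, so no Spark job runs until an action actually needs the data.

```python
# Sketch: schema access uses only table metadata, so no Spark job runs.
# A job is triggered only by an action that has to read the data itself.
# demo_db.demo_table is an assumed name for illustration.
df = spark.read.table("demo_db.demo_table")   # no job: metadata lookup only
df.printSchema()                              # no job: schema from the table definition

df.count()                                    # job: data is read from the backing storage
```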
4. Temporary Views
We can create a temporary view with createOrReplaceTempView().
# create a temporary view from the resulting DataFrame
tempDF.createOrReplaceTempView("jack_view1")
%sql SELECT * FROM jack_view1 ORDER BY user_id DESC LIMIT 5
**Note #1:** The method createOrReplaceTempView(..) is bound to the SparkSession, meaning the view will be discarded once the session ends.
**Note #2:** The method createOrReplaceGlobalTempView(..), on the other hand, is bound to the Spark application.
Or to put that another way, I can use createOrReplaceTempView(..) in this notebook only, whereas I can call createOrReplaceGlobalTempView(..) in this notebook and then access the view from another.
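As a sketch of the difference, a global temporary view is registered under the global_temp database and stays visible to other notebooks attached to the same cluster (the same Spark application); the view names below are assumed for illustration.

```python
# Notebook A: a regular temp view is visible only in this notebook's SparkSession,
# while a global temp view is registered under the `global_temp` database and is
# visible to every notebook in the same Spark application. View names are assumed.
tempDF.createOrReplaceTempView("jack_view_session")
tempDF.createOrReplaceGlobalTempView("jack_view_global")

# Notebook B (attached to the same cluster): the session-scoped view is not
# visible here, but the global one is, qualified with the global_temp database.
spark.sql("SELECT * FROM global_temp.jack_view_global LIMIT 5").show()
```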