CLI
Install the CLI
pip install databricks-cli
To upgrade an existing installation, run:
pip install databricks-cli --upgrade
Set up authentication
databricks configure --token
After you complete the prompts, your access credentials are stored in the file ~/.databrickscfg on Unix, Linux, or macOS, or %USERPROFILE%\.databrickscfg on Windows. The file contains a default profile entry:
[DEFAULT]
host = <workspace-URL>
token = <personal-access-token>
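The CLI also supports named connection profiles, which the `--profile` flag in the commands below selects. A sketch of what ~/.databrickscfg might look like after adding a second profile named `test` with `databricks configure --token --profile test` (the hosts and tokens here are illustrative placeholders):

```ini
; Default profile, used when no --profile flag is given
[DEFAULT]
host = https://dbc-a1b2345c-d6e7.cloud.databricks.com
token = <personal-access-token>

; Named profile, selected with --profile test
[test]
host = https://dbc-x9y8z7w6-v5u4.cloud.databricks.com
token = <personal-access-token>
```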
$ databricks --help
Usage: databricks [OPTIONS] COMMAND [ARGS]...

Options:
  -v, --version   x.xx.x
  --debug         Debug Mode. Shows full stack trace on error.
  --profile TEXT  CLI connection profile to use. The default profile is
                  "DEFAULT".
  -h, --help      Show this message and exit.

Commands:
  clusters        Utility to interact with Databricks clusters.
  configure       Configures host and authentication info for the CLI.
  fs              Utility to interact with DBFS.
  groups          Utility to interact with Databricks groups.
  instance-pools  Utility to interact with Databricks instance pools.
  jobs            Utility to interact with jobs.
  libraries       Utility to interact with libraries.
  pipelines       Utility to interact with the Databricks Delta Pipelines.
  runs            Utility to interact with the jobs runs.
  secrets         Utility to interact with Databricks secret API.
  stack           [Beta] Utility to deploy and download Databricks resource
                  stacks.
  workspace       Utility to interact with the Databricks workspace.
databricks jobs list --profile test
databricks jobs list --profile test --output JSON | jq '.jobs[] | select(.job_id == 123) | .settings'
databricks clusters list --profile test
databricks clusters list --profile test --output JSON | jq '[ .clusters[] | { name: .cluster_name, id: .cluster_id } ]'
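If jq is not available, the same extraction can be done in Python. A minimal sketch that mimics the jq filter above on JSON in the shape returned by `databricks clusters list --output JSON` (the cluster names and IDs in the sample payload are made up):

```python
import json

# Sample payload in the shape of `databricks clusters list --output JSON`;
# the cluster names and IDs below are illustrative, not real.
raw = '''
{
  "clusters": [
    {"cluster_name": "etl-cluster", "cluster_id": "1234-567890-abcd123", "state": "RUNNING"},
    {"cluster_name": "ml-cluster",  "cluster_id": "2345-678901-bcde234", "state": "TERMINATED"}
  ]
}
'''

data = json.loads(raw)

# Equivalent of: jq '[ .clusters[] | { name: .cluster_name, id: .cluster_id } ]'
summary = [{"name": c["cluster_name"], "id": c["cluster_id"]} for c in data["clusters"]]
print(json.dumps(summary, indent=2))
```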
JSON string parameters
databricks jobs run-now --job-id 9 --jar-params '["20180505", "alantest"]'
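The single quotes keep the shell from splitting the JSON list into separate arguments, so the CLI receives it as one JSON string. When building such an argument programmatically, `json.dumps` produces a correctly escaped string; a small Python sketch (the parameter values mirror the example above):

```python
import json

# Parameters to pass to the jar; values mirror the CLI example above.
jar_params = ["20180505", "alantest"]

# json.dumps yields the exact JSON text to wrap in single quotes on the
# command line, e.g.:
#   databricks jobs run-now --job-id 9 --jar-params '["20180505", "alantest"]'
arg = json.dumps(jar_params)
print(arg)  # ["20180505", "alantest"]
```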
Runs CLI
Requirements to call the Jobs REST API 2.0
Update the CLI to version 0.16.0 or above
Run the command
databricks jobs configure --version=2.0
This adds the setting jobs-api-version = 2.0 to the file ~/.databrickscfg on Unix, Linux, or macOS, or %USERPROFILE%\.databrickscfg on Windows. All runs CLI (and jobs CLI) subcommands will then call the Jobs REST API 2.0 by default.
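After the command runs, the default profile entry might look like this (the host and token lines are the placeholders from the earlier example; only the jobs-api-version line is added):

```ini
[DEFAULT]
host = <workspace-URL>
token = <personal-access-token>
jobs-api-version = 2.0
```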
Subcommands and general usage
$ databricks runs --help
Usage: databricks runs [OPTIONS] COMMAND [ARGS]...

  Utility to interact with jobs runs.

Options:
  -v, --version   0.11.0
  --debug         Debug Mode. Shows full stack trace on error.
  --profile TEXT  CLI connection profile to use. The default profile is
                  "DEFAULT".
  -h, --help      Show this message and exit.

Commands:
  cancel      Cancels the run specified.
  get         Gets the metadata about a run in json form.
  get-output  Gets the output of a run The output schema is documented...
  list        Lists job runs.
  submit      Submits a one-time run.
Get the output of a run
databricks runs get-output --run-id 119
{
  "metadata": {
    "job_id": 239,
    "run_id": 119,
    "number_in_job": 1,
    "original_attempt_run_id": 119,
    "state": {
      "life_cycle_state": "TERMINATED",
      "result_state": "SUCCESS",
      "state_message": ""
    },
    "task": {
      "notebook_task": {
        "notebook_path": "/Users/someone@example.com/notebooks/my-notebook.ipynb"
      }
    },
    "cluster_spec": {
      "new_cluster": {
        "spark_version": "8.1.x-scala2.12",
        "aws_attributes": {
          "zone_id": "us-west-2c",
          "availability": "SPOT_WITH_FALLBACK"
        },
        "node_type_id": "m5d.large",
        "enable_elastic_disk": false,
        "num_workers": 1
      }
    },
    "cluster_instance": {
      "cluster_id": "1234-567890-abcd123",
      "spark_context_id": "1234567890123456789"
    },
    "start_time": 1618510327335,
    "setup_duration": 191000,
    "execution_duration": 41000,
    "cleanup_duration": 2000,
    "end_time": 1618510561615,
    "trigger": "ONE_TIME",
    "creator_user_name": "someone@example.com",
    "run_name": "my-notebook-run",
    "run_page_url": "https://dbc-a1b2345c-d6e7.cloud.databricks.com/?o=1234567890123456#job/239/run/1",
    "run_type": "JOB_RUN",
    "attempt_number": 0
  },
  "notebook_output": {}
}
The notebook_output field is populated with the value the notebook passes to dbutils.notebook.exit().
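A minimal Python sketch for checking the outcome of a run programmatically: it parses an abbreviated version of the get-output response above and inspects the state fields (field names and values are taken from that sample output):

```python
import json

# Abbreviated form of the `databricks runs get-output` response shown above.
raw = '''
{
  "metadata": {
    "job_id": 239,
    "run_id": 119,
    "state": {
      "life_cycle_state": "TERMINATED",
      "result_state": "SUCCESS",
      "state_message": ""
    },
    "run_page_url": "https://dbc-a1b2345c-d6e7.cloud.databricks.com/?o=1234567890123456#job/239/run/1"
  },
  "notebook_output": {}
}
'''

run = json.loads(raw)
state = run["metadata"]["state"]

# result_state is only meaningful once life_cycle_state is terminal
# (e.g. TERMINATED), so check both fields.
succeeded = (
    state["life_cycle_state"] == "TERMINATED"
    and state.get("result_state") == "SUCCESS"
)
print(succeeded)  # True
```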