Use databricks_pipeline to deploy Delta Live Tables (DLT).
resource "databricks_notebook" "dlt_demo" {
#...
}
resource "databricks_pipeline" "this" {
name = "Pipeline Name"
storage = "/test/first-pipeline"
configuration = {
key1 = "value1"
key2 = "value2"
}
cluster {
label = "default"
num_workers = 2
custom_tags = {
cluster_type = "default"
}
}
cluster {
label = "maintenance"
num_workers = 1
custom_tags = {
cluster_type = "maintenance"
}
}
library {
notebook {
path = databricks_notebook.dlt_demo.id
}
}
continuous = false
}
The following arguments are supported:
- name - A user-friendly name for this pipeline. The name can be used to identify pipeline jobs in the UI.
- storage - A location on DBFS or cloud storage where output data and metadata required for pipeline execution are stored. By default, tables are stored in a subdirectory of this location. Changing this parameter forces recreation of the pipeline.
- configuration - An optional list of values to apply to the entire pipeline. Elements must be formatted as key:value pairs.
- library blocks - Specifies pipeline code and required artifacts. The syntax resembles the library configuration block, with the addition of a special notebook type of library that should have the path attribute. Right now only the notebook type is supported.
- cluster blocks - Clusters to run the pipeline. If none is specified, pipelines will automatically select a default cluster configuration for the pipeline. Please note that DLT pipeline clusters support only a subset of attributes, as described in the documentation.
- continuous - A flag indicating whether to run the pipeline continuously. The default value is false.
- development - A flag indicating whether to run the pipeline in development mode. The default value is false.
- photon - A flag indicating whether to use the Photon engine. The default value is false.
- target - The name of a database for persisting pipeline output data. Configuring the target setting allows you to view and query the pipeline output data from the Databricks UI.
- edition - Optional name of the product edition. Supported values are: core, pro, advanced (default).
- channel - Optional name of the release channel for the Spark version used by the DLT pipeline. Supported values are: current (default) and preview.
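As a minimal sketch of how the optional arguments above fit together (the resource name, storage path, target database, and notebook path here are placeholders, not values from this guide):

```hcl
resource "databricks_pipeline" "sketch" {
  name    = "Sketch Pipeline"       # placeholder name
  storage = "/test/sketch-pipeline" # placeholder storage location

  # Optional behaviour flags; each defaults to false.
  continuous  = false
  development = true
  photon      = false

  # Persist output tables into this database so they can be queried from the UI.
  target = "sketch_db" # placeholder database name

  # Optional product edition and release channel (defaults: advanced, current).
  edition = "advanced"
  channel = "current"

  library {
    notebook {
      path = "/Shared/dlt_sketch" # placeholder notebook path
    }
  }
}
```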
The resource pipeline can be imported using the ID of the pipeline:

```bash
$ terraform import databricks_pipeline.this <pipeline-id>
```
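Alternatively, on Terraform 1.5 and later, the same import can be expressed declaratively with an import block (a sketch; replace <pipeline-id> with the actual pipeline ID):

```hcl
import {
  to = databricks_pipeline.this
  id = "<pipeline-id>"
}
```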
The following resources are often used in the same context:
- End-to-end workspace management guide.
- databricks_cluster to create Databricks Clusters.
- databricks_job to manage Databricks Jobs to run non-interactive code in a databricks_cluster.
- databricks_notebook to manage Databricks Notebooks.