Move databricks provider to new structure (apache#46207)
* refactor(providers/databricks): move databricks provider to new structure

* remove unused caplog

---------

Co-authored-by: Jarek Potiuk <[email protected]>
2 people authored and niklasr22 committed Feb 8, 2025
1 parent 61f87ca commit 42939f6
Showing 89 changed files with 677 additions and 92 deletions.
5 changes: 1 addition & 4 deletions .github/boring-cyborg.yml
@@ -112,10 +112,7 @@ labelPRBasedOnFilePath:
- providers/standard/**

provider:databricks:
-  - providers/src/airflow/providers/databricks/**/*
-  - docs/apache-airflow-providers-databricks/**/*
-  - providers/tests/databricks/**/*
-  - providers/tests/system/databricks/**/*
+  - providers/databricks/**

provider:datadog:
- providers/datadog/**
3 changes: 3 additions & 0 deletions dev/moving_providers/move_providers.py
@@ -371,6 +371,8 @@ def move_provider_yaml(provider_id: str) -> tuple[list[str], list[str], list[str
         if line.startswith(" logo: "):
             logo_path = line[len(" logo: ") :]
             logo_name = logo_path.split("/")[-1]
+            if logo_path in already_moved_logos:
+                continue
             new_logo_dir = (
                 PROVIDERS_DIR_PATH / _get_provider_only_path(provider_id) / "docs" / "integration-logos"
             )
@@ -386,6 +388,7 @@
                 remove_empty_parent_dir=True,
             )
             line = f" logo: /docs/integration-logos/{logo_name}"
+            already_moved_logos.add(logo_path)
         if line == "dependencies:" and not in_dependencies:
             in_dependencies = True
             continue
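
The added guard exists because several integrations in one provider.yaml can point at the same logo file: the first entry moves it, and later entries must be skipped. A minimal sketch of that idempotent-move pattern (``move_logo_once`` and its arguments are illustrative, not the script's actual helpers):

.. code-block:: python

    import shutil
    from pathlib import Path

    already_moved_logos: set[str] = set()


    def move_logo_once(logo_path: str, new_logo_dir: Path) -> None:
        # Several integration entries may reference the same logo file;
        # only the first reference actually moves it.
        if logo_path in already_moved_logos:
            return
        new_logo_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(logo_path, str(new_logo_dir / Path(logo_path).name))
        already_moved_logos.add(logo_path)
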
1 change: 1 addition & 0 deletions docs/.gitignore
@@ -28,6 +28,7 @@ apache-airflow-providers-cohere
apache-airflow-providers-common-compat
apache-airflow-providers-common-io
apache-airflow-providers-common-sql
+apache-airflow-providers-databricks
apache-airflow-providers-datadog
apache-airflow-providers-dbt-cloud
apache-airflow-providers-dingding
25 changes: 0 additions & 25 deletions docs/apache-airflow-providers-databricks/changelog.rst

This file was deleted.

87 changes: 87 additions & 0 deletions providers/databricks/README.rst
@@ -0,0 +1,87 @@

.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements.  See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership.  The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License.  You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied.  See the License for the
   specific language governing permissions and limitations
   under the License.

.. NOTE! THIS FILE IS AUTOMATICALLY GENERATED AND WILL BE OVERWRITTEN!

.. IF YOU WANT TO MODIFY TEMPLATE FOR THIS FILE, YOU SHOULD MODIFY THE TEMPLATE
   `PROVIDER_README_TEMPLATE.rst.jinja2` IN the `dev/breeze/src/airflow_breeze/templates` DIRECTORY
Package ``apache-airflow-providers-databricks``

Release: ``7.0.0``


`Databricks <https://databricks.com/>`__


Provider package
----------------

This is a provider package for the ``databricks`` provider. All classes for this provider
package are in the ``airflow.providers.databricks`` Python package.

You can find package information and changelog for the provider
in the `documentation <https://airflow.apache.org/docs/apache-airflow-providers-databricks/7.0.0/>`_.

Installation
------------

You can install this package on top of an existing Airflow 2 installation (see ``Requirements`` below
for the minimum Airflow version supported) via
``pip install apache-airflow-providers-databricks``.

The package supports the following Python versions: 3.9, 3.10, 3.11, 3.12

Requirements
------------

======================================= ==================
PIP package                             Version required
======================================= ==================
``apache-airflow``                      ``>=2.9.0``
``apache-airflow-providers-common-sql`` ``>=1.20.0``
``requests``                            ``>=2.27.0,<3``
``databricks-sql-connector``            ``>=3.0.0``
``aiohttp``                             ``>=3.9.2,<4``
``mergedeep``                           ``>=1.3.4``
``pandas``                              ``>=2.1.2,<2.2``
``pyarrow``                             ``>=14.0.1``
======================================= ==================

Cross provider package dependencies
-----------------------------------

These are dependencies that might be needed in order to use all the features of the package.
You need to install the specified provider packages in order to use them.

You can install such cross-provider dependencies when installing from PyPI. For example:

.. code-block:: bash

    pip install apache-airflow-providers-databricks[common.sql]

============================================================================================================ ==============
Dependent package                                                                                            Extra
============================================================================================================ ==============
`apache-airflow-providers-common-sql <https://airflow.apache.org/docs/apache-airflow-providers-common-sql>`_ ``common.sql``
============================================================================================================ ==============

The changelog for the provider package can be found in the
`changelog <https://airflow.apache.org/docs/apache-airflow-providers-databricks/7.0.0/changelog.html>`_.
@@ -139,7 +139,7 @@ Features
Misc
~~~~

-* ``Removed deprecated method referance airflow.www.auth.has_access when min airflow version >= 2.8.0 (#41747)``
+* ``Removed deprecated method reference airflow.www.auth.has_access when min airflow version >= 2.8.0 (#41747)``
* ``remove deprecated soft_fail from providers (#41710)``

6.9.0
@@ -451,7 +451,7 @@ Misc
Features
~~~~~~~~

* ``Add "QUEUED" to RUN_LIFE_CYCLE_STATES following deployement of … (#33886)``
* ``Add "QUEUED" to RUN_LIFE_CYCLE_STATES following deployment of … (#33886)``
* ``allow DatabricksSubmitRunOperator to accept a pipeline name for a pipeline_task (#32903)``

Misc
File renamed without changes.
File renamed without changes.
@@ -46,7 +46,7 @@ Importing CSV data

An example usage of the DatabricksCopyIntoOperator to import CSV data into a table is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sql.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sql.py
    :language: python
    :start-after: [START howto_operator_databricks_copy_into]
    :end-before: [END howto_operator_databricks_copy_into]
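
Since the included example file is not rendered in this diff, here is a minimal sketch of such a COPY INTO task; the connection id, endpoint name, table, and file location are placeholders:

.. code-block:: python

    from airflow.providers.databricks.operators.databricks_sql import DatabricksCopyIntoOperator

    import_csv = DatabricksCopyIntoOperator(
        task_id="import_csv",
        databricks_conn_id="databricks_default",
        sql_endpoint_name="my_endpoint",
        table_name="my_table",
        file_format="CSV",
        file_location="abfss://container@account.example.net/data/",
        format_options={"header": "true"},
    )
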
@@ -67,7 +67,7 @@ Specifying parameters as JSON

An example usage of the DatabricksCreateJobsOperator is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
    :language: python
    :start-after: [START howto_operator_databricks_jobs_create_json]
    :end-before: [END howto_operator_databricks_jobs_create_json]
@@ -77,7 +77,7 @@ Using named parameters

You can also use named parameters to initialize the operator and run the job.

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
    :language: python
    :start-after: [START howto_operator_databricks_jobs_create_named]
    :end-before: [END howto_operator_databricks_jobs_create_named]
@@ -88,7 +88,7 @@ Pairing with DatabricksRunNowOperator
You can use the ``job_id`` that is returned by the DatabricksCreateJobsOperator in the
return_value XCom as an argument to the DatabricksRunNowOperator to run the job.

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
    :language: python
    :start-after: [START howto_operator_databricks_run_now]
    :end-before: [END howto_operator_databricks_run_now]
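
A hedged sketch of the pairing described above (``job_spec`` is a placeholder job definition, not a name from the example file):

.. code-block:: python

    from airflow.providers.databricks.operators.databricks import (
        DatabricksCreateJobsOperator,
        DatabricksRunNowOperator,
    )

    jobs_create = DatabricksCreateJobsOperator(task_id="jobs_create", json=job_spec)

    run_now = DatabricksRunNowOperator(
        task_id="run_now",
        job_id="{{ ti.xcom_pull(task_ids='jobs_create', key='return_value') }}",
    )

    jobs_create >> run_now
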
@@ -31,14 +31,14 @@ Examples

Running a notebook in Databricks on a new cluster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
    :language: python
    :start-after: [START howto_operator_databricks_notebook_new_cluster]
    :end-before: [END howto_operator_databricks_notebook_new_cluster]

Running a notebook in Databricks on an existing cluster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
    :language: python
    :start-after: [START howto_operator_databricks_notebook_existing_cluster]
    :end-before: [END howto_operator_databricks_notebook_existing_cluster]
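
As the example file is not shown here, a minimal sketch of a notebook task on a new cluster (the cluster spec values and notebook path are placeholders):

.. code-block:: python

    from airflow.providers.databricks.operators.databricks import DatabricksNotebookOperator

    notebook_on_new_cluster = DatabricksNotebookOperator(
        task_id="notebook_task",
        databricks_conn_id="databricks_default",
        notebook_path="/Shared/example_notebook",
        source="WORKSPACE",
        new_cluster={
            "spark_version": "15.4.x-scala2.12",  # placeholder runtime version
            "node_type_id": "i3.xlarge",
            "num_workers": 1,
        },
    )
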
@@ -63,7 +63,7 @@ Create a Databricks Repo

An example usage of the DatabricksReposCreateOperator is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_repos.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_repos.py
    :language: python
    :start-after: [START howto_operator_databricks_repo_create]
    :end-before: [END howto_operator_databricks_repo_create]
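
A minimal sketch (the repo URL and workspace path are placeholders):

.. code-block:: python

    from airflow.providers.databricks.operators.databricks_repos import DatabricksReposCreateOperator

    create_repo = DatabricksReposCreateOperator(
        task_id="create_repo",
        repo_path="/Repos/user@example.com/demo-repo",
        git_url="https://github.com/example/demo-repo",
    )
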
@@ -55,7 +55,7 @@ Deleting Databricks Repo by specifying path

An example usage of the DatabricksReposDeleteOperator is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_repos.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_repos.py
    :language: python
    :start-after: [START howto_operator_databricks_repo_delete]
    :end-before: [END howto_operator_databricks_repo_delete]
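
A minimal sketch with a placeholder workspace path:

.. code-block:: python

    from airflow.providers.databricks.operators.databricks_repos import DatabricksReposDeleteOperator

    delete_repo = DatabricksReposDeleteOperator(
        task_id="delete_repo",
        repo_path="/Repos/user@example.com/demo-repo",
    )
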
@@ -60,7 +60,7 @@ Updating Databricks Repo by specifying path

An example usage of the DatabricksReposUpdateOperator is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_repos.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_repos.py
    :language: python
    :start-after: [START howto_operator_databricks_repo_update]
    :end-before: [END howto_operator_databricks_repo_update]
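
A minimal sketch that checks out a branch (path and branch are placeholders):

.. code-block:: python

    from airflow.providers.databricks.operators.databricks_repos import DatabricksReposUpdateOperator

    update_repo = DatabricksReposUpdateOperator(
        task_id="update_repo",
        repo_path="/Repos/user@example.com/demo-repo",
        branch="main",
    )
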
@@ -49,7 +49,7 @@ Selecting data

An example usage of the DatabricksSqlOperator to select data from a table is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sql.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sql.py
    :language: python
    :start-after: [START howto_operator_databricks_sql_select]
    :end-before: [END howto_operator_databricks_sql_select]
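
A minimal sketch of a plain SELECT (the connection id, endpoint, and table are placeholders):

.. code-block:: python

    from airflow.providers.databricks.operators.databricks_sql import DatabricksSqlOperator

    select_data = DatabricksSqlOperator(
        task_id="select_data",
        databricks_conn_id="databricks_default",
        sql_endpoint_name="my_endpoint",
        sql="SELECT * FROM default.my_table",
    )
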
@@ -59,7 +59,7 @@ Selecting data into a file

An example usage of the DatabricksSqlOperator to select data from a table and store in a file is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sql.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sql.py
    :language: python
    :start-after: [START howto_operator_databricks_sql_select_file]
    :end-before: [END howto_operator_databricks_sql_select_file]
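
The same operator can write results to a local file via ``output_path``/``output_format``; a hedged sketch with placeholder values:

.. code-block:: python

    from airflow.providers.databricks.operators.databricks_sql import DatabricksSqlOperator

    select_into_file = DatabricksSqlOperator(
        task_id="select_data_into_file",
        sql_endpoint_name="my_endpoint",
        sql="SELECT * FROM default.my_table",
        output_path="/tmp/selected_data.csv",
        output_format="csv",
    )
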
@@ -69,7 +69,7 @@ Executing multiple statements

An example usage of the DatabricksSqlOperator to perform multiple SQL statements is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sql.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sql.py
    :language: python
    :start-after: [START howto_operator_databricks_sql_multiple]
    :end-before: [END howto_operator_databricks_sql_multiple]
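
``sql`` also accepts a list of statements, executed in order; a sketch with placeholder DDL/DML:

.. code-block:: python

    from airflow.providers.databricks.operators.databricks_sql import DatabricksSqlOperator

    multiple_statements = DatabricksSqlOperator(
        task_id="multiple_statements",
        sql_endpoint_name="my_endpoint",
        sql=[
            "CREATE TABLE IF NOT EXISTS default.my_table (id INT)",
            "INSERT INTO default.my_table VALUES (1)",
        ],
    )
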
@@ -80,7 +80,7 @@ Executing multiple statements from a file

An example usage of the DatabricksSqlOperator to perform statements from a file is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sql.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sql.py
    :language: python
    :start-after: [START howto_operator_databricks_sql_multiple_file]
    :end-before: [END howto_operator_databricks_sql_multiple_file]
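
Passing a ``.sql`` file name makes the operator load the statements from that (templated) file; the file name here is a placeholder:

.. code-block:: python

    from airflow.providers.databricks.operators.databricks_sql import DatabricksSqlOperator

    statements_from_file = DatabricksSqlOperator(
        task_id="statements_from_file",
        sql_endpoint_name="my_endpoint",
        sql="example.sql",
    )
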
@@ -107,15 +107,15 @@ Examples
--------
Configuring Databricks connection to be used with the Sensor.

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sensors.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sensors.py
    :language: python
    :dedent: 4
    :start-after: [START howto_sensor_databricks_connection_setup]
    :end-before: [END howto_sensor_databricks_connection_setup]

Poking the specific table with the SQL statement:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sensors.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sensors.py
    :language: python
    :dedent: 4
    :start-after: [START howto_sensor_databricks_sql]
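
A minimal sketch of the sensor itself (warehouse, catalog, schema, and query are placeholders):

.. code-block:: python

    from airflow.providers.databricks.sensors.databricks_sql import DatabricksSqlSensor

    wait_for_rows = DatabricksSqlSensor(
        task_id="wait_for_rows",
        databricks_conn_id="databricks_default",
        sql_warehouse_name="my_warehouse",
        catalog="hive_metastore",
        schema="default",
        sql="SELECT 1 FROM my_table LIMIT 1",
    )
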
@@ -154,15 +154,15 @@ Examples
--------
Configuring Databricks connection to be used with the Sensor.

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sensors.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sensors.py
    :language: python
    :dedent: 4
    :start-after: [START howto_sensor_databricks_connection_setup]
    :end-before: [END howto_sensor_databricks_connection_setup]

Poking the specific table for existence of data/partition:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sensors.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sensors.py
    :language: python
    :dedent: 4
    :start-after: [START howto_sensor_databricks_partition]
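
A minimal sketch of a partition check (table and partition values are placeholders):

.. code-block:: python

    from airflow.providers.databricks.sensors.databricks_partition import DatabricksPartitionSensor

    wait_for_partition = DatabricksPartitionSensor(
        task_id="wait_for_partition",
        databricks_conn_id="databricks_default",
        sql_warehouse_name="my_warehouse",
        catalog="hive_metastore",
        schema="default",
        table_name="my_table",
        partitions={"date": "2025-01-01"},
        partition_operator="=",
    )
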
@@ -114,7 +114,7 @@ Specifying parameters as JSON

An example usage of the DatabricksSubmitRunOperator is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
    :language: python
    :start-after: [START howto_operator_databricks_json]
    :end-before: [END howto_operator_databricks_json]
@@ -124,7 +124,7 @@ Using named parameters

You can also use named parameters to initialize the operator and run the job.

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
    :language: python
    :start-after: [START howto_operator_databricks_named]
    :end-before: [END howto_operator_databricks_named]
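
A hedged sketch of the named-parameter form (cluster spec and notebook path are placeholders):

.. code-block:: python

    from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator

    notebook_run = DatabricksSubmitRunOperator(
        task_id="notebook_run",
        new_cluster={
            "spark_version": "15.4.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 2,
        },
        notebook_task={"notebook_path": "/Users/user@example.com/my-notebook"},
    )
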
@@ -33,21 +33,21 @@ Examples

Running a notebook in Databricks using DatabricksTaskOperator
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
    :language: python
    :start-after: [START howto_operator_databricks_task_notebook]
    :end-before: [END howto_operator_databricks_task_notebook]

Running a SQL query in Databricks using DatabricksTaskOperator
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
    :language: python
    :start-after: [START howto_operator_databricks_task_sql]
    :end-before: [END howto_operator_databricks_task_sql]

Running a Python file in Databricks using DatabricksTaskOperator
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
    :language: python
    :start-after: [START howto_operator_databricks_task_python]
    :end-before: [END howto_operator_databricks_task_python]
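
A minimal sketch of a notebook task expressed through ``task_config`` (all values are placeholders):

.. code-block:: python

    from airflow.providers.databricks.operators.databricks import DatabricksTaskOperator

    notebook_task = DatabricksTaskOperator(
        task_id="notebook_task",
        databricks_conn_id="databricks_default",
        task_config={
            "notebook_task": {
                "notebook_path": "/Shared/example_notebook",
                "source": "WORKSPACE",
            },
            "new_cluster": {
                "spark_version": "15.4.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 1,
            },
        },
    )
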
@@ -45,7 +45,7 @@ Examples

Example of what a DAG looks like with a DatabricksWorkflowTaskGroup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_workflow.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_workflow.py
    :language: python
    :start-after: [START howto_databricks_workflow_notebook]
    :end-before: [END howto_databricks_workflow_notebook]
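
A hedged sketch of a one-task workflow group (cluster spec and notebook path are placeholders):

.. code-block:: python

    from airflow.providers.databricks.operators.databricks import DatabricksNotebookOperator
    from airflow.providers.databricks.operators.databricks_workflow import DatabricksWorkflowTaskGroup

    with DatabricksWorkflowTaskGroup(
        group_id="databricks_workflow",
        databricks_conn_id="databricks_default",
        job_clusters=[
            {
                "job_cluster_key": "workflow_cluster",
                "new_cluster": {
                    "spark_version": "15.4.x-scala2.12",
                    "node_type_id": "i3.xlarge",
                    "num_workers": 1,
                },
            }
        ],
    ) as workflow:
        notebook_in_workflow = DatabricksNotebookOperator(
            task_id="notebook_in_workflow",
            notebook_path="/Shared/example_notebook",
            source="WORKSPACE",
            job_cluster_key="workflow_cluster",
        )
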
File renamed without changes.
File renamed without changes.
