Move databricks provider to new structure #46207

Status: Open · wants to merge 2 commits into main
5 changes: 1 addition & 4 deletions .github/boring-cyborg.yml
@@ -132,10 +132,7 @@ labelPRBasedOnFilePath:
   - providers/standard/**

 provider:databricks:
-  - providers/src/airflow/providers/databricks/**/*
-  - docs/apache-airflow-providers-databricks/**/*
-  - providers/tests/databricks/**/*
-  - providers/tests/system/databricks/**/*
+  - providers/databricks/**

 provider:datadog:
   - providers/datadog/**
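
The single ``providers/databricks/**`` glob covers everything the four removed patterns matched, because code, docs, and tests now all live under one provider root. A quick way to sanity-check that claim (a sketch assuming the third-party ``wcmatch`` library; boring-cyborg's own matcher may differ in details):

.. code-block:: python

    from wcmatch import glob

    # Representative paths from the new provider layout; each should match
    # the single consolidated pattern.
    new_layout_paths = [
        "providers/databricks/src/airflow/providers/databricks/hooks/databricks.py",
        "providers/databricks/docs/operators/sql.rst",
        "providers/databricks/tests/system/databricks/example_databricks.py",
    ]
    for path in new_layout_paths:
        assert glob.globmatch(path, "providers/databricks/**", flags=glob.GLOBSTAR)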
4 changes: 4 additions & 0 deletions dev/moving_providers/move_providers.py
@@ -362,10 +362,13 @@ def move_provider_yaml(provider_id: str) -> tuple[list[str], list[str], list[str
     dependencies = []
     optional_dependencies = []
     devel_dependencies = []
+    copied_logo = set()
Review comment from @josix (Contributor, Author), Jan 28, 2025:

    Fix the FileNotFoundError that occurs when attempting to delete an already-moved logo in the integrations section, where several integrations can reference the same logo file.

     for line in original_content:
         if line.startswith(" logo: "):
             logo_path = line[len(" logo: ") :]
             logo_name = logo_path.split("/")[-1]
+            if logo_path in copied_logo:
+                continue
             new_logo_dir = (
                 PROVIDERS_DIR_PATH / _get_provider_only_path(provider_id) / "docs" / "integration-logos"
             )
@@ -378,6 +381,7 @@
                 remove_empty_parent_dir=True,
             )
             line = f" logo: /docs/integration-logos/{logo_name}"
+            copied_logo.add(logo_path)
         if line == "dependencies:" and not in_dependencies:
             in_dependencies = True
             continue
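
In essence, the fix is a copy-once guard. A standalone sketch of the pattern (``copy_logos_once`` and its arguments are hypothetical names, not part of the actual script): without the set, a logo referenced by several integrations in ``provider.yaml`` would be moved on its first occurrence, and the loop would fail with ``FileNotFoundError`` when it reached the second reference.

.. code-block:: python

    import shutil
    from pathlib import Path


    def copy_logos_once(logo_paths: list[str], src_root: Path, dest_dir: Path) -> None:
        """Move each referenced logo exactly once, even when several
        integrations point at the same file."""
        copied: set[str] = set()
        dest_dir.mkdir(parents=True, exist_ok=True)
        for logo_path in logo_paths:
            if logo_path in copied:
                # The source file is already gone after the first move;
                # moving it again would raise FileNotFoundError.
                continue
            src = src_root / logo_path.lstrip("/")
            shutil.move(src, dest_dir / src.name)
            copied.add(logo_path)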
1 change: 1 addition & 0 deletions docs/.gitignore
@@ -21,6 +21,7 @@
 apache-airflow-providers-cohere
 apache-airflow-providers-common-compat
 apache-airflow-providers-common-io
 apache-airflow-providers-common-sql
+apache-airflow-providers-databricks
 apache-airflow-providers-datadog
 apache-airflow-providers-dbt-cloud
 apache-airflow-providers-discord
25 changes: 0 additions & 25 deletions docs/apache-airflow-providers-databricks/changelog.rst

This file was deleted.

87 changes: 87 additions & 0 deletions providers/databricks/README.rst
@@ -0,0 +1,87 @@

.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.

.. NOTE! THIS FILE IS AUTOMATICALLY GENERATED AND WILL BE OVERWRITTEN!

.. IF YOU WANT TO MODIFY TEMPLATE FOR THIS FILE, YOU SHOULD MODIFY THE TEMPLATE
`PROVIDER_README_TEMPLATE.rst.jinja2` IN the `dev/breeze/src/airflow_breeze/templates` DIRECTORY


Package ``apache-airflow-providers-databricks``

Release: ``7.0.0``


`Databricks <https://databricks.com/>`__


Provider package
----------------

This is a provider package for ``databricks`` provider. All classes for this provider package
are in ``airflow.providers.databricks`` python package.

You can find package information and changelog for the provider
in the `documentation <https://airflow.apache.org/docs/apache-airflow-providers-databricks/7.0.0/>`_.

Installation
------------

You can install this package on top of an existing Airflow 2 installation (see ``Requirements`` below
for the minimum Airflow version supported) via
``pip install apache-airflow-providers-databricks``

The package supports the following python versions: 3.9,3.10,3.11,3.12

Requirements
------------

======================================= ==================
PIP package Version required
======================================= ==================
``apache-airflow`` ``>=2.9.0``
``apache-airflow-providers-common-sql`` ``>=1.20.0``
``requests`` ``>=2.27.0,<3``
``databricks-sql-connector`` ``>=3.0.0``
``aiohttp`` ``>=3.9.2,<4``
``mergedeep`` ``>=1.3.4``
``pandas`` ``>=2.1.2,<2.2``
``pyarrow`` ``>=14.0.1``
======================================= ==================

Cross provider package dependencies
-----------------------------------

Those are dependencies that might be needed in order to use all the features of the package.
You need to install the specified provider packages in order to use them.

You can install such cross-provider dependencies when installing from PyPI. For example:

.. code-block:: bash

pip install apache-airflow-providers-databricks[common.sql]


============================================================================================================ ==============
Dependent package Extra
============================================================================================================ ==============
`apache-airflow-providers-common-sql <https://airflow.apache.org/docs/apache-airflow-providers-common-sql>`_ ``common.sql``
============================================================================================================ ==============

The changelog for the provider package can be found in the
`changelog <https://airflow.apache.org/docs/apache-airflow-providers-databricks/7.0.0/changelog.html>`_.
@@ -139,7 +139,7 @@ Features
 Misc
 ~~~~

-* ``Removed deprecated method referance airflow.www.auth.has_access when min airflow version >= 2.8.0 (#41747)``
+* ``Removed deprecated method reference airflow.www.auth.has_access when min airflow version >= 2.8.0 (#41747)``
 * ``remove deprecated soft_fail from providers (#41710)``

 6.9.0

@@ -451,7 +451,7 @@ Misc
 Features
 ~~~~~~~~

-* ``Add "QUEUED" to RUN_LIFE_CYCLE_STATES following deployement of … (#33886)``
+* ``Add "QUEUED" to RUN_LIFE_CYCLE_STATES following deployment of … (#33886)``
 * ``allow DatabricksSubmitRunOperator to accept a pipeline name for a pipeline_task (#32903)``

 Misc
@@ -46,7 +46,7 @@ Importing CSV data

 An example usage of the DatabricksCopyIntoOperator to import CSV data into a table is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sql.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sql.py
     :language: python
     :start-after: [START howto_operator_databricks_copy_into]
     :end-before: [END howto_operator_databricks_copy_into]
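
Every documentation change from here on follows the same mechanical pattern: the path in the ``exampleinclude`` directive moves from ``providers/tests/system/...`` to ``providers/databricks/tests/system/...`` while the rendered snippet is untouched. For context, the directive includes only the lines between comment markers matching ``:start-after:`` and ``:end-before:`` in the referenced file, roughly like this hypothetical excerpt (operator arguments are illustrative, not taken from the real example file):

.. code-block:: python

    from airflow.providers.databricks.operators.databricks_sql import DatabricksCopyIntoOperator

    # [START howto_operator_databricks_copy_into]
    import_csv = DatabricksCopyIntoOperator(
        task_id="import_csv",
        databricks_conn_id="databricks_default",
        table_name="my_table",
        file_format="CSV",
        file_location="/Volumes/my_catalog/my_schema/landing/",
    )
    # [END howto_operator_databricks_copy_into]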
@@ -67,7 +67,7 @@ Specifying parameters as JSON

 An example usage of the DatabricksCreateJobsOperator is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
     :language: python
     :start-after: [START howto_operator_databricks_jobs_create_json]
     :end-before: [END howto_operator_databricks_jobs_create_json]
@@ -77,7 +77,7 @@ Using named parameters

 You can also use named parameters to initialize the operator and run the job.

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
     :language: python
     :start-after: [START howto_operator_databricks_jobs_create_named]
     :end-before: [END howto_operator_databricks_jobs_create_named]
@@ -88,7 +88,7 @@ Pairing with DatabricksRunNowOperator

 You can use the ``job_id`` that is returned by the DatabricksCreateJobsOperator in the
 return_value XCom as an argument to the DatabricksRunNowOperator to run the job.

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
     :language: python
     :start-after: [START howto_operator_databricks_run_now]
     :end-before: [END howto_operator_databricks_run_now]
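
A minimal sketch of that pairing (the job spec is a placeholder; the real snippet lives in the ``example_databricks.py`` file referenced above). ``DatabricksCreateJobsOperator`` pushes the created ``job_id`` to its ``return_value`` XCom, and the templated ``job_id`` field of ``DatabricksRunNowOperator`` pulls it at runtime:

.. code-block:: python

    from airflow.providers.databricks.operators.databricks import (
        DatabricksCreateJobsOperator,
        DatabricksRunNowOperator,
    )

    job_spec = {  # placeholder job definition
        "name": "example-job",
        "tasks": [
            {
                "task_key": "notebook_task",
                "notebook_task": {"notebook_path": "/Shared/example"},
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",
                    "node_type_id": "i3.xlarge",
                    "num_workers": 1,
                },
            }
        ],
    }

    create_job = DatabricksCreateJobsOperator(task_id="create_job", json=job_spec)
    run_job = DatabricksRunNowOperator(
        task_id="run_job",
        # Pull the job_id that create_job pushed to its return_value XCom.
        job_id="{{ ti.xcom_pull(task_ids='create_job') }}",
    )
    create_job >> run_job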
@@ -31,14 +31,14 @@ Examples

 Running a notebook in Databricks on a new cluster
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
     :language: python
     :start-after: [START howto_operator_databricks_notebook_new_cluster]
     :end-before: [END howto_operator_databricks_notebook_new_cluster]

 Running a notebook in Databricks on an existing cluster
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
     :language: python
     :start-after: [START howto_operator_databricks_notebook_existing_cluster]
     :end-before: [END howto_operator_databricks_notebook_existing_cluster]
@@ -63,7 +63,7 @@ Create a Databricks Repo

 An example usage of the DatabricksReposCreateOperator is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_repos.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_repos.py
     :language: python
     :start-after: [START howto_operator_databricks_repo_create]
     :end-before: [END howto_operator_databricks_repo_create]
@@ -55,7 +55,7 @@ Deleting Databricks Repo by specifying path

 An example usage of the DatabricksReposDeleteOperator is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_repos.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_repos.py
     :language: python
     :start-after: [START howto_operator_databricks_repo_delete]
     :end-before: [END howto_operator_databricks_repo_delete]
@@ -60,7 +60,7 @@ Updating Databricks Repo by specifying path

 An example usage of the DatabricksReposUpdateOperator is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_repos.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_repos.py
     :language: python
     :start-after: [START howto_operator_databricks_repo_update]
     :end-before: [END howto_operator_databricks_repo_update]
@@ -49,7 +49,7 @@ Selecting data

 An example usage of the DatabricksSqlOperator to select data from a table is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sql.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sql.py
     :language: python
     :start-after: [START howto_operator_databricks_sql_select]
     :end-before: [END howto_operator_databricks_sql_select]
@@ -59,7 +59,7 @@ Selecting data into a file

 An example usage of the DatabricksSqlOperator to select data from a table and store in a file is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sql.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sql.py
     :language: python
     :start-after: [START howto_operator_databricks_sql_select_file]
     :end-before: [END howto_operator_databricks_sql_select_file]
@@ -69,7 +69,7 @@ Executing multiple statements

 An example usage of the DatabricksSqlOperator to perform multiple SQL statements is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sql.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sql.py
     :language: python
     :start-after: [START howto_operator_databricks_sql_multiple]
     :end-before: [END howto_operator_databricks_sql_multiple]
@@ -80,7 +80,7 @@ Executing multiple statements from a file

 An example usage of the DatabricksSqlOperator to perform statements from a file is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sql.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sql.py
     :language: python
     :start-after: [START howto_operator_databricks_sql_multiple_file]
     :end-before: [END howto_operator_databricks_sql_multiple_file]
@@ -107,15 +107,15 @@ Examples
 --------
 Configuring Databricks connection to be used with the Sensor.

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sensors.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sensors.py
     :language: python
     :dedent: 4
     :start-after: [START howto_sensor_databricks_connection_setup]
     :end-before: [END howto_sensor_databricks_connection_setup]

 Poking the specific table with the SQL statement:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sensors.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sensors.py
     :language: python
     :dedent: 4
     :start-after: [START howto_sensor_databricks_sql]
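
Since the included snippet isn't rendered on this page, the gist of a SQL-poking sensor looks roughly like this (a sketch; connection ID, warehouse name, and query are illustrative):

.. code-block:: python

    from airflow.providers.databricks.sensors.databricks_sql import DatabricksSqlSensor

    wait_for_rows = DatabricksSqlSensor(
        task_id="wait_for_rows",
        databricks_conn_id="databricks_default",
        sql_warehouse_name="example_warehouse",
        # The sensor succeeds once the query returns at least one row.
        sql="SELECT 1 FROM my_catalog.my_schema.my_table LIMIT 1",
    )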
@@ -154,15 +154,15 @@ Examples
 --------
 Configuring Databricks connection to be used with the Sensor.

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sensors.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sensors.py
     :language: python
     :dedent: 4
     :start-after: [START howto_sensor_databricks_connection_setup]
     :end-before: [END howto_sensor_databricks_connection_setup]

 Poking the specific table for existence of data/partition:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_sensors.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_sensors.py
     :language: python
     :dedent: 4
     :start-after: [START howto_sensor_databricks_partition]
@@ -114,7 +114,7 @@ Specifying parameters as JSON

 An example usage of the DatabricksSubmitRunOperator is as follows:

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
     :language: python
     :start-after: [START howto_operator_databricks_json]
     :end-before: [END howto_operator_databricks_json]
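
For readers without the example file at hand, the JSON form is roughly the following (a sketch; cluster and notebook values are illustrative):

.. code-block:: python

    from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator

    notebook_run = DatabricksSubmitRunOperator(
        task_id="notebook_run",
        json={
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
            "notebook_task": {"notebook_path": "/Users/user@example.com/PrepareData"},
        },
    )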
@@ -124,7 +124,7 @@ Using named parameters

 You can also use named parameters to initialize the operator and run the job.

-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
     :language: python
     :start-after: [START howto_operator_databricks_named]
     :end-before: [END howto_operator_databricks_named]
@@ -33,21 +33,21 @@ Examples

 Running a notebook in Databricks using DatabricksTaskOperator
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
     :language: python
     :start-after: [START howto_operator_databricks_task_notebook]
     :end-before: [END howto_operator_databricks_task_notebook]

 Running a SQL query in Databricks using DatabricksTaskOperator
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
     :language: python
     :start-after: [START howto_operator_databricks_task_sql]
     :end-before: [END howto_operator_databricks_task_sql]

 Running a python file in Databricks using DatabricksTaskOperator
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks.py
     :language: python
     :start-after: [START howto_operator_databricks_task_python]
     :end-before: [END howto_operator_databricks_task_python]
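
The shape of a ``DatabricksTaskOperator`` call is roughly the following (a hedged sketch; the ``task_config`` keys mirror the Databricks Jobs API task spec, and all values here are illustrative rather than taken from the real example file):

.. code-block:: python

    from airflow.providers.databricks.operators.databricks import DatabricksTaskOperator

    notebook_task = DatabricksTaskOperator(
        task_id="notebook_task",
        databricks_conn_id="databricks_default",
        task_config={
            "notebook_task": {
                "notebook_path": "/Shared/example",
                "source": "WORKSPACE",
            }
        },
        new_cluster={
            "spark_version": "13.3.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 1,
        },
    )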
@@ -45,7 +45,7 @@ Examples

 Example of what a DAG looks like with a DatabricksWorkflowTaskGroup
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. exampleinclude:: /../../providers/tests/system/databricks/example_databricks_workflow.py
+.. exampleinclude:: /../../providers/databricks/tests/system/databricks/example_databricks_workflow.py
     :language: python
     :start-after: [START howto_databricks_workflow_notebook]
     :end-before: [END howto_databricks_workflow_notebook]
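
For readers without the example file at hand, such a DAG looks roughly like this (a sketch; cluster and notebook settings are illustrative, and the group must sit inside a DAG context). Tasks created inside the group are executed as a single Databricks workflow job rather than as separate run submissions:

.. code-block:: python

    from airflow.providers.databricks.operators.databricks import DatabricksNotebookOperator
    from airflow.providers.databricks.operators.databricks_workflow import DatabricksWorkflowTaskGroup

    job_clusters = [
        {
            "job_cluster_key": "workflow_cluster",
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 1,
            },
        }
    ]

    # Assumes an enclosing `with DAG(...)` context.
    with DatabricksWorkflowTaskGroup(
        group_id="example_workflow",
        databricks_conn_id="databricks_default",
        job_clusters=job_clusters,
    ) as workflow:
        notebook_task = DatabricksNotebookOperator(
            task_id="notebook_task",
            databricks_conn_id="databricks_default",
            notebook_path="/Shared/example_notebook",
            source="WORKSPACE",
            job_cluster_key="workflow_cluster",
        )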