4 changes: 2 additions & 2 deletions dev/breeze/tests/test_selective_checks.py
@@ -2316,7 +2316,7 @@ def test_expected_output_push(
),
{
"selected-providers-list-as-string": "amazon common.compat common.io common.sql "
-"databricks dbt.cloud ftp google microsoft.mssql mysql "
+"databricks dbt.cloud ftp google microsoft.azure microsoft.mssql mysql "
"openlineage oracle postgres sftp snowflake standard trino",
"all-python-versions": f"['{DEFAULT_PYTHON_MAJOR_MINOR_VERSION}']",
"all-python-versions-list-as-string": DEFAULT_PYTHON_MAJOR_MINOR_VERSION,
@@ -2335,7 +2335,7 @@ def test_expected_output_push(
{
"description": "amazon...standard",
"test_types": "Providers[amazon] Providers[common.compat,common.io,common.sql,"
-"databricks,dbt.cloud,ftp,microsoft.mssql,mysql,openlineage,oracle,"
+"databricks,dbt.cloud,ftp,microsoft.azure,microsoft.mssql,mysql,openlineage,oracle,"
"postgres,sftp,snowflake,trino] Providers[google] Providers[standard]",
}
]
2 changes: 2 additions & 0 deletions providers/microsoft/azure/docs/index.rst
@@ -159,6 +159,8 @@ Dependent package
`apache-airflow-providers-amazon <https://airflow.apache.org/docs/apache-airflow-providers-amazon>`_ ``amazon``
`apache-airflow-providers-common-compat <https://airflow.apache.org/docs/apache-airflow-providers-common-compat>`_ ``common.compat``
`apache-airflow-providers-common-messaging <https://airflow.apache.org/docs/apache-airflow-providers-common-messaging>`_ ``common.messaging``
`apache-airflow-providers-google <https://airflow.apache.org/docs/apache-airflow-providers-google>`_ ``google``
`apache-airflow-providers-openlineage <https://airflow.apache.org/docs/apache-airflow-providers-openlineage>`_ ``openlineage``
`apache-airflow-providers-oracle <https://airflow.apache.org/docs/apache-airflow-providers-oracle>`_ ``oracle``
`apache-airflow-providers-sftp <https://airflow.apache.org/docs/apache-airflow-providers-sftp>`_ ``sftp``
======================================================================================================================== ====================
69 changes: 69 additions & 0 deletions providers/microsoft/azure/docs/transfer/gcs_to_wasb.rst
@@ -0,0 +1,69 @@

.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements. See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership. The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License. You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied. See the License for the
   specific language governing permissions and limitations
   under the License.

=====================================================
Google Cloud Storage to Azure Blob Storage transfer
=====================================================

`Google Cloud Storage <https://cloud.google.com/storage/>`__ and
`Azure Blob Storage <https://learn.microsoft.com/en-us/azure/storage/blobs/>`__
are object stores commonly used for data lakes and file exchange.
This guide describes copying objects from GCS into an Azure Blob container.

This operator requires the Google provider, which is available through the ``google`` extra:

.. code-block:: bash

    pip install 'apache-airflow-providers-microsoft-azure[google]'
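
As a quick sanity check after installation, the sketch below tests whether the Google provider package can be found on the import path. The helper name ``is_module_available`` is hypothetical and not part of any provider; ``airflow.providers.google`` is the package installed by ``apache-airflow-providers-google``.

```python
from importlib.util import find_spec


def is_module_available(module_name):
    """Return True if module_name can be found on the import path."""
    try:
        return find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False


# Hypothetical pre-flight check; the message text is illustrative only.
if not is_module_available("airflow.providers.google"):
    print("Google provider not found; install the [google] extra first.")
```

A check like this is optional; without the extra, importing the operator module would be expected to fail at DAG parse time anyway.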

Prerequisite Tasks
------------------

.. include:: ../operators/_partials/prerequisite_tasks.rst

.. _howto/operator:GCSToAzureBlobStorageOperator:

Operator
--------

Use :class:`~airflow.providers.microsoft.azure.transfers.gcs_to_wasb.GCSToAzureBlobStorageOperator`
to list objects under a GCS ``prefix`` and upload them to a container using ``blob_prefix`` as the base path.
``keep_directory_structure`` and ``flatten_structure`` behave as they do in
:class:`~airflow.providers.amazon.aws.transfers.gcs_to_s3.GCSToS3Operator`; when both apply, ``flatten_structure`` takes precedence.
Objects whose keys end with ``/`` (the folder markers created by the GCS console) are not copied.

Example:

.. code-block:: python

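    from airflow.providers.microsoft.azure.transfers.gcs_to_wasb import GCSToAzureBlobStorageOperator

    # Copy objects under exports/daily/ in the GCS bucket into the Azure container.
    copy_gcs_to_azure = GCSToAzureBlobStorageOperator(
        task_id="gcs_to_azure_blob",
        gcs_bucket="my-gcs-bucket",
        prefix="exports/daily/",
        container_name="my-container",
        blob_prefix="imports/daily",
        gcp_conn_id="google_cloud_default",
        wasb_conn_id="wasb_default",
        replace=True,
    )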
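
The name mapping described in the Operator section can be sketched in plain Python. This is a simplified illustration, not the operator's actual code: the helper ``destination_blob_name`` is hypothetical, and it assumes that the full object key is appended under ``blob_prefix`` unless ``flatten_structure`` is set, in which case only the file name is kept. The exact effect of ``keep_directory_structure`` is not modelled here.

```python
import posixpath


def destination_blob_name(object_key, blob_prefix, flatten_structure=False):
    """Return the target blob name for a GCS object key, or None if skipped.

    Hypothetical helper for illustration only; not part of the provider.
    """
    # Keys ending with "/" are folder markers and are not copied.
    if object_key.endswith("/"):
        return None
    # Assumption: flattening keeps only the file name; otherwise the full key is kept.
    relative = posixpath.basename(object_key) if flatten_structure else object_key
    # Join with "/" regardless of the host OS, since blob names use "/".
    return posixpath.join(blob_prefix, relative) if blob_prefix else relative


key = "exports/daily/2024/part-0.csv"
print(destination_blob_name(key, "imports/daily"))
print(destination_blob_name(key, "imports/daily", flatten_structure=True))
print(destination_blob_name("exports/daily/", "imports/daily"))
```

Under these assumptions the flattened call yields ``imports/daily/part-0.csv`` and the folder marker yields ``None``; verify the real naming against the operator's own documentation before relying on it.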

Reference
---------

* `Google Cloud Storage Python client <https://cloud.google.com/python/docs/reference/storage/latest>`__
* `Azure Blob Storage client library <https://learn.microsoft.com/en-us/python/api/overview/azure/storage-blob-readme?view=azure-python>`__
4 changes: 4 additions & 0 deletions providers/microsoft/azure/provider.yaml
@@ -357,6 +357,10 @@ transfers:
target-integration-name: Microsoft Azure Blob Storage
how-to-guide: /docs/apache-airflow-providers-microsoft-azure/transfer/s3_to_wasb.rst
python-module: airflow.providers.microsoft.azure.transfers.s3_to_wasb
- source-integration-name: Google Cloud Storage (GCS)
target-integration-name: Microsoft Azure Blob Storage
how-to-guide: /docs/apache-airflow-providers-microsoft-azure/transfer/gcs_to_wasb.rst
python-module: airflow.providers.microsoft.azure.transfers.gcs_to_wasb


connection-types:
8 changes: 8 additions & 0 deletions providers/microsoft/azure/pyproject.toml
@@ -114,6 +114,12 @@ dependencies = [
"common.messaging" = [
"apache-airflow-providers-common-messaging>=2.0.0"
]
"google" = [
"apache-airflow-providers-google"
]
"openlineage" = [
"apache-airflow-providers-openlineage>=2.3.0"
]

[dependency-groups]
dev = [
@@ -123,6 +129,8 @@ dev = [
"apache-airflow-providers-amazon",
"apache-airflow-providers-common-compat",
"apache-airflow-providers-common-messaging",
"apache-airflow-providers-google",
"apache-airflow-providers-openlineage",
"apache-airflow-providers-oracle",
"apache-airflow-providers-sftp",
# Additional devel dependencies (do not remove this line and add extra development dependencies)
@@ -348,6 +348,12 @@ def get_provider_info():
"how-to-guide": "/docs/apache-airflow-providers-microsoft-azure/transfer/s3_to_wasb.rst",
"python-module": "airflow.providers.microsoft.azure.transfers.s3_to_wasb",
},
{
"source-integration-name": "Google Cloud Storage (GCS)",
"target-integration-name": "Microsoft Azure Blob Storage",
"how-to-guide": "/docs/apache-airflow-providers-microsoft-azure/transfer/gcs_to_wasb.rst",
"python-module": "airflow.providers.microsoft.azure.transfers.gcs_to_wasb",
},
],
"connection-types": [
{