-
Notifications
You must be signed in to change notification settings - Fork 3.3k
feat(databricks): adds Azure oauth to Databricks #15117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
5321f48
3f4bf5f
30445ea
536b9ab
dac502c
162cd0e
858f5ed
7611608
aad4b29
37107c5
672c054
5e6b452
3d9a1bd
0eeee06
d80420f
043e4a5
870c7ff
84d31de
74f4769
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| from pydantic import Field, SecretStr | ||
|
|
||
| from datahub.configuration import ConfigModel | ||
|
|
||
|
|
||
| class AzureAuthConfig(ConfigModel): | ||
| client_secret: SecretStr = Field( | ||
| description="Azure application client secret used for authentication. This is a confidential credential that should be kept secure." | ||
| ) | ||
| client_id: str = Field( | ||
| description="Azure application (client) ID. This is the unique identifier for the registered Azure AD application.", | ||
| ) | ||
| tenant_id: str = Field( | ||
| description="Azure tenant (directory) ID. This identifies the Azure AD tenant where the application is registered.", | ||
| ) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -44,6 +44,7 @@ | |
| from datahub._version import nice_version_name | ||
| from datahub.api.entities.external.unity_catalog_external_entites import UnityCatalogTag | ||
| from datahub.emitter.mce_builder import parse_ts_millis | ||
| from datahub.ingestion.source.unity.azure_auth_config import AzureAuthConfig | ||
| from datahub.ingestion.source.unity.config import ( | ||
| LineageDataSource, | ||
| UsageDataSource, | ||
|
|
@@ -169,20 +170,31 @@ class UnityCatalogApiProxy(UnityCatalogProxyProfilingMixin): | |
| def __init__( | ||
| self, | ||
| workspace_url: str, | ||
| personal_access_token: str, | ||
| warehouse_id: Optional[str], | ||
| report: UnityCatalogReport, | ||
| hive_metastore_proxy: Optional[HiveMetastoreProxy] = None, | ||
| lineage_data_source: LineageDataSource = LineageDataSource.AUTO, | ||
| usage_data_source: UsageDataSource = UsageDataSource.AUTO, | ||
| databricks_api_page_size: int = 0, | ||
| personal_access_token: Optional[str] = None, | ||
| azure_auth: Optional[AzureAuthConfig] = None, | ||
| ): | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. validation required -- if azure_auth is provided, all three fields (client_id, client_secret, tenant_id) must be present
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The fields defined in metadata-ingestion/src/datahub/ingestion/source/unity/azure_auth_config.py are neither Optional nor have an default value, i believe it would make these fields mandatory in cases azure auth is provided.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also added the unit test cases for the same |
||
| self._workspace_client = WorkspaceClient( | ||
| host=workspace_url, | ||
| token=personal_access_token, | ||
| product="datahub", | ||
| product_version=nice_version_name(), | ||
| ) | ||
| if azure_auth: | ||
| self._workspace_client = WorkspaceClient( | ||
| host=workspace_url, | ||
| azure_tenant_id=azure_auth.tenant_id, | ||
| azure_client_id=azure_auth.client_id, | ||
| azure_client_secret=azure_auth.client_secret.get_secret_value(), | ||
| product="datahub", | ||
| product_version=nice_version_name(), | ||
| ) | ||
| else: | ||
| self._workspace_client = WorkspaceClient( | ||
| host=workspace_url, | ||
| token=personal_access_token, | ||
| product="datahub", | ||
| product_version=nice_version_name(), | ||
| ) | ||
| self.warehouse_id = warehouse_id or "" | ||
| self.report = report | ||
| self.hive_metastore_proxy = hive_metastore_proxy | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With both being optional, could you add a pydantic validator that one or the other is required? that is, raise a validation error if both are none.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added validator in metadata-ingestion/src/datahub/ingestion/source/unity/config.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool
I have just merged a PR that prevents usage of pydantic v1 validators
so heads up you will need to update your
pydantic.root_validatortomodel_validatorThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the heads up. Have updated the root_validator with model_validator