Fix DatabricksSqlHook sqlalchemy_url to include http_path from connection extra#69037
Fix DatabricksSqlHook sqlalchemy_url to include http_path from connection extra#69037Subham-KRLX wants to merge 1 commit into
Conversation
… from connection extra
moomindani
left a comment
There was a problem hiding this comment.
Thanks — centralizing the resolution into _resolve_http_path() and reusing it from both get_conn() and sqlalchemy_url is the right direction, and narrowing AirflowException to ValueError matches the repo's current exception guidelines. The connection-extra case (the one #69031 reports) works and is covered by your tests.
However, there is a regression for hooks configured via sql_endpoint_name: sqlalchemy_url (and therefore get_uri()) now performs a live "list SQL warehouses" REST API call inside a property. DbApiHook.dialect_name calls make_url(self.get_uri()) (common.sql sql.py:363), so every insert_rows() — and any other dialect-dependent path — now triggers that network call and fails hard when the API is unreachable. Before this change those paths worked offline (the URL simply lacked http_path, which dialect_name does not need). The existing unit test test_insert_rows_hook_lineage fails on this branch with exactly that chain:
insert_rows → _generate_insert_sql → dialect → dialect_name
→ get_uri → sqlalchemy_url → _resolve_http_path → _get_sql_endpoint_by_name → REST call
Suggestion: keep the API lookup out of the property. For example, give the helper a flag — _resolve_http_path(allow_endpoint_lookup: bool = True) — where get_conn() uses the default, and sqlalchemy_url passes allow_endpoint_lookup=False, falling back to the previous behavior of omitting http_path when only sql_endpoint_name is configured and no cached value exists yet (raising there would reject a valid configuration). The connection-extra resolution — the actual #69031 fix — needs no API call and stays as-is.
It would also be good to add a test where a sql_endpoint_name-configured hook calls get_uri() (or insert_rows()) offline — that would have caught this.
Drafted-by: Claude Code (Fable 5); reviewed by @moomindani before posting
|
|
||
| url_query = { | ||
| "http_path": self._http_path, | ||
| "http_path": self._resolve_http_path(), |
There was a problem hiding this comment.
This line makes sqlalchemy_url (a property) perform a REST API call when the hook is configured via sql_endpoint_name — see the review body for the failing chain through dialect_name/insert_rows. Please resolve only from local sources (explicit param, cached value, connection extra) here.
Drafted-by: Claude Code (Fable 5); reviewed by @moomindani before posting
While it would fix #69031, I believe the error will still be there if one defines the hook with
Would calling |
This PR fixes #69031 by ensuring that
DatabricksSqlHook.sqlalchemy_urlandget_uri()correctly includehttp_pathwhen it is defined in the connectionextrafields.We centralized the resolution logic into a helper
_resolve_http_path(), updated bothget_conn()andsqlalchemy_urlto use it, and added corresponding unit tests.closes: #69031
Was generative AI tooling used to co-author this PR?