[BUG] .databricks.env not updating after extension configuration target change #1476

Open
reneket-nw opened this issue Dec 2, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@reneket-nw

reneket-nw commented Dec 2, 2024

Describe the bug
my project uses metadata-service to authenticate, but when i start a new vscode session and the terminals reactivate, the new DATABRICKS_METADATA_SERVICE_URL is not propagated to .databricks/.databricks.env. i am working around this by deleting the .env file and reinitiating the extension configuration. DATABRICKS_BUNDLE_TARGET and DATABRICKS_HOST aren't being updated when switching targets, either.

additionally, vscode is not picking up the variable declarations in .databricks/.databricks.env when setting the environment variables for a python terminal. i am working around that by copying the file to the project root.

To Reproduce
Steps to reproduce the behavior:

  1. initiate project
  2. close vscode, restart computer, etc
  3. reopen project and wait for extension to activate

workaround to get the new url in the .env file is to:

  1. remove the line "databricks.python.envFile": "${workspaceFolder}/.env", from .vscode/settings.json. skipping this step restores the previous version of the .env file, with the inactive metadata-service url.
  2. delete .databricks/.databricks.env
  3. in the databricks extension pane, under configuration, select a new target to initiate a change to the host and bundle target.
  4. copy .databricks/.databricks.env to .env, and the env file in project root gets picked up by vscode

another weird symptom is that the config setting in the docs is not recognized by the vscode settings linter. i don't know if this is related.
unknown configuration setting "databricks.python.envFile"

full output of the python terminal below. 53566 and fd53d2ad-ee05-437d-8dbe-b12383b84c4e are the port and random uuid from the previous vscode session's terminals.

PS C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor> & c:/Users/%USERNAME%/Documents/Repositories/db_resource_monitor/.venv/Scripts/python.exe c:/Users/%USERNAME%/Documents/Repositories/db_resource_monitor/src/db_resource_monitor/main.py
Traceback (most recent call last):
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\urllib3\connection.py", line 199, in _new_conn
    sock = connection.create_connection(
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\urllib3\util\connection.py", line 85, in create_connection
    raise err
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\urllib3\util\connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\urllib3\connectionpool.py", line 789, in urlopen
    response = self._make_request(
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\urllib3\connectionpool.py", line 495, in _make_request    
    conn.request(
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\urllib3\connection.py", line 441, in request
    self.endheaders()
  File "C:\Program Files\Python310\lib\http\client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\Program Files\Python310\lib\http\client.py", line 1037, in _send_output
    self.send(msg)
  File "C:\Program Files\Python310\lib\http\client.py", line 975, in send
    self.connect()
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\urllib3\connection.py", line 279, in connect
    self.sock = self._new_conn()
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\urllib3\connection.py", line 214, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x000001C845745420>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\requests\adapters.py", line 667, in send
    resp = conn.urlopen(
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\urllib3\connectionpool.py", line 843, in urlopen
    retries = retries.increment(
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\urllib3\util\retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='127.0.0.1', port=53566): Max retries exceeded with url: /fd53d2ad-ee05-437d-8dbe-b12383b84c4e (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001C845745420>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\credentials_provider.py", line 822, in __call__
    header_factory = provider(cfg)
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\credentials_provider.py", line 89, in wrapper
    return func(cfg)
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\credentials_provider.py", line 697, in metadata_service
    token_source.token()
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\oauth.py", line 201, in token
    self._token = self.refresh()
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\credentials_provider.py", line 669, in refresh
    resp = requests.get(self.url,
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\requests\api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\requests\api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\requests\sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\requests\sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\requests\adapters.py", line 700, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=53566): Max retries exceeded with url: /fd53d2ad-ee05-437d-8dbe-b12383b84c4e (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001C845745420>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\config.py", line 449, in init_auth
    self._header_factory = self._credentials_strategy(self)
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\credentials_provider.py", line 828, in __call__
    raise ValueError(f'{auth_type}: {e}') from e
ValueError: metadata-service: HTTPConnectionPool(host='127.0.0.1', port=53566): Max retries exceeded with url: /fd53d2ad-ee05-437d-8dbe-b12383b84c4e (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001C845745420>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\config.py", line 126, in __init__
    self.init_auth()
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\config.py", line 454, in init_auth
    raise ValueError(f'{self._credentials_strategy.auth_type()} auth: {e}') from e
ValueError: default auth: metadata-service: HTTPConnectionPool(host='127.0.0.1', port=53566): Max retries exceeded with url: /fd53d2ad-ee05-437d-8dbe-b12383b84c4e (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001C845745420>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\src\db_resource_monitor\main.py", line 4, in <module>
    w = WorkspaceClient()
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\__init__.py", line 152, in __init__
    config = client.Config(host=host,
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\config.py", line 130, in __init__
    raise ValueError(message) from e
ValueError: default auth: metadata-service: HTTPConnectionPool(host='127.0.0.1', port=53566): Max retries exceeded with url: /fd53d2ad-ee05-437d-8dbe-b12383b84c4e (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001C845745420>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it')). Config: host=https://[host].cloud.databricks.com, auth_type=metadata-service, metadata_service_url=***. Env: DATABRICKS_HOST, DATABRICKS_AUTH_TYPE, DATABRICKS_METADATA_SERVICE_URL
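The traceback above ends in a refused connection to a port from an earlier session, which suggests the DATABRICKS_METADATA_SERVICE_URL in the environment is stale. A minimal sketch for checking that before the SDK is invoked follows; the function name `metadata_service_reachable` is hypothetical and not part of the extension or the SDK, and the check only tests whether something accepts TCP connections at the URL's host and port, not whether it returns a valid token.

```python
import os
import socket
from urllib.parse import urlparse


def metadata_service_reachable(url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds.

    Hypothetical helper: it only checks that something is listening,
    it does not validate the token the service would return.
    """
    try:
        parsed = urlparse(url)
        host, port = parsed.hostname, parsed.port
    except ValueError:
        # Raised for malformed URLs or out-of-range port numbers.
        return False
    if not host or not port:
        return False
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (WinError 10061) and timeouts.
        return False


if __name__ == "__main__":
    url = os.environ.get("DATABRICKS_METADATA_SERVICE_URL", "")
    if not url:
        print("DATABRICKS_METADATA_SERVICE_URL is not set")
    elif metadata_service_reachable(url):
        print("metadata service is accepting connections")
    else:
        print("nothing is listening at the metadata service URL; it may be stale")
```

Running this in the same terminal as main.py would show whether the variable points at a live local service before WorkspaceClient() is constructed.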

if we do have the correct metadata-service url, changing the target environment without manually refreshing the .env files returns the following:

PS C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor> & C:/Users/%USERNAME%/Documents/Repositories/db_resource_monitor/.venv/Scripts/python.exe c:/Users/%USERNAME%/Documents/Repositories/db_resource_monitor/src/db_resource_monitor/main.py
Traceback (most recent call last):
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\credentials_provider.py", line 822, in __call__        
    header_factory = provider(cfg)
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\credentials_provider.py", line 89, in wrapper
    return func(cfg)
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\credentials_provider.py", line 697, in metadata_service
    token_source.token()
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\oauth.py", line 201, in token
    self._token = self.refresh()
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\credentials_provider.py", line 678, in refresh
    raise ValueError("Metadata Service returned empty token")
ValueError: Metadata Service returned empty token

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\config.py", line 449, in init_auth
    self._header_factory = self._credentials_strategy(self)
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\credentials_provider.py", line 828, in __call__
    raise ValueError(f'{auth_type}: {e}') from e
ValueError: metadata-service: Metadata Service returned empty token

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\config.py", line 126, in __init__
    self.init_auth()
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\config.py", line 454, in init_auth
    raise ValueError(f'{self._credentials_strategy.auth_type()} auth: {e}') from e
ValueError: default auth: metadata-service: Metadata Service returned empty token

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\src\db_resource_monitor\main.py", line 4, in <module>
    w = WorkspaceClient()
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\__init__.py", line 152, in __init__
    config = client.Config(host=host,
  File "C:\Users\%USERNAME%\Documents\Repositories\db_resource_monitor\.venv\lib\site-packages\databricks\sdk\config.py", line 130, in __init__
    raise ValueError(message) from e
ValueError: default auth: metadata-service: Metadata Service returned empty token. Config: host=https://[host from previous active target].cloud.databricks.com, auth_type=metadata-service, metadata_service_url=***. Env: DATABRICKS_HOST, DATABRICKS_AUTH_TYPE, DATABRICKS_METADATA_SERVICE_URL

Help:About

Version: 1.94.0 (system setup)
Commit: d78a74bcdfad14d5d3b1b782f87255d802b57511
Date: 2024-10-02T13:08:12.626Z
Electron: 30.5.1
ElectronBuildId: 10262041
Chromium: 124.0.6367.243
Node.js: 20.16.0
V8: 12.4.254.20-electron.0
OS: Windows_NT x64 10.0.19045

Databricks Extension Version: v2.4.8

@reneket-nw reneket-nw added the bug Something isn't working label Dec 2, 2024
@ilia-db
Contributor

ilia-db commented Dec 13, 2024

Hi, sorry for the slow response.

In version 2 and later, we no longer inject Databricks environment variables into terminals or into Python extension executions (e.g. when you run a Python file using the Python extension's "run" action).

Instead, we expose separate run actions for local execution:
[Screenshot 2024-12-13 at 14 03 53: image not included]

When you execute a Python file with one of these actions, we run it through our own Python helper, which extends os.environ with the variables from the .databricks/.databricks.env file.
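As a rough illustration of that idea (not the extension's actual helper), the sketch below reads KEY=VALUE lines from an env file and merges them into os.environ. The function names `parse_env_file` and `extend_environ` are hypothetical, and the simple KEY=VALUE format with optional quotes is an assumption about the file's layout.

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Parse KEY=VALUE lines, skipping blank lines and # comments.

    Assumes a simple dotenv-style layout; this is not the extension's parser.
    """
    variables = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Strip one layer of matching single or double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        variables[key.strip()] = value
    return variables


def extend_environ(path=".databricks/.databricks.env"):
    """Merge the variables from the env file into os.environ and return them."""
    variables = parse_env_file(path)
    os.environ.update(variables)
    return variables
```

Calling `extend_environ()` at the top of a script, before `WorkspaceClient()` is constructed, would make variables such as DATABRICKS_HOST, DATABRICKS_AUTH_TYPE and DATABRICKS_METADATA_SERVICE_URL visible to the SDK. It only helps if the file itself is current; a stale file would reproduce the connection error reported above.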

@ilia-db
Contributor

ilia-db commented Dec 13, 2024

Because of the above, databricks.python.envFile is no longer used by the extension, but we forgot to remove it from .vscode/settings.json in our default Python template.

@reneket-nw
Author

Thanks for the response, and apologies for the delay in my getting back to you.

What I am trying to accomplish is to use the Python databricks-sdk to programmatically query some of the APIs. Previously, the databricks-vscode plugin was a solution for authentication; it manages U2M tokens and starts the metadata service. It still does those things (from what I can tell), but now there is no access to the metadata service URL from within the Python environment when running locally.

My understanding of Databricks Connect is that it allows you to submit jobs (query, notebook code, etc) to a Databricks cluster from a local client (vscode) instead of the web interface. Since what I'm doing doesn't require spark, or a running cluster, why am I limited to only running the code using Databricks Connect?

@ilia-db
Contributor

ilia-db commented Jan 2, 2025

"Run current file with Databricks Connect" does the same thing the old extension did with the Python "run" button. It is now a separate, explicitly named action, although the name is indeed too specific. You can freely use it to run Python code that doesn't use dbconnect; it will still work and provide you with the auth.

As a possibly more fitting alternative, you can create a python/debugpy launch config with databricks: true, which will offer the same functionality (after setting it up, select this launch config before executing "run" action). Check the "Tip" in this doc page - https://docs.databricks.com/en/dev-tools/vscode-ext/run.html#create-a-custom-run-configuration

@reneket-nw
Author

The launch config I think works closest to what I want to do (thanks for that!), but I am running into issues with authentication when running pytest now. I had previously been relying on the variables added to venv by my .env file workaround. Is there a comparable integration/configuration with pytest as well?

@ilia-db
Contributor

ilia-db commented Jan 9, 2025

I'm not sure it's possible to select a launch profile when you simply run the tests, but you can select a launch profile (with the databricks flag) when you debug them.

You can also enable this setting if you want 'run' actions in the gutters to trigger the debug launch config instead:
[Screenshot 2025-01-09 at 12 58 37: image not included]
