[Bug] Cancelling dbt invocation does not cancel the query on Athena #406

Open
2 tasks done
Jrmyy opened this issue Dec 26, 2024 · 6 comments · May be fixed by #873
Labels
pkg:dbt-athena Issue affects dbt-athena type:bug Something isn't working as documented

Comments

@Jrmyy
Contributor

Jrmyy commented Dec 26, 2024

👋🏻 Hello dbt-athena community,

Is this a new bug in dbt-athena?

  • I believe this is a new bug in dbt-athena
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When I cancel a dbt invocation, dbt itself stops, but the query it submitted keeps running in Athena. A retry can therefore end up running concurrently with the original query.

Expected Behavior

When the dbt invocation is cancelled, the corresponding query should also be cancelled in Athena.

Steps To Reproduce

  1. Run a model
  2. Cancel the invocation (for instance with CTRL + c)
  3. Check in Athena console if the query is still running or not
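
Step 3 can also be checked from a script instead of the console. The following is only a sketch: it assumes a boto3 Athena client is passed in, that the model ran in the workgroup given (here defaulting to `primary`), and that looking at the 50 most recent executions is enough. The helper name `active_query_ids` is made up for this example.

```python
from typing import Any, List

# Athena states that mean "the query has not finished yet".
ACTIVE_STATES = {"QUEUED", "RUNNING"}


def active_query_ids(athena_client: Any, workgroup: str = "primary") -> List[str]:
    """Return ids of recent queries in the workgroup that are still queued or running."""
    # list_query_executions returns the most recent execution ids (max 50 per page).
    listed = athena_client.list_query_executions(WorkGroup=workgroup, MaxResults=50)
    ids = listed.get("QueryExecutionIds", [])
    if not ids:
        return []
    # batch_get_query_execution accepts up to 50 ids and returns their status.
    details = athena_client.batch_get_query_execution(QueryExecutionIds=ids)
    return [
        execution["QueryExecutionId"]
        for execution in details.get("QueryExecutions", [])
        if execution["Status"]["State"] in ACTIVE_STATES
    ]
```

With real credentials this would be called as `active_query_ids(boto3.client("athena"))` right after pressing CTRL + c. If the id of the model's query is still in the result, the query was not cancelled.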

Relevant log output

No response

Environment

- OS: MacOS (the same problem in Docker image based on Debian)
- Python: 3.12.7
- dbt-core: 1.8.9
- dbt-athena: 1.8.4

Additional Context

No response

@Jrmyy Jrmyy added type:bug Something isn't working as documented triage:product In Product's queue labels Dec 26, 2024
@Jrmyy
Contributor Author

Jrmyy commented Dec 26, 2024

The adapter cancel behavior is defined here:

def cancel_open_connections(self):

Then, if we go to the SQLConnectionManager, cancel_open is defined here:

def cancel_open(self) -> List[str]:

But in the AthenaConnectionManager, we don't implement the cancel method: https://github.com/dbt-labs/dbt-athena/blob/8e2aa424256e354103d619cc07f9a79d85fadf98/dbt-athena/src/dbt/adapters/athena/connections.py#L321-L322

The Connection object comes directly from pyathena: https://github.com/laughingman7743/PyAthena/blob/master/pyathena/connection.py#L42
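
To make the call chain above concrete, here is a toy model of it. It is not the dbt source: `ToyConnection`, `ToyManager` and `CancellingManager` are hypothetical stand-ins, and the sketch assumes that `cancel_open` simply calls `cancel` on every open connection and that the Athena `cancel` is effectively a no-op.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ToyConnection:
    name: str
    handle: Optional[object]  # None would mean the connection never opened
    is_open: bool = True


class ToyManager:
    """Stand-in for a connection manager whose cancel() does nothing."""

    def __init__(self, connections: Dict[int, ToyConnection]) -> None:
        self.thread_connections = connections
        self.cancelled: List[str] = []

    def cancel(self, connection: ToyConnection) -> None:
        # Assumed no-op: nothing is sent to the warehouse.
        pass

    def cancel_open(self) -> List[str]:
        names = []
        for connection in self.thread_connections.values():
            if connection.handle is not None and connection.is_open:
                self.cancel(connection)
            names.append(connection.name)
        return names


class CancellingManager(ToyManager):
    """Same loop, but cancel() records that it acted on the connection."""

    def cancel(self, connection: ToyConnection) -> None:
        self.cancelled.append(connection.name)
```

In this model both managers report the same connection names from `cancel_open`, but only `CancellingManager` actually does anything for them, which is the gap described above.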

@amychen1776 amychen1776 removed the triage:product In Product's queue label Jan 6, 2025
@mikealfare mikealfare added the pkg:dbt-athena Issue affects dbt-athena label Jan 10, 2025
@mikealfare mikealfare transferred this issue from dbt-labs/dbt-athena Jan 13, 2025
@Jrmyy
Contributor Author

Jrmyy commented Jan 16, 2025

After deep-diving the code I think I found the issue.

The cancel query happens here: https://github.com/dbt-labs/dbt-core/blob/3de3b827bfffdc43845780f484d4d53011f20a37/core/dbt/task/runnable.py#L429-L448

As expected, the exception is raised only after the queries have been cancelled.
In dbt-athena, we wait for this exception before trying to cancel the query, but our code never catches it. I think this is because the KeyboardInterrupt is delivered to the main thread, while the dbt model runs in a worker thread:

https://github.com/dbt-labs/dbt-adapters/blob/main/dbt-athena/src/dbt/adapters/athena/connections.py#L213

Therefore, I think implementing the cancel method will fix the error. But I also think the _poll method is not working as expected (https://github.com/dbt-labs/dbt-adapters/blob/main/dbt-athena/src/dbt/adapters/athena/connections.py#L128-L138).
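
The threading behaviour described above can be reproduced without dbt. This is a minimal sketch, not dbt-athena code: the `raise KeyboardInterrupt` in the main thread stands in for CTRL + c, and the function name `interrupt_demo` is made up for the example.

```python
import threading
from typing import Dict


def interrupt_demo() -> Dict[str, bool]:
    """Show that an except KeyboardInterrupt in a worker thread is never triggered."""
    seen = {"worker": False, "main": False}
    stop = threading.Event()

    def worker() -> None:
        try:
            # Stand-in for polling a long-running query.
            stop.wait(timeout=5)
        except KeyboardInterrupt:
            # Never reached: the interrupt is raised in the main thread only.
            seen["worker"] = True

    thread = threading.Thread(target=worker)
    thread.start()
    try:
        # Stand-in for CTRL + c, which Python raises in the main thread.
        raise KeyboardInterrupt
    except KeyboardInterrupt:
        seen["main"] = True
    finally:
        stop.set()
        thread.join()
    return seen
```

Only the main thread observes the exception, so any cleanup that lives in an `except KeyboardInterrupt` inside the worker never runs. The cancellation has to be triggered from the main thread instead, for example through the cancel method.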

@mikealfare
Contributor

We had a similar scenario in dbt-bigquery. We ended up needing to track job ids by connection so that we could cancel each job prior to closing the connection. Here's how we handled it:

def cancel_open(self):
    names = []
    this_connection = self.get_if_exists()
    with self.lock:
        for thread_id, connection in self.thread_connections.items():
            if connection is this_connection:
                continue
            if connection.handle is not None and connection.state == ConnectionState.OPEN:
                client: Client = connection.handle
                for job_id in self.jobs_by_thread.get(thread_id, []):
                    with self.exception_handler(f"Cancel job: {job_id}"):
                        client.cancel_job(
                            job_id,
                            retry=self._retry.create_reopen_with_deadline(connection),
                        )
                self.close(connection)
            if connection.name is not None:
                names.append(connection.name)
    return names

@Jrmyy
Contributor Author

Jrmyy commented Feb 27, 2025

👋🏻 Hello

Thanks @mikealfare for your message, it helped a lot. I dug into the BigQuery client implementation, and I think closing all cursors when a connection is closed, as the BQ client does, should fix the issue. I am now a bit curious: why is it not enough, in the BigQuery adapter, to just call the cancel method of the Connection?

I opened an issue on PyAthena: laughingman7743/PyAthena#575

@Jrmyy
Contributor Author

Jrmyy commented Mar 3, 2025

Hmm, after checking with the owner of PyAthena: the python-bigquery client does not cancel the query when the cursor is cancelled (it just marks the cursor as closed). Now I understand why the job ids need to be tracked. I have to check whether we could replicate this by retrieving the query execution id.
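
As a rough idea of what replicating the job id tracking could look like for Athena, here is a sketch. It is not the adapter's code: `QueryExecutionTracker` and its methods are hypothetical, and it assumes the query execution id can be obtained somewhere and that a boto3 Athena client (with its `stop_query_execution` call) is available.

```python
import threading
from collections import defaultdict
from typing import Any, Dict, List, Optional


class QueryExecutionTracker:
    """Remember query execution ids per thread so they can be stopped later."""

    def __init__(self, athena_client: Any) -> None:
        self._client = athena_client
        self._lock = threading.Lock()
        self._ids_by_thread: Dict[int, List[str]] = defaultdict(list)

    def register(self, query_execution_id: str, thread_id: Optional[int] = None) -> None:
        # Default to the calling thread, like jobs_by_thread in dbt-bigquery.
        key = threading.get_ident() if thread_id is None else thread_id
        with self._lock:
            self._ids_by_thread[key].append(query_execution_id)

    def cancel_all(self) -> List[str]:
        # Take a snapshot under the lock, then call AWS outside of it.
        with self._lock:
            pending = [qid for ids in self._ids_by_thread.values() for qid in ids]
            self._ids_by_thread.clear()
        for query_execution_id in pending:
            self._client.stop_query_execution(QueryExecutionId=query_execution_id)
        return pending
```

The open question from the comment still applies: something has to call `register` with the execution id, and that id is not visible from the connection alone.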

@Jrmyy
Contributor Author

Jrmyy commented Mar 4, 2025

The main issue is that:

  • Only the PyAthena cursor has access to the current query execution id we would want to cancel
  • In the connection manager we only have access to the connection we want to close and not the cursor

So a solution would be to store, at some point, the cursor currently in use, and to cancel it when the cancel method is called.
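
A minimal sketch of that idea follows. It is an illustration only, not the linked pull request: `CursorRegistry` and its method names are hypothetical, and it assumes the stored object behaves like a PyAthena cursor, exposing a `query_id` attribute and a `cancel()` method.

```python
import threading
from typing import Any, Dict


class CursorRegistry:
    """Keep the cursor currently in use for each connection, keyed by connection name."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._cursors: Dict[str, Any] = {}

    def remember(self, connection_name: str, cursor: Any) -> None:
        # Called when a query starts on the connection.
        with self._lock:
            self._cursors[connection_name] = cursor

    def forget(self, connection_name: str) -> None:
        # Called when the query finishes normally.
        with self._lock:
            self._cursors.pop(connection_name, None)

    def cancel(self, connection_name: str) -> bool:
        """Cancel the stored cursor, if any. Returns True when a cancel was sent."""
        with self._lock:
            cursor = self._cursors.pop(connection_name, None)
        # A cursor without a query id has nothing running to cancel.
        if cursor is None or getattr(cursor, "query_id", None) is None:
            return False
        cursor.cancel()
        return True
```

A connection manager's cancel method could then delegate to `registry.cancel(connection.name)`, which gives it a path to the query execution id without reaching into the connection object.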

@Jrmyy Jrmyy linked a pull request Mar 5, 2025 that will close this issue
4 tasks

3 participants