RFC-5479: Context #5479

Merged: 2 commits, Dec 31, 2024
140 changes: 140 additions & 0 deletions core/src/docs/rfcs/5479_context.md
@@ -0,0 +1,140 @@
- Proposal Name: `context`
- Start Date: 2024-12-30
- RFC PR: [apache/opendal#5480](https://github.com/apache/opendal/pull/5480)
- Tracking Issue: [apache/opendal#5479](https://github.com/apache/opendal/issues/5479)

# Summary

Add a `Context` to OpenDAL to distribute global resources such as the HTTP client and the runtime.

# Motivation

OpenDAL currently has two global resources, the HTTP client and the runtime, which are used by a given service and by all layers enabled on it.

However, it's a bit challenging for layers to interact with these global resources.

## For HTTP client

Layers cannot directly access the HTTP client. The only way to interact with the HTTP client is through the service builder, such as [`S3::http_client()`](https://docs.rs/opendal/latest/opendal/services/struct.S3.html#method.http_client). Layers like logging and metrics do not have direct access to the HTTP client.

Users need to implement the `HttpFetcher` trait to interact with the HTTP client. However, the drawback is that users lack context for the given requests; they do not know which service the request originates from or which operation it is performing.

## For runtime

OpenDAL provides the [`Execute`](https://docs.rs/opendal/latest/opendal/trait.Execute.html) trait, which users can implement to interact with the runtime. However, the API is difficult to use, because layers need to extract and reconstruct the `Executor` for every request.

For example:

```rust
async fn read(&self, path: &str, mut args: OpRead) -> Result<(RpRead, Self::Reader)> {
    if let Some(exec) = args.executor().cloned() {
        args = args.with_executor(Executor::with(TimeoutExecutor::new(
            exec.into_inner(),
            self.io_timeout,
        )));
    }

    self.io_timeout(Operation::Read, self.inner.read(path, args))
        .await
        .map(|(rp, r)| (rp, TimeoutWrapper::new(r, self.io_timeout)))
}
```

# Guide-level explanation

I therefore propose adding a `Context` to OpenDAL to distribute global resources such as the HTTP client and the runtime.

The `Context` is a struct that contains the global resources, such as the HTTP client and runtime. It is passed to the service builder and layers so that they can interact with the global resources.

```rust
let ctx = Context::default();
ctx.update_http_client(|_| my_http_client);
ctx.update_executor(|_| my_executor);

let op = op.with_context(ctx);
```

The following APIs will be added:

- new struct `Context`
- `Context::default()`
- `Context::load_http_client(&self) -> HttpClient`
- `Context::load_executor(&self) -> Executor`
- `Context::update_http_client(&self, f: impl FnOnce(HttpClient) -> HttpClient)`
- `Context::update_executor(&self, f: impl FnOnce(Executor) -> Executor)`
- `Operator::with_context(ctx: Context) -> Operator`
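
As a rough illustration of how the proposed `load_*` and `update_*` methods could behave, here is a minimal sketch using only the standard library. The `HttpClient` and `Executor` types below are hypothetical stand-ins (plain structs with a `name` field), and `RwLock` is an assumption used in place of whatever atomic container the implementation ends up choosing. The helper `describe_resources` is also invented for the example.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-ins for OpenDAL's `HttpClient` and `Executor`.
#[derive(Clone, Debug, PartialEq)]
pub struct HttpClient {
    pub name: String,
}

#[derive(Clone, Debug, PartialEq)]
pub struct Executor {
    pub name: String,
}

// Sketch of the proposed `Context`. Cloning it yields another handle
// to the same shared slots (assumption: `RwLock` instead of `ArcSwap`).
#[derive(Clone)]
pub struct Context {
    http_client: Arc<RwLock<HttpClient>>,
    executor: Arc<RwLock<Executor>>,
}

impl Default for Context {
    fn default() -> Self {
        Context {
            http_client: Arc::new(RwLock::new(HttpClient { name: "default".to_string() })),
            executor: Arc::new(RwLock::new(Executor { name: "default".to_string() })),
        }
    }
}

impl Context {
    // Return a copy of the current HTTP client.
    pub fn load_http_client(&self) -> HttpClient {
        self.http_client.read().unwrap().clone()
    }

    // Return a copy of the current executor.
    pub fn load_executor(&self) -> Executor {
        self.executor.read().unwrap().clone()
    }

    // Replace the HTTP client with the result of `f(current)`.
    pub fn update_http_client(&self, f: impl FnOnce(HttpClient) -> HttpClient) {
        let mut slot = self.http_client.write().unwrap();
        let current = slot.clone();
        *slot = f(current);
    }

    // Replace the executor with the result of `f(current)`.
    pub fn update_executor(&self, f: impl FnOnce(Executor) -> Executor) {
        let mut slot = self.executor.write().unwrap();
        let current = slot.clone();
        *slot = f(current);
    }
}

// Hypothetical helper: swap in a custom client and wrap the executor.
pub fn describe_resources() -> (String, String) {
    let ctx = Context::default();
    ctx.update_http_client(|_| HttpClient { name: "custom".to_string() });
    ctx.update_executor(|e| Executor { name: format!("timeout({})", e.name) });
    (ctx.load_http_client().name, ctx.load_executor().name)
}
```

Because the `update_*` methods take `&self` and a closure over the current value, a caller can either replace a resource outright or wrap the existing one without knowing how it was configured.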

The following APIs will be deprecated:

- `Operator::default_executor`
- `Operator::with_default_executor`
- `OpRead::with_executor`
- `OpRead::executor`
- `OpWrite::with_executor`
- `OpWrite::executor`
- All service builders' `http_client` APIs

# Reference-level explanation

We will add a `Context` struct to `AccessInfo`. Every service must use `Context::default()` for its `AccessInfo` and store the same `Context` instance in the service core. All subsequent uses of the HTTP client or runtime should go through the `Context` instead.

The `Context` itself is a struct wrapped by something like `ArcSwap<T>`, allowing us to update it atomically.
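
To make the "same instance, updated atomically" idea concrete, the sketch below models the swap with `Arc<RwLock<Arc<Resources>>>` from the standard library. This is only an approximation of what an `ArcSwap<T>`-style container would provide, and `Resources`, `Context::new`, `load`, `update`, and `shared_update_demo` are hypothetical names chosen for the example. It shows two properties: a handle held by a layer and a handle held by the service core observe the same state, and a snapshot loaded before an update is left untouched by it.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical bundle of global resources, described by name only.
#[derive(Debug)]
pub struct Resources {
    pub http_client: String,
    pub executor: String,
}

// Assumption: a lock-protected `Arc` pointer stands in for `ArcSwap<T>`.
#[derive(Clone)]
pub struct Context {
    inner: Arc<RwLock<Arc<Resources>>>,
}

impl Context {
    pub fn new(resources: Resources) -> Self {
        Context { inner: Arc::new(RwLock::new(Arc::new(resources))) }
    }

    // Take a cheap snapshot: only the pointer is cloned.
    pub fn load(&self) -> Arc<Resources> {
        let guard = self.inner.read().unwrap();
        Arc::clone(&*guard)
    }

    // Build a new bundle from the current one and swap the pointer.
    pub fn update(&self, f: impl FnOnce(&Resources) -> Resources) {
        let mut slot = self.inner.write().unwrap();
        let next = f(&**slot);
        *slot = Arc::new(next);
    }
}

// Hypothetical demo: the layer updates through its own handle and the
// service core sees the change on its next load.
pub fn shared_update_demo() -> (String, String) {
    let service_ctx = Context::new(Resources {
        http_client: "default".to_string(),
        executor: "default".to_string(),
    });
    let layer_ctx = service_ctx.clone();

    let before = service_ctx.load();
    layer_ctx.update(|r| Resources {
        http_client: r.http_client.clone(),
        executor: format!("timeout({})", r.executor),
    });
    let after = service_ctx.load();

    (before.executor.clone(), after.executor.clone())
}
```

In-flight operations that already hold a snapshot keep using the resources they loaded, while operations started after the update pick up the new ones.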

The layers will switch to `Context` to get the global resources instead of `OpRead`.

We no longer need to hijack the read operation.

```rust
- async fn read(&self, path: &str, mut args: OpRead) -> Result<(RpRead, Self::Reader)> {
-     if let Some(exec) = args.executor().cloned() {
-         args = args.with_executor(Executor::with(TimeoutExecutor::new(
-             exec.into_inner(),
-             self.io_timeout,
-         )));
-     }
-
-     ...
- }
```

Instead, we can update the executor through the `Context` directly when `layer` is called.

```rust
impl<A: Access> Layer<A> for TimeoutLayer {
    type LayeredAccess = TimeoutAccessor<A>;

    fn layer(&self, inner: A) -> Self::LayeredAccess {
        inner
            .info()
            .context()
            .update_executor(|exec| Executor::with(TimeoutExecutor::new(exec, self.io_timeout)));

        TimeoutAccessor {
            inner,

            timeout: self.timeout,
            io_timeout: self.io_timeout,
        }
    }
}
```
```
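
The following self-contained sketch mimics that flow without depending on OpenDAL, to show that wrapping the executor once at layer-construction time is enough for later loads to see the wrapped executor. Everything here is a simplified, hypothetical model: the `Execute` and `Access` traits have a single method each, `describe` exists only to make the wrapping visible, and `Mutex` stands in for the atomic container.

```rust
use std::sync::{Arc, Mutex};
use std::time::Duration;

// Hypothetical, simplified executor trait.
pub trait Execute: Send + Sync {
    fn describe(&self) -> String;
}

pub struct DefaultExecutor;

impl Execute for DefaultExecutor {
    fn describe(&self) -> String {
        "default".to_string()
    }
}

// Decorator that wraps another executor with a timeout.
pub struct TimeoutExecutor {
    pub inner: Arc<dyn Execute>,
    pub timeout: Duration,
}

impl Execute for TimeoutExecutor {
    fn describe(&self) -> String {
        format!("timeout({}s, {})", self.timeout.as_secs(), self.inner.describe())
    }
}

// Assumption: `Mutex` stands in for the atomic container.
#[derive(Clone)]
pub struct Context {
    executor: Arc<Mutex<Arc<dyn Execute>>>,
}

impl Context {
    pub fn new() -> Self {
        let default: Arc<dyn Execute> = Arc::new(DefaultExecutor);
        Context { executor: Arc::new(Mutex::new(default)) }
    }

    pub fn load_executor(&self) -> Arc<dyn Execute> {
        let guard = self.executor.lock().unwrap();
        Arc::clone(&*guard)
    }

    pub fn update_executor(&self, f: impl FnOnce(Arc<dyn Execute>) -> Arc<dyn Execute>) {
        let mut slot = self.executor.lock().unwrap();
        let current = Arc::clone(&*slot);
        *slot = f(current);
    }
}

// Hypothetical accessor trait exposing only the shared context.
pub trait Access {
    fn context(&self) -> Context;
}

pub struct Service {
    pub ctx: Context,
}

impl Access for Service {
    fn context(&self) -> Context {
        self.ctx.clone()
    }
}

pub struct TimeoutAccessor<A> {
    pub inner: A,
    pub io_timeout: Duration,
}

impl<A: Access> Access for TimeoutAccessor<A> {
    fn context(&self) -> Context {
        self.inner.context()
    }
}

pub struct TimeoutLayer {
    pub io_timeout: Duration,
}

impl TimeoutLayer {
    // Wrap the shared executor once, then wrap the accessor.
    pub fn layer<A: Access>(&self, inner: A) -> TimeoutAccessor<A> {
        let io_timeout = self.io_timeout;
        inner.context().update_executor(|exec| {
            let wrapped: Arc<dyn Execute> =
                Arc::new(TimeoutExecutor { inner: exec, timeout: io_timeout });
            wrapped
        });
        TimeoutAccessor { inner, io_timeout }
    }
}

// Hypothetical demo: apply the layer and inspect the resulting executor.
pub fn layered_executor_description() -> String {
    let service = Service { ctx: Context::new() };
    let accessor = TimeoutLayer { io_timeout: Duration::from_secs(10) }.layer(service);
    accessor.context().load_executor().describe()
}
```

Compared with rewriting `OpRead` on every call, the wrapping happens exactly once per layer, and per-request code paths only need to load the executor.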

# Drawbacks

A small cost (`50ns`) is added to every operation that calls `load_http_client`.

# Rationale and alternatives

None.

# Prior art

None.

# Unresolved questions

None.

# Future possibilities

None.
4 changes: 4 additions & 0 deletions core/src/docs/rfcs/mod.rs
@@ -244,3 +244,7 @@ pub mod rfc_5314_remove_metakey {}
/// Operator from uri
#[doc = include_str!("5444_operator_from_uri.md")]
pub mod rfc_5444_operator_from_uri {}

/// Context
#[doc = include_str!("5479_context.md")]
pub mod rfc_5479_context {}