-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider the ability to opt-out from automatic functionality such as automatic provider ready events. #265
Comments
I feel conflicted about this. On the one hand, I see the value and the simplicity that can be gained here for underlying SDKs that emit events. On the other hand we're adding more complexity to the SDK, and cognitive load to provider authors. One other question is I have is if this setting would also apply to if the |
I am not sure if this is true or not. I recently updated the LD node, python, java, and dotnet providers to handle this behavior and handling the mixed requirements of runtime events, initialization events, and the blocking behavior is by far the largest cognitive load in any of these provider implementations (aside from the node one, where we happen to have an underlying mechanism that matches well), and also very easy to get incorrect. This optionality may or may not be the correct way to reduce the complexity. Is optimizing for the simple case the correct approach, or is it likely that this increases complexity more often than it decreases it? Ultimately here the event channel is shared. The provider can produce events and the OF SDK can also produce events. There isn't a delineation of the types of events provided by both. So you have to encode into the provider that specific events are emitted automatically at specific points, and that those events also need to be emitted by the provider at different points. |
I think I'm nearly convinced. What's your take on this?
Do you think we should have 2 settings? 1 for preventing emission of error and 1 for emission of ready? Or can we handle both with one? Do you think this should be a field on cc @lukas-reining since I think some of his recent work with a config-cat provider runs into this exact issue. |
Having this data on the metadata feels pretty complicated to me @toddbaert @kinyoklion. The initialization function is always async in JS and in other languages often too, or it is considered sync and blocking, so could we just prohibit the Ready event from being emitted by the init method? I think this way we could still have the automatic event and prevent the double events. But I might be overseeing something. |
The problem is that the underlying means of emitting the events is typically independent of knowing when initialization is complete for the underlying SDK, and that many languages have not had good async support for as long as things like JavaScript (or at all). So you have blocking or non-blocking init (maybe by just setting a short timeout), and then you have a means to listen to events. In which case you have to use those events to orchestrate a return from the init method using threading events or other means (sometimes it can just be a task that gets resolved). The python one I was using in the original example: https://github.com/launchdarkly/openfeature-python-server/blob/243a8dc515bcd16fe8c7d5a0913fbdfe6e073aa0/ld_openfeature/provider.py#L46 It isn't just this behavior, but a number of things combine to make the SDK initialization much more complex than seems reasonable.
When the events being emitted by the underlying SDK are threaded it is a bit tricky to say what happens during init. Which is demonstrated in the approach I took in the Java version. Some of these things can be addressed by extending the functionality of the underlying vendor SDK. This will probably be what happens with the LaunchDarkly SDKs if the interface remains this way, but it does create additional barriers to adoption. |
Sure, but isn't this the typical way of handling it?
If I am not overseeing anything this can be simply handled by only listening to the first I see the point about errors, but for most languages this might be handled by the OF SDK handling any error returned/thrown by the provider Do you have an example where these do not work @kinyoklion? |
Everything can be made to work. So I have no examples where it doesn't work. My problem is with the level of complexity required. If generally this is considered an acceptable level of complexity, then there isn't much to add. (Basically I updated 3 of these in a row, and each one was very fiddly.) I am having some trouble finding things to compare to because the majority of providers seem to not support init and shutdown or not support events or not support either, or they have not been updated to the latest SDKs. (Or they are JS SDKs which is much less of an issue, though I still need to update our node provider because I have not added runtime provider status, but it is very straightforward in a single-threaded environment.) I do feel though that there are going to be a number of bugs for providers that do implement each of these functionalities. Because the most intuitive way to implement them, which is hook up an event handler and wait for initialization, isn't correct. In that it works fine, but does result in duplicate events. |
I have seen in the examples you showed that these are also having the internal provider state still.
I am having a hard time with this. JS is still concurrent and depending on what happens in the background, events might be emitted out of order at any time. If it really is a multi-threaded lang and things are happening at the same time we would have to protect it with a mutex or so, but I do not see a real difference between the concurrent and parallel case for this.
I fear that if we have those options for opting out of automatic events, there will be many errors because it is just so easy to oversee. |
In a concurrent language you can check a state and register a handler and there is no race condition as long as you do not await as a micro-task cannot be executed between those. In a multi-threaded system they can, and the state that needs locked is internal to the underlying SDK. The event dispatch itself may be many threads. Which is why you have to introduce additional synchronization. In regards to de-duplicating the event may be in a thread that has not gotten scheduled. So it actually emits after init. (Assuming the provider didn't use the event to complete the init function.) So it may be a more general de-duplicate ready when already ready. |
The correct way to use the LD Java SDK directly is this:
It has a blocking constructor that waits up to 10 seconds by default. You can make it longer if you want it to be longer. You can also independently track the current status, in case you need to know when it encounters problems. There is no need to use the status to wait for initialization. If you want non-blocking initialization you can set the time to wait to 0, and then handle things on flag changes or status changes. I think I am going to change the LaunchDarkly providers back to just using the blocking constructor. I don't think it truly matches the spirit of the setProvider/setProviderAndAwait methods, but I know it is safe and predictable and keeps provider simplicity low. |
But the point is that it is most likely that there will be an await happening so race conditions also have to be expected here.
Yes, but to me it would be fine if the SDK implementation had a critical path between receiving the first
This is a really good point. Maybe this can not really be solved elegantly. One potentially could in these cases always filter out READY events that happened before the init finished AND if none happened until the provider init finished, the first ready event of the provider regardless of the provider status. If we can not solve it in i good way, I guess having the metadata options seems okay.
Because listening to events is not reliable or what is the reasoning? |
Regardless what the correct solution is here I don't think that the current solution has the correct level of complexity and safety. So I am considering using what I know is the safest path for someone consuming the LD SDK, which is also the simplest way of using it. (Safety here mainly being, initialization occurs within some known bounded time. Secondarily has logical events.) Though I don't know that I can actually make this change upon review. I recently released 1.0 versions of our providers after adding init and shutdown. I would consider changing this a breaking behavior (In most cases the SDK is initialized by the time it hits the initialize method, but sometimes it wouldn't be, and in that case the behavior would be different). To re-iterate, I think that having an option is one solution to this. But the problem I think is multiple owners of the same data and responsibilities. To which there could be other solutions. The option allows it to be either fully controlled by the provider or fully controlled by the OF SDK (for simple providers). If it was always fully controlled by the OF SDK always, even for runtime changes, that would be fine as well. In that case the provider would be informing the SDK of changes in state and the SDK would decide which ones to emit based on the current state. Which, as it is also tracking the state, it has something to lock and compare. It also requires an ordering guarantee of the underlying SDK events, which most SDKs should have or it would make them hard to use. |
Definitely see the point.
I see the point and totally support that. |
OK I'm going to try to summarize so far.
This is a problem. We can approach this in one of two ways:
I'm open to either direction here, but I think I slightly favor keeping things as they are. @Kavindu-Dodan @thomaspoignant please weigh in. Keep in mind, one of the values of the project is to keep things simple and consistent for application authors, even if it comes at the cost of complexity for provider authors. |
Interesting discussion. I think the underlying issue is there even in flagd provider. we have some special logic to avoid double ready events. For example, in Java [1] we use custom state change and in Go [2], the provider internally omits the initial ready event.
I think this could solve the provider implementation complexity and simplify the initialization ready (or error) events.
In go, we have an event channel that needs to be exposed by the provider if it implements eventing. This has some advantages. But still, like in flagd Go provider [2], special logic is needed to filter events for startup.
If we provide an option to skip startup events, then complexity is at SDK maintainers. But I think this is fine. Eventing implementations are already somewhat complex :) [1] - https://github.com/open-feature/java-sdk-contrib/blob/main/providers/flagd/src/main/java/dev/openfeature/contrib/providers/flagd/FlagdProvider.java#L160-L163 |
As a value proposition I think that complexity in the OF SDK is better than provider complexity which is better than application author complexity. Solving the problem fewer times will generally mean fewer, or at least more easily remedied, problems for application authors. It will also mean fewer footguns for provider authors. In this case the proposal here doesn't remove a footgun, it basically lets you toggle the footgun off after you get shot with it. Which is why it is still not ideal, but I am not sure of a better solution.
The change this makes in the LaunchDarkly provider is that it would remove shared state and just allow setting a thread monitor (or completeable future) in a single event handler unconditionally and also emitting the event. It removes the decision about also emitting that event based on state. So it appreciably decreases complexity. It also provides better behavior in the more simplified case (which I will probably switch to still). The blocking constructor could timeout, we attach up event handlers, and then throw. If the SDK does eventually initialize, then it can emit a ready event. The first ready/error event doesn't have any special behavior to consider. |
For providers which support changing their provider status at runtime the automatic events introduce additional complexity.
The provider needs to defer its handling of the status until after the automatic status provided by the OpenFeature API. Otherwise it would emit duplicate provider ready or error events. The handling can be somewhat complex for multi-threaded providers and SDKs.
It would be nice if the provider had meta-data, or other fields, which could allow for opting out of the automatic behavior. This may then be extended to other automatic functionality in the future.
Related discussion: open-feature/dotnet-sdk#276 (comment)
The text was updated successfully, but these errors were encountered: