-
Notifications
You must be signed in to change notification settings - Fork 786
Fix(orca-front50): front50 timeout fallback #4851
Changes from 3 commits
a260018
943fd62
8c0af21
3ea06c5
28cad45
25c9a38
05e309c
109c125
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -19,6 +19,30 @@ | |
| import lombok.Data; | ||
| import org.springframework.boot.context.properties.ConfigurationProperties; | ||
|
|
||
| /** | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can't decide whether I think this comment is helpful enough vs. the chances of it getting stale. Since these are the higher-precedence properties, I'm pretty tempted to ditch the comment. |
||
| * Configuration properties for Front50 service. | ||
| * | ||
| * <p>These properties can be configured in your YAML configuration: | ||
| * | ||
| * <pre> | ||
| * front50: | ||
| * baseUrl: http://front50.example.com | ||
| * enabled: true | ||
| * useTriggeredByEndpoint: true | ||
| * okhttp: | ||
| * connectTimeoutMs: 10000 | ||
| * readTimeoutMs: 60000 | ||
| * writeTimeoutMs: 60000 | ||
| * </pre> | ||
| * | ||
| * <p>If not explicitly configured, a fallback chain will be used for timeouts: | ||
| * | ||
| * <ol> | ||
| * <li>Use explicit okhttp configuration if present | ||
| * <li>Fall back to global okhttp client configuration | ||
| * <li>Use default fallback values (10s connect, 60s read/write) | ||
| * </ol> | ||
| */ | ||
| @Data | ||
| @ConfigurationProperties("front50") | ||
| public class Front50ConfigurationProperties { | ||
|
|
@@ -34,14 +58,50 @@ public class Front50ConfigurationProperties { | |
| */ | ||
| boolean useTriggeredByEndpoint = true; | ||
|
|
||
| /** HTTP client configuration for connecting to Front50 service */ | ||
| OkHttpConfigurationProperties okhttp = new OkHttpConfigurationProperties(); | ||
|
|
||
| /** | ||
| * Configuration properties for the OkHttp client connecting to Front50. These will only be used | ||
| * if explicitly set in the configuration. Otherwise, global client timeouts will be used as | ||
| * fallback. | ||
| */ | ||
| @Data | ||
| public static class OkHttpConfigurationProperties { | ||
| int readTimeoutMs = 10000; | ||
| /** Read timeout in milliseconds. Default is 60 seconds (60000ms) */ | ||
| private Integer readTimeoutMs = 60000; | ||
|
|
||
| /** Write timeout in milliseconds. Default is 60 seconds (60000ms) */ | ||
| private Integer writeTimeoutMs = 60000; | ||
|
|
||
| /** Connection timeout in milliseconds. Default is 10 seconds (10000ms) */ | ||
| private Integer connectTimeoutMs = 10000; | ||
|
|
||
| /** | ||
| * Checks if read timeout is explicitly configured with a non-default value. | ||
| * | ||
| * @return true if read timeout is configured with a non-default value | ||
| */ | ||
| public boolean hasReadTimeoutConfig() { | ||
| return readTimeoutMs != null && readTimeoutMs != 60000; | ||
| } | ||
|
|
||
| int writeTimeoutMs = 10000; | ||
| /** | ||
| * Checks if write timeout is explicitly configured with a non-default value. | ||
| * | ||
| * @return true if write timeout is configured with a non-default value | ||
| */ | ||
| public boolean hasWriteTimeoutConfig() { | ||
| return writeTimeoutMs != null && writeTimeoutMs != 60000; | ||
| } | ||
|
|
||
| int connectTimeoutMs = 10000; | ||
| /** | ||
| * Checks if connect timeout is explicitly configured with a non-default value. | ||
| * | ||
| * @return true if connect timeout is configured with a non-default value | ||
| */ | ||
| public boolean hasConnectTimeoutConfig() { | ||
| return connectTimeoutMs != null && connectTimeoutMs != 10000; | ||
| } | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,143 @@ | ||
| /* | ||
| * Copyright 2024 Armory, Inc. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| package com.netflix.spinnaker.orca.front50.config | ||
|
|
||
| import com.fasterxml.jackson.databind.ObjectMapper | ||
| import com.netflix.spinnaker.config.DefaultServiceEndpoint | ||
| import com.netflix.spinnaker.config.okhttp3.OkHttpClientProvider | ||
| import okhttp3.OkHttpClient | ||
| import org.slf4j.Logger | ||
| import retrofit.Endpoint | ||
| import retrofit.RequestInterceptor | ||
| import retrofit.RestAdapter | ||
| import spock.lang.Specification | ||
| import spock.lang.Subject | ||
| import spock.lang.Unroll | ||
|
|
||
| import java.lang.reflect.Method | ||
| import java.util.concurrent.TimeUnit | ||
|
|
||
| /** | ||
| * Tests the timeout fallback chain logic in the Front50Configuration class. | ||
| */ | ||
| class Front50ConfigurationSpec extends Specification { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would you mind converting this to java please. The less groovy code we have in the world, the better.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, let me know if it looks better now :)
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Still looks like groovy to me... |
||
|
|
||
| OkHttpClientProvider clientProvider = Mock() | ||
| RequestInterceptor requestInterceptor = Mock() | ||
| Endpoint endpoint = Mock() | ||
|
|
||
| @Subject | ||
| Front50Configuration front50Configuration | ||
|
|
||
| // Access private getEffectiveTimeout method via reflection | ||
| private Method getEffectiveTimeoutMethod | ||
|
|
||
| // Default timeout values from Front50Configuration | ||
| private static final long DEFAULT_READ_TIMEOUT_MS = 60000 // 60 seconds | ||
| private static final long DEFAULT_WRITE_TIMEOUT_MS = 60000 // 60 seconds | ||
| private static final long DEFAULT_CONNECT_TIMEOUT_MS = 10000 // 10 seconds | ||
|
|
||
| def setup() { | ||
| front50Configuration = new Front50Configuration() | ||
| front50Configuration.clientProvider = clientProvider | ||
| front50Configuration.spinnakerRequestInterceptor = requestInterceptor | ||
|
|
||
| // Use reflection to access the private getEffectiveTimeout method | ||
| getEffectiveTimeoutMethod = Front50Configuration.class.getDeclaredMethod( | ||
| "getEffectiveTimeout", | ||
| Integer.class, | ||
| long.class, | ||
| long.class, | ||
| String.class) | ||
| getEffectiveTimeoutMethod.setAccessible(true) | ||
| } | ||
|
|
||
| @Unroll | ||
| def "getEffectiveTimeout should use #description when explicitTimeout=#explicitTimeout, globalTimeout=#globalTimeout, timeoutType=#timeoutType"() { | ||
| given: | ||
| when: | ||
| long result = getEffectiveTimeoutMethod.invoke( | ||
| front50Configuration, | ||
| explicitTimeout, | ||
| globalTimeout, | ||
| defaultTimeout, | ||
| timeoutType) | ||
|
|
||
| then: | ||
| result == expectedTimeout | ||
|
|
||
| where: | ||
| description | explicitTimeout | globalTimeout | defaultTimeout | timeoutType || expectedTimeout | ||
| "custom explicit timeout" | 30000 | 20000 | 60000 | "read" || 30000 | ||
| "global timeout - read" | 60000 | 20000 | 60000 | "read" || 20000 // Default read timeout is ignored | ||
| "global timeout - write" | 60000 | 20000 | 60000 | "write" || 20000 // Default write timeout is ignored | ||
| "global timeout - connect"| 10000 | 20000 | 60000 | "connect" || 20000 // Default connect timeout is ignored | ||
| "default timeout" | null | 0 | 60000 | "read" || 60000 | ||
| "zero explicit timeout" | 0 | 20000 | 60000 | "read" || 0 // Zero is treated as explicitly configured | ||
| } | ||
|
|
||
| @Unroll | ||
| def "configureTimeouts should handle different combinations of configurations"() { | ||
| given: | ||
| Front50ConfigurationProperties props = new Front50ConfigurationProperties() | ||
| if (explicitTimeoutsSet) { | ||
| Front50ConfigurationProperties.OkHttpConfigurationProperties okHttpProps = | ||
| new Front50ConfigurationProperties.OkHttpConfigurationProperties() | ||
| okHttpProps.setReadTimeoutMs(readTimeoutMs) | ||
| okHttpProps.setWriteTimeoutMs(writeTimeoutMs) | ||
| okHttpProps.setConnectTimeoutMs(connectTimeoutMs) | ||
| props.setOkhttp(okHttpProps) | ||
| } | ||
|
|
||
| OkHttpClient baseClient = new OkHttpClient.Builder() | ||
| .readTimeout(globalReadTimeoutMs, TimeUnit.MILLISECONDS) | ||
| .writeTimeout(globalWriteTimeoutMs, TimeUnit.MILLISECONDS) | ||
| .connectTimeout(globalConnectTimeoutMs, TimeUnit.MILLISECONDS) | ||
| .build() | ||
|
|
||
| OkHttpClient configuredClient = new OkHttpClient.Builder().build() | ||
|
|
||
| clientProvider.getClient(_ as DefaultServiceEndpoint) >> baseClient | ||
| endpoint.getUrl() >> "http://front50.example.com" | ||
|
|
||
| // Use reflection to access the private configureTimeouts method | ||
| Method configureTimeoutsMethod = Front50Configuration.class.getDeclaredMethod( | ||
| "configureTimeouts", | ||
| OkHttpClient.class, | ||
| Front50ConfigurationProperties.class) | ||
| configureTimeoutsMethod.setAccessible(true) | ||
|
|
||
| when: | ||
| configuredClient = configureTimeoutsMethod.invoke( | ||
| front50Configuration, | ||
| baseClient, | ||
| props) as OkHttpClient | ||
|
|
||
| then: | ||
| configuredClient.readTimeoutMillis() == expectedReadTimeoutMs | ||
| configuredClient.writeTimeoutMillis() == expectedWriteTimeoutMs | ||
| configuredClient.connectTimeoutMillis() == expectedConnectTimeoutMs | ||
|
|
||
| where: | ||
| explicitTimeoutsSet | readTimeoutMs | writeTimeoutMs | connectTimeoutMs | globalReadTimeoutMs | globalWriteTimeoutMs | globalConnectTimeoutMs || expectedReadTimeoutMs | expectedWriteTimeoutMs | expectedConnectTimeoutMs | ||
| true | 30000 | 35000 | 5000 | 20000 | 25000 | 15000 || 30000 | 35000 | 5000 | ||
| true | null | 35000 | null | 20000 | 25000 | 15000 || 20000 | 35000 | 15000 | ||
| true | null | null | null | 20000 | 25000 | 15000 || 20000 | 25000 | 15000 | ||
| false | null | null | null | 20000 | 25000 | 15000 || 20000 | 25000 | 15000 | ||
| false | null | null | null | 0 | 0 | 0 || DEFAULT_READ_TIMEOUT_MS | DEFAULT_WRITE_TIMEOUT_MS | DEFAULT_CONNECT_TIMEOUT_MS | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of hardcoded values here, using the constants defined (i.e. DEFAULT_READ_TIMEOUT_MS etc) would be more meaningful I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I address this issue :)
Let me know if anything else should be addressed @kirangodishala
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't understand enough about the issue. I need to see that release note. What was the behavior before? What is the new behavior? Would we be better off reverting something? This PR is filled with so many hard-coded magic numbers and special logic that I'd really love to find a different way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR #4773 introduced batch pipeline updates which is valuable functionality. Our changes preserve this functionality while ensuring reliable timeout handling.
What our implementation does:
The approach we've taken uses the existing configuration capabilities while making the behavior more consistent and predictable. The special logic is necessary to handle different configuration scenarios while ensuring optimal performance.
With our implementation, users have flexibility to:
The tests we've added verify all these scenarios work correctly. We believe this implementation strikes the right balance between functionality, reliability, and backward compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any issue with not going forward with this PR and let's make sure next if any breaking change let's at least add some tests! Thank you!
Working on the release notes in the next hour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shlomodaari Not sure why spinnakerbot closed spinnaker/spinnaker.io#504. I re-opened it. Can you take a look at spinnaker/spinnaker.io#504 (comment)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I will take a look at this. Do we have an ETA for reviewing this PR as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks for updating spinnaker/spinnaker.io#504. I think I understand it now. One benefit of that PR is that it lets people configure different timeouts when orca communicates with front50 than the "generic" ok-http-client timeouts, but I agree it was a breaking change, and it'd be great to make it less breaking, or not breaking at all.
We could help one class of users (the ones that didn't specify ok-http-client timeouts / used the defaults there), but making the defaults in Front50ConfigurationProperties.OkHttpConfigurationProperties match the defaults in OkHttpClientConfigurationProperties.
If we did that, users that specified something for ok-http-client still see a change in behavior. But isn't it relatively straightforward to fix that? By adding an OkHttpClientConfigurationProperties argument to Front50Configuration.front50Service and doing something like f12cd1f, I think we'd get there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your feedback on spinnaker/spinnaker.io#504!
In PR #4851, we've implemented a solution that handles all three user scenarios with clear priority order:
If a user has default OkHttp settings (no custom configuration):
If a user has custom global OkHttp settings:
If a user has Front50-specific settings configured:
The priority order is clear:
This approach ensures backward compatibility while preserving the ability to have service-specific configuration when needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is too complicated/brittle. I'd like to see a PR that has no additional "magic numbers" in it. It still seems to be that we could pull this off with two commits: