## Welcome
#This document guides you through the various interfaces exposed by a typical Steadybit extension.
#
#We recommend that you use httpYac to view the document.
#The easiest way to do this (with a running instance of this extension) is to use Gitpod:
#
#http://gitpod.io/#https://github.com/steadybit/extension-scaffold/blob/main/readme.http
#
## How Extensions work
#Extensions implement a well-defined HTTP interface that the agent uses to control the
#extension. Extensions are deployed alongside the agent on your infrastructure. Steadybit
#doesn't care how you implement or deploy the extension.
#The extensions we're providing are implemented using Go and packaged as container images.
#
## Landing
#At its root path, an extension reports which capabilities it supports. It does so
#through HTTP endpoint definitions that the agent can inspect to learn more.
#
#Try it! Click the little play button next to the following HTTP call.
###
GET http://localhost:8080
###
#You see the capabilities of the robot extension, which consist of:
#
#- a discovery for a new target type (using DiscoveryKit),
#- a log action (using ActionKit),
#- an event listener (using EventKit),
#- a piece of advice (using AdviceKit) and
#- a preflight action (using PreflightKit).
#
#We cover each of them in turn below.
#
#----------------------------------------------------------------------------------
#
#
## Discovery
#Discovery is where Steadybit looks at all your systems and identifies the targets
#that may be used in an action. The Steadybit DiscoveryKit enables the extension of
#Steadybit with new discovery capabilities. For example, DiscoveryKit can be used
#to author open/closed source discoveries for:
#
# - proprietary technology,
# - non-natively supported open-source tech,
# - hardware components and
# - every other "thing" you would want to see and attack with Steadybit.
#
#Our scaffolding extension implements discovery logic for robots.
#Before discovering the actual robots, we define how robot targets are described in
#Steadybit by defining the
#
#- target types and
#- target attributes.
#
### Target Description: Types
#The following HTTP call exposes the description of a robot target type. The target description
#specifies how the platform should display targets in the user interface. All
#actions are associated with a single target type. Among other things, this helps
#to narrow down the targets for an action.
###
GET http://localhost:8080/com.steadybit.extension_scaffold.robot/discovery/target-description
###
#You can see that the robot targets are described using two attributes (`steadybit.label`, `robot.reportedBy`).
#These can be detailed using the target attributes interface.
#
### Target Description: Attributes
#Finally, you can provide information about additional supported attributes. More
#specifically, this informs the platform about their human-readable labels.
###
GET http://localhost:8080/com.steadybit.extension_scaffold.robot/discovery/attribute-descriptions
###
### Discovery Description
#Once the target description is known, Steadybit needs to know how to discover the actual targets.
#Discovery descriptions expose information about the endpoint, the call interval
#and an optional restriction on where to run the discovery.
#
#For more information, see the [DiscoveryKit docs](https://github.com/steadybit/discovery-kit/blob/main/docs/discovery-api.md#discovery-description).
###
GET http://localhost:8080/com.steadybit.extension_scaffold.robot/discovery
###
#Robots will be discovered once every minute by calling the `discover/path`-HTTP endpoint.
#Discoveries are always scheduled by the agent and in our case only by the leader-agent.
#
### Do the magic: Discover Robot Targets
#Finally, let's discover all robots!
#By calling the endpoint below, Steadybit receives a list of all discovered robots, which will be
#accessible within the Steadybit platform.
###
GET http://localhost:8080/com.steadybit.extension_scaffold.robot/discovery/discovered-targets
###
#----------------------------------------------------------------------------------
#
#
## Actions
#Attacks, checks, running a load test - all these are actions. So basically,
#every step in an experiment is an action from an implementation perspective.
#Attacks act upon targets from the discovery (needed for RBAC), while other
#actions may or may not do this.
#
#Extensions can contribute custom actions by implementing the ActionKit interface.
#
#An action describes itself and is divided into prepare, start, status and stop steps
#that you need to implement. If you need to pass around some state between those, the
#agent manages that state for the extension. A defined lifecycle is crucial for
#rolling back attacks and cleaning up any allocated resources. We don’t want to
#run arbitrary shell scripts and leave a messy system behind.
#
### Action Description
#The following HTTP call exposes the action description. The action description
#is used to provide metadata about the action, e.g., for presentation within the
#user interface and for lifecycle management.
###
GET http://localhost:8080/com.steadybit.extension_scaffold.robot.log
###
#Our robot's log action is assigned to the category `other` and implements a target selection template
#that helps users define target queries when using the action.
#Furthermore, it references each of the action lifecycle methods described below.
#
### Action Lifecycle
#Action executions flow through a standardized lifecycle. This standard process enables
#Steadybit to handle several critical aspects for you, e.g., rollback triggering and
#recovery in case of extension crashes/preemption. This document only provides a rough
#overview of the supported lifecycle handlers. For more details, please refer to the
#[ActionKit](https://github.com/steadybit/action-kit/blob/main/docs/action-api.md) documentation.
#
#
#### Prepare
#The preparation (or prepare, for short) step receives the action's configuration options
#(representing the parameters defined in the action description) and a selected target.
#The HTTP endpoint must respond with an HTTP status code 200 and a JSON response body
#containing a state object.
#
#The state object is later used in HTTP requests to the start and stop endpoints. So you
#will want to include all the execution-relevant information within the state object, e.g.,
#a subset of the target's attributes, the configuration options and the original state
#(in case you are going to do some system modification as part of the start step).
###
POST http://localhost:8080/com.steadybit.extension_scaffold.robot.log/prepare
Content-Type: application/json

{
  "target": {
    "name": "R2-D2",
    "attributes": {}
  },
  "config": {
    "message": "Hello from %s!"
  }
}
###
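#The scaffold itself is written in Go; the following Python sketch only illustrates what a
#prepare handler for the log action might compute. The function name `prepare` and the `%s`
#substitution are assumptions inferred from the sample request above and from the
#`FormattedMessage` state used in the later steps, not the scaffold's actual code.
```python
from typing import Any


def prepare(body: dict[str, Any]) -> dict[str, Any]:
    """Build the state object for the log action from a prepare request body."""
    target_name = body["target"]["name"]
    template = body["config"]["message"]
    # Resolve the "%s" placeholder once, so that later steps only need the state.
    formatted = template.replace("%s", target_name)
    return {"state": {"FormattedMessage": formatted}}


request_body = {
    "target": {"name": "R2-D2", "attributes": {}},
    "config": {"message": "Hello from %s!"},
}
print(prepare(request_body))
```
#Everything the start, status and stop steps need ends up in the returned state, so the
#handler itself can stay stateless.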
#### Start
#The actual action happens within the start step, i.e., this is where you will typically
#modify the system, kill processes or reboot servers.
#
#The start step receives the prepare step's state object. The HTTP endpoint must respond
#with an HTTP status code 200 on success. A JSON response body containing a state object
#may be returned. This state object is later passed to the stop step.
#
#This endpoint must respond within a few seconds. It is not permitted to block until the
#action execution is completed within the start endpoint. For example, you can trigger a
#deployment change within the start endpoint, but the start endpoint may not block until
#the deployment change is fully rolled out (this is what the status endpoint is for).
###
POST http://localhost:8080/com.steadybit.extension_scaffold.robot.log/start
Content-Type: application/json

{
  "state": {
    "FormattedMessage": "Hello from R2-D2!"
  }
}
###
#### Status
#The status step exists to observe the status of the action execution. For example, when
#triggering a deployment change you would use the status endpoint to inspect whether the
#deployment change was processed.
#
#The status step receives the prepare, start or previous status step's state object. The
#HTTP endpoint must respond with an HTTP status code 200 on success.
#
#This endpoint must respond within a few seconds. It is not permitted to block until the
#action execution is completed within the status endpoint. For example, you can inspect
#a deployment change's state within the status endpoint, but the status endpoint may not
#block until the deployment change is fully rolled out. The status endpoint is
#called repeatedly until it responds with `completed=true`.
###
POST http://localhost:8080/com.steadybit.extension_scaffold.robot.log/status
Content-Type: application/json

{
  "state": {
    "FormattedMessage": "Hello from R2-D2!"
  }
}
###
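#The polling behaviour described above can be sketched from the caller's side as follows.
#This is a minimal Python illustration, not the agent's implementation: the function name
#`poll_until_completed`, the interval and the call limit are hypothetical, and the response
#is assumed to carry a `completed` flag plus an optional `state` object. The HTTP call is
#replaced by a plain callable so that the sketch runs without a live extension.
```python
import time
from typing import Any, Callable

StatusCall = Callable[[dict[str, Any]], dict[str, Any]]


def poll_until_completed(call_status: StatusCall, state: dict[str, Any],
                         interval_seconds: float = 1.0, max_calls: int = 60) -> dict[str, Any]:
    """Call the status endpoint repeatedly until it reports completed=true."""
    for _ in range(max_calls):
        result = call_status(state)
        # Hand a returned state object to the next status call, if there is one.
        state = result.get("state", state)
        if result.get("completed"):
            return state
        time.sleep(interval_seconds)
    raise TimeoutError(f"status did not complete within {max_calls} calls")


# Stand-in for the HTTP call: reports completion on the third invocation.
calls = []


def fake_status(state):
    calls.append(state)
    return {"completed": len(calls) >= 3, "state": {**state, "polls": len(calls)}}


final_state = poll_until_completed(
    fake_status, {"FormattedMessage": "Hello from R2-D2!"}, interval_seconds=0)
print(final_state)
```
#Each status call returns quickly; the waiting happens between calls, never inside the endpoint.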
#### Stop
#The stop step exists to revert system modifications, stop CPU/memory stress or end
#any other ongoing activity.
#
#The stop step receives the prepare, status or start step's state object. The
#HTTP endpoint must respond with an HTTP status code 200 on success.
###
POST http://localhost:8080/com.steadybit.extension_scaffold.robot.log/stop
Content-Type: application/json

{
  "state": {
    "FormattedMessage": "Hello from R2-D2!"
  }
}
###
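#Taken together, the four steps form one call sequence in which the state object is passed
#from step to step. The Python sketch below shows one way a caller might drive that
#sequence. It is an illustration under assumptions, not the agent's code: `run_action` and
#the injected `post` callable are hypothetical names, pauses between status calls are
#omitted, and the stop call is placed in a `finally` block to mirror the rollback guarantee.
```python
from typing import Any, Callable

Post = Callable[[str, dict[str, Any]], dict[str, Any]]


def run_action(post: Post, base: str, target: dict[str, Any], config: dict[str, Any],
               max_status_calls: int = 60) -> None:
    """Drive prepare -> start -> status -> stop, threading the state object through."""
    state = post(f"{base}/prepare", {"target": target, "config": config})["state"]
    try:
        # start and status may return an updated state; otherwise keep the current one.
        state = post(f"{base}/start", {"state": state}).get("state", state)
        for _ in range(max_status_calls):
            result = post(f"{base}/status", {"state": state})
            state = result.get("state", state)
            if result.get("completed"):
                break
    finally:
        # Stop always runs after a successful prepare, so the extension can clean up.
        post(f"{base}/stop", {"state": state})


# Stand-in for an HTTP client: records which step was called.
log = []


def fake_post(url, body):
    log.append(url.rsplit("/", 1)[-1])
    if url.endswith("/prepare"):
        return {"state": {"FormattedMessage": "Hello from R2-D2!"}}
    if url.endswith("/status"):
        return {"completed": True}
    return {}


run_action(fake_post, "http://localhost:8080/com.steadybit.extension_scaffold.robot.log",
           {"name": "R2-D2", "attributes": {}}, {"message": "Hello from %s!"})
print(log)
```
#Injecting `post` keeps the sequencing logic separate from the HTTP client, so the order of
#calls can be checked without a running extension.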
#----------------------------------------------------------------------------------
#
#
## Events
#Each time a Steadybit event occurs that matches the `listenTo` and `restrictTo`
#configuration, Steadybit will send a request to the endpoint. The request
#will contain the event data.
#
#Refer to the [EventKit documentation](https://github.com/steadybit/event-kit/blob/main/docs/event-api.md) to learn more.
###
POST http://localhost:8080/events/all
Content-Type: application/json

{
  "id": "da059724-a8ae-4b4b-b4f0-ee01898232d2",
  "eventName": "experiment.execution.created",
  "eventTime": "2021-09-01T12:00:00Z",
  "tenant": {
    "key": "exmpl",
    "name": "Example Inc."
  },
  "principal": {
    "principalType": "user",
    "username": "tom.mason",
    "name": "Tom Mason",
    "email": "[email protected]"
  },
  "environment": {
    "id": "STG",
    "name": "Staging"
  },
  "team": {
    "key": "ADM",
    "name": "Administrators"
  },
  "experimentExecution": {
    "experimentKey": "ADM-4",
    "executionId": 34,
    "name": "Rollout restart does not impact service availability",
    "state": "COMPLETED",
    "preparedTime": "2022-11-08T16:42:32.303762Z",
    "startedTime": "2022-11-08T16:42:32.329718Z",
    "endedTime": "2022-11-08T16:42:42.636157Z"
  }
}
###
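#To give an idea of what an event listener might do with such a request, the Python sketch
#below turns an event payload into a single log line. The function name `summarize_event`
#is hypothetical, and the sketch assumes the field names of the sample request above. It
#does not reflect how the scaffold's listener is implemented.
```python
from typing import Any


def summarize_event(event: dict[str, Any]) -> str:
    """Render one log line from an event payload like the sample request above."""
    user = event.get("principal", {}).get("username", "unknown")
    line = f'{event["eventTime"]} {event["eventName"]} by {user}'
    execution = event.get("experimentExecution")
    if execution:
        # Assumption: not every event type carries an experimentExecution block.
        line += f' ({execution["experimentKey"]} #{execution["executionId"]}: {execution["state"]})'
    return line


sample = {
    "eventName": "experiment.execution.created",
    "eventTime": "2021-09-01T12:00:00Z",
    "principal": {"username": "tom.mason"},
    "experimentExecution": {"experimentKey": "ADM-4", "executionId": 34, "state": "COMPLETED"},
}
print(summarize_event(sample))
```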
#----------------------------------------------------------------------------------
#
#
#
## Advice
#Advice allows you to check for common reliability gaps across your infrastructure and
#suggest experiments to your users. Thanks to AdviceKit, you can also author your own
#advice to cover your organization's specific reliability rules.
#
#Our robot extension implements one piece of advice:
#
###
GET http://localhost:8080/advice/robot-maintenance
###
#
#Through the mandatory `assessmentQueryApplicable`, our robot advice defines that all
#targets of type robot need to be checked.
#Advice can also support different lifecycles (see
#[Advice Lifecycle](https://docs.steadybit.com/use-steadybit/explorer/advice#advice-lifecycle)).
#Our robot advice requires action for each target where discovery decides maintenance
#is needed. Furthermore, the advice defines an experiment-based validation for all robots
#that passed the 'action needed' state. If the advice assesses that a specific target
#neither needs action nor requires validation, it is automatically marked as
#implemented.
#
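#The decision rule described above can be written down as a small function. This Python
#sketch is only an illustration: `advice_state` and its two boolean inputs are hypothetical,
#and the string "validation needed" is a label chosen for this sketch. In the real
#extension, the states result from assessment queries rather than from code like this.
```python
def advice_state(action_needed: bool, validation_needed: bool) -> str:
    """Pick the lifecycle state of one target, checking 'action needed' first."""
    if action_needed:
        return "action needed"
    if validation_needed:
        return "validation needed"
    # Neither condition holds: the advice counts as implemented for this target.
    return "implemented"


# A robot that needs maintenance stays in 'action needed', whatever else is true.
print(advice_state(action_needed=True, validation_needed=True))
print(advice_state(action_needed=False, validation_needed=True))
print(advice_state(action_needed=False, validation_needed=False))
```
#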
#Refer to the [AdviceKit documentation](https://github.com/steadybit/advice-kit/blob/main/docs/advice-api.md) to learn more.
#----------------------------------------------------------------------------------
#
#
## Preflights
#Preflights are checks that run before experiment execution to validate whether an experiment
#should be allowed to run based on predefined criteria. Preflights can prevent experiment
#execution if certain conditions are not met, such as being within a maintenance window.
#
#Extensions can contribute custom preflight checks by implementing the PreflightKit interface.
#
#Preflight checks expose endpoints to list available checks, describe them, and execute them.
#The execution follows a lifecycle of start, status, and cancel phases that ensure proper
#validation of conditions before allowing experiment execution.
#
### Preflight List
#The preflight list returns all supported preflight checks provided by the extension.
#This is the entry point for discovering what preflight checks the extension offers.
###
GET http://localhost:8080/
###
#Our extension would return a list of supported preflight checks, for example a
#maintenance window check. The response would include paths to get more details about each check.
#
### Preflight Description
#The preflight description provides metadata about a specific preflight check, including
#what it's called, what it does, and what endpoints to call during execution.
###
GET http://localhost:8080/com.example.preflights.maintenance-window
###
#The description includes properties like id, label, description, and version, as well as
#references to the start, status, and cancel endpoints that implement the preflight check's lifecycle.
#
### Preflight Execution
#Preflight execution follows a three-phase lifecycle: start, status, and cancel.
#
#### Start
#The start phase initiates the preflight check process with information about the experiment
#that is about to be executed.
###
POST http://localhost:8080/com.example.preflights.maintenance-window/start
Content-Type: application/json

{
  "preflightActionExecutionId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "experimentExecution": {
    "id": "4ba85f64-5717-4562-b3fc-2c963f66afa7",
    "name": "Check Robot API Resilience",
    "description": "This experiment tests the resilience of our robot API"
  }
}
###
#### Status
#The status phase checks if the preflight check has completed and whether it was successful.
#For long-running checks, this endpoint will be called repeatedly at the interval specified
#in the preflight description.
###
POST http://localhost:8080/com.example.preflights.maintenance-window/status
Content-Type: application/json

{
  "preflightActionExecutionId": "3fa85f64-5717-4562-b3fc-2c963f66afa6"
}
###
#The status response indicates whether the check is completed, and if there's an error that
#should prevent the experiment from running.
#
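#As an illustration of how a maintenance-window check might produce such a status response,
#consider the Python sketch below. It rests on several assumptions: the experiment is
#allowed only inside the window, the response carries a `completed` flag and an optional
#`error` object, and the `title` field inside `error` is hypothetical. Consult the
#PreflightKit documentation for the actual response schema.
```python
from datetime import datetime, timezone
from typing import Any


def maintenance_window_status(now: datetime, window_start: datetime,
                              window_end: datetime) -> dict[str, Any]:
    """Build a status response for a maintenance-window preflight check."""
    if window_start <= now < window_end:
        # Inside the window: the check completes without an error.
        return {"completed": True}
    # Outside the window: completed, but with an error that blocks the experiment.
    return {
        "completed": True,
        "error": {"title": "Experiment is outside the maintenance window"},
    }


window_start = datetime(2022, 11, 8, 16, 0, tzinfo=timezone.utc)
window_end = datetime(2022, 11, 8, 18, 0, tzinfo=timezone.utc)
inside = maintenance_window_status(
    datetime(2022, 11, 8, 16, 42, tzinfo=timezone.utc), window_start, window_end)
outside = maintenance_window_status(
    datetime(2022, 11, 8, 19, 0, tzinfo=timezone.utc), window_start, window_end)
print(inside, outside)
```
#A check this simple completes on the first status call; a long-running check would return
#`completed` as false until its result is known.
#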
#### Cancel
#The cancel phase allows cleanup of any resources associated with a preflight check,
#particularly important for long-running checks.
###
POST http://localhost:8080/com.example.preflights.maintenance-window/cancel
Content-Type: application/json

{
  "preflightActionExecutionId": "3fa85f64-5717-4562-b3fc-2c963f66afa6"
}
###
#Refer to the [PreflightKit documentation](https://github.com/steadybit/preflight-kit/blob/main/docs/preflight-api.md) to learn more.