Skip to content

Commit 97fe24c

Browse files
mayakostbgagent
andauthored
fix(jira): reactive token refresh + retry on 401 feedback-comment POST (#370) (#375)
* fix(jira): reactive token refresh + retry on 401 feedback-comment POST (#370) The Jira webhook processor's feedback-comment POST could 401 even though the gateway base (api.atlassian.com/ex/jira/<cloudId>) and proactive, expiry-based token refresh were already in place. Root cause: refresh was *proactive only*. A token can be rejected with 401/403 before its stored `expires_at` — server-side revocation, a re-issued token after a scope change, or a value still inside the 60s in-memory cache after an out-of-band rotation. In those cases `postComment` logged "Jira feedback REST non-2xx", status: 401 and gave up, so every upstream failure (REPO_NOT_ONBOARDED, project not onboarded, user not linked) became silent to the operator — the feedback comment is the only operator-visible signal. Fix: - `resolveJiraOauthToken` gains a `forceRefresh` option that bypasses both the in-memory token cache and the proactive-expiry short-circuit, minting a guaranteed-fresh token regardless of `expires_at`. - `postComment` now reports an outcome ('ok' | 'auth' | 'error') so the caller can distinguish a recoverable auth rejection from a terminal one. - `postIssueComment` reacts to a 401/403 with exactly one forced refresh + one retry. The retry is bounded (a second 401 means the credential is genuinely unusable), and is skipped when the refresh returns an unchanged token (refresh-token revoked → retry would only 401 again). Best-effort/never-throws semantics are preserved. Adds coverage for the forceRefresh resolver path and the 401→refresh→retry feedback path (success, 403, unchanged-token, refresh-fails, second-401). * test(jira): cover resolveTenantToken catch path (codecov #375) Codecov flagged 5 uncovered lines on PR #375 — the catch block in resolveTenantToken (jira-feedback.ts), which the existing tests never exercised (they only made the resolver resolve to a value or null, never throw). Add two tests that make resolveJiraOauthToken throw: once on the initial resolve and once on the forced-refresh retry after a 401, covering both the force_refresh:false and force_refresh:true log branches. Brings jira-feedback.ts to 100% line coverage. --------- Co-authored-by: bgagent <bgagent@noreply.github.com>
1 parent 0e2806a commit 97fe24c

4 files changed

Lines changed: 334 additions & 14 deletions

File tree

cdk/src/handlers/shared/jira-feedback.ts

Lines changed: 59 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -66,12 +66,19 @@ function toAdfDocument(message: string): Record<string, unknown> {
6666
};
6767
}
6868

69+
/**
70+
* Outcome of a single comment POST. We distinguish auth rejection (401/403)
71+
* from other failures so the caller can react to the former with a forced
72+
* token refresh + retry, and treat the latter as terminal.
73+
*/
74+
type PostOutcome = 'ok' | 'auth' | 'error';
75+
6976
async function postComment(
7077
accessToken: string,
7178
cloudId: string,
7279
issueIdOrKey: string,
7380
message: string,
74-
): Promise<boolean> {
81+
): Promise<PostOutcome> {
7582
// The 3LO token (audience=api.atlassian.com) is only valid against the
7683
// gateway base scoped by cloudId — see JIRA_API_BASE. Posting to the raw
7784
// site host (`*.atlassian.net`) would 401. Both path segments are
@@ -92,17 +99,20 @@ async function postComment(
9299
body: JSON.stringify({ body: toAdfDocument(message) }),
93100
signal: controller.signal,
94101
});
95-
if (!resp.ok) {
96-
logger.warn('Jira feedback REST non-2xx', { status: resp.status, url });
97-
return false;
98-
}
99-
return true;
102+
if (resp.ok) return 'ok';
103+
// 401/403 are recoverable: the stored access token may be dead despite a
104+
// not-yet-reached `expires_at` (server-side revocation, scope re-issue, or
105+
// a value cached past its out-of-band rotation). Signal the caller to
106+
// force-refresh and retry once. Any other non-2xx is terminal.
107+
const outcome: PostOutcome = resp.status === 401 || resp.status === 403 ? 'auth' : 'error';
108+
logger.warn('Jira feedback REST non-2xx', { status: resp.status, url, outcome });
109+
return outcome;
100110
} catch (err) {
101111
logger.warn('Jira feedback request failed', {
102112
error: err instanceof Error ? err.message : String(err),
103113
url,
104114
});
105-
return false;
115+
return 'error';
106116
} finally {
107117
clearTimeout(timer);
108118
}
@@ -123,14 +133,20 @@ export interface JiraFeedbackContext {
123133

124134
async function resolveTenantToken(
125135
ctx: JiraFeedbackContext,
136+
forceRefresh = false,
126137
): Promise<{ accessToken: string } | null> {
127138
try {
128-
const resolved = await resolveJiraOauthToken(ctx.cloudId, ctx.registryTableName);
139+
const resolved = await resolveJiraOauthToken(ctx.cloudId, ctx.registryTableName, { forceRefresh });
129140
if (!resolved) return null;
130141
return { accessToken: resolved.accessToken };
131142
} catch (err) {
143+
// `force_refresh` discriminates the initial resolve from the post-401
144+
// retry resolve: a failure here on the retry is an infra error (DDB/SM),
145+
// distinct from "refresh-token revoked", and triage needs to tell them
146+
// apart.
132147
logger.warn('Jira feedback could not resolve OAuth token', {
133148
jira_cloud_id: ctx.cloudId,
149+
force_refresh: forceRefresh,
134150
error: err instanceof Error ? err.message : String(err),
135151
});
136152
return null;
@@ -141,6 +157,17 @@ async function resolveTenantToken(
141157
* Post a comment onto a Jira issue. Returns true on success, false on any
142158
* failure (network, auth, REST errors). Never throws — callers proceed
143159
* regardless.
160+
*
161+
* Auth resilience: the access token from the resolver can already be stale
162+
* (cached within its TTL, or revoked/re-issued server-side before its
163+
* advertised `expires_at`). The proactive expiry check can't catch those, so
164+
* a 401/403 on the first POST triggers exactly one forced token refresh
165+
* (`forceRefresh: true`) and one retry. This is the path that makes feedback
166+
* comments — the only operator-visible failure signal — actually land after a
167+
* token goes bad, rather than silently 401ing (issue #370). The retry is
168+
* bounded at one attempt: a second 401 means the credential is genuinely
169+
* unusable (refresh-token revoked, scope removed), so we stop and let the
170+
* caller no-op.
144171
*/
145172
export async function postIssueComment(
146173
ctx: JiraFeedbackContext,
@@ -149,7 +176,30 @@ export async function postIssueComment(
149176
): Promise<boolean> {
150177
const resolved = await resolveTenantToken(ctx);
151178
if (!resolved) return false;
152-
return postComment(resolved.accessToken, ctx.cloudId, issueIdOrKey, body);
179+
180+
const outcome = await postComment(resolved.accessToken, ctx.cloudId, issueIdOrKey, body);
181+
if (outcome === 'ok') return true;
182+
if (outcome === 'error') return false;
183+
184+
// outcome === 'auth': the stored access token was rejected. Force a refresh
185+
// (bypassing the resolver's cache and proactive-expiry short-circuit) and
186+
// retry once with the freshly-minted token.
187+
logger.info('Jira feedback got auth rejection — forcing token refresh and retrying once', {
188+
jira_cloud_id: ctx.cloudId,
189+
issue_id_or_key: issueIdOrKey,
190+
});
191+
const refreshed = await resolveTenantToken(ctx, true);
192+
if (!refreshed) return false;
193+
// If the refresh handed back the same access token, the retry can only
194+
// reproduce the 401 — skip the redundant network call.
195+
if (refreshed.accessToken === resolved.accessToken) {
196+
logger.warn('Jira feedback refresh returned an unchanged token — not retrying', {
197+
jira_cloud_id: ctx.cloudId,
198+
issue_id_or_key: issueIdOrKey,
199+
});
200+
return false;
201+
}
202+
return (await postComment(refreshed.accessToken, ctx.cloudId, issueIdOrKey, body)) === 'ok';
153203
}
154204

155205
/**

cdk/src/handlers/shared/jira-oauth-resolver.ts

Lines changed: 30 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -97,6 +97,21 @@ export interface ResolverOptions {
9797
readonly dynamoDbClient?: DynamoDBDocumentClient;
9898
/** Override fetch for token-endpoint refresh in tests. */
9999
readonly fetchImpl?: typeof fetch;
100+
/**
101+
* Force a token refresh even when the stored `expires_at` claims the token
102+
* is still valid. Used by the reactive-401 retry path: Atlassian can reject
103+
* an access token (401) before its advertised expiry — server-side
104+
* invalidation, a re-issued token after a scope change, or a token cached
105+
* within {@link SECRET_CACHE_TTL_MS} that was rotated out-of-band. The
106+
* proactive expiry check can't see those, so on a 401 the caller re-resolves
107+
* with `forceRefresh: true` to bypass the in-memory *token* cache and the
108+
* expiry short-circuit and mint a guaranteed-fresh token before retrying.
109+
*
110+
* The registry-row cache is intentionally NOT bypassed: a revoked access
111+
* token does not change the row (`site_url`, `oauth_secret_arn`), so forcing
112+
* a DDB read on every 401 would add load without changing the outcome.
113+
*/
114+
readonly forceRefresh?: boolean;
100115
}
101116

102117
interface CacheEntry<T> {
@@ -137,6 +152,10 @@ export interface ResolvedJiraToken {
137152
* Refreshes silently if the cached token is expiring. Returns null on any
138153
* failure (registry miss, secret missing, refresh-token revoked) so callers
139154
* can gracefully no-op rather than blowing up.
155+
*
156+
* Pass `options.forceRefresh` to mint a guaranteed-fresh token even when the
157+
* stored `expires_at` looks valid — see {@link ResolverOptions.forceRefresh}
158+
* (the reactive-401 retry path).
140159
*/
141160
export async function resolveJiraOauthToken(
142161
cloudId: string,
@@ -146,6 +165,7 @@ export async function resolveJiraOauthToken(
146165
const region = options.region ?? process.env.AWS_REGION ?? 'us-east-1';
147166
const ddb = options.dynamoDbClient ?? DynamoDBDocumentClient.from(new DynamoDBClient({ region }));
148167
const sm = options.secretsManagerClient ?? new SecretsManagerClient({ region });
168+
const forceRefresh = options.forceRefresh ?? false;
149169

150170
// ─── Step 1: Registry row ────────────────────────────────────────
151171
const row = await getRegistryRow(ddb, registryTableName, cloudId);
@@ -162,7 +182,11 @@ export async function resolveJiraOauthToken(
162182
}
163183

164184
// ─── Step 2: Cached or fresh token JSON ──────────────────────────
165-
const cached = tokenCache.get(row.oauth_secret_arn);
185+
// `forceRefresh` bypasses the in-memory token cache: the cached access token
186+
// is the one a caller just saw 401, so we must re-read the secret (and
187+
// refresh below) rather than hand the same dead token back. (The registry
188+
// row above still uses its own cache — see ResolverOptions.forceRefresh.)
189+
const cached = forceRefresh ? undefined : tokenCache.get(row.oauth_secret_arn);
166190
let token: StoredOauthToken;
167191
if (cached && cached.expiresAt > Date.now() && !isTokenExpiring(cached.value.expires_at)) {
168192
token = cached.value;
@@ -178,8 +202,11 @@ export async function resolveJiraOauthToken(
178202
token = fetched;
179203
}
180204

181-
// ─── Step 3: Refresh if expiring ─────────────────────────────────
182-
if (isTokenExpiring(token.expires_at)) {
205+
// ─── Step 3: Refresh if expiring (or forced) ─────────────────────
206+
// `forceRefresh` refreshes regardless of `expires_at`: a 401 on a
207+
// not-yet-expired token means the stored access token is dead despite the
208+
// timestamp, so trust the 401 over the clock and rotate.
209+
if (forceRefresh || isTokenExpiring(token.expires_at)) {
183210
const refreshed = await refreshJiraToken(token, sm, row.oauth_secret_arn, options);
184211
if (!refreshed) {
185212
return null;

cdk/test/handlers/shared/jira-feedback.test.ts

Lines changed: 138 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -103,12 +103,48 @@ describe('jira-feedback: postIssueComment', () => {
103103
expect(fetchMock).not.toHaveBeenCalled();
104104
});
105105

106-
test('returns false on a non-2xx response and swallows the error', async () => {
107-
global.fetch = jest.fn().mockResolvedValue(mockResponse(401)) as unknown as typeof fetch;
106+
test('returns false (never throws) when the initial resolve throws an infra error', async () => {
107+
// The resolver can throw on DDB/SM failure (not just resolve null). The
108+
// catch in resolveTenantToken must swallow it and no-op.
109+
resolveJiraOauthTokenMock.mockRejectedValueOnce(new Error('DDB throttle'));
110+
const fetchMock = jest.fn();
111+
global.fetch = fetchMock as unknown as typeof fetch;
112+
113+
const ok = await postIssueComment(CTX, 'ENG-42', 'hello');
114+
115+
expect(ok).toBe(false);
116+
expect(fetchMock).not.toHaveBeenCalled();
117+
});
118+
119+
test('returns false when the forced-refresh resolve throws an infra error after a 401', async () => {
120+
// First resolve succeeds; the POST 401s; the forced-refresh resolve then
121+
// throws (SM unavailable). Must swallow and no-op — never throw.
122+
resolveJiraOauthTokenMock
123+
.mockResolvedValueOnce({ accessToken: 'stale_token', scope: '', siteUrl: '', oauthSecretArn: 'x' })
124+
.mockRejectedValueOnce(new Error('SecretsManager unavailable'));
125+
const fetchMock = jest.fn().mockResolvedValueOnce(mockResponse(401));
126+
global.fetch = fetchMock as unknown as typeof fetch;
127+
128+
const ok = await postIssueComment(CTX, 'ENG-42', 'hello');
129+
130+
expect(ok).toBe(false);
131+
// Only the first POST happened; the retry never got a token.
132+
expect(fetchMock).toHaveBeenCalledTimes(1);
133+
expect(resolveJiraOauthTokenMock).toHaveBeenCalledTimes(2);
134+
// The forced-refresh resolve carried forceRefresh: true.
135+
expect(resolveJiraOauthTokenMock.mock.calls[1][2]).toEqual({ forceRefresh: true });
136+
});
137+
138+
test('returns false on a non-auth non-2xx response and does NOT refresh', async () => {
139+
const fetchMock = jest.fn().mockResolvedValue(mockResponse(500));
140+
global.fetch = fetchMock as unknown as typeof fetch;
108141

109142
const ok = await postIssueComment(CTX, 'ENG-42', 'hello');
110143

111144
expect(ok).toBe(false);
145+
// 5xx is terminal — no forced-refresh retry, no second POST.
146+
expect(fetchMock).toHaveBeenCalledTimes(1);
147+
expect(resolveJiraOauthTokenMock).toHaveBeenCalledTimes(1);
112148
});
113149

114150
test('reportIssueFailure never throws even when fetch rejects', async () => {
@@ -119,3 +155,103 @@ describe('jira-feedback: postIssueComment', () => {
119155
await expect(reportIssueFailure(CTX, 'ENG-42', '❌ nope')).resolves.toBeUndefined();
120156
});
121157
});
158+
159+
describe('jira-feedback: 401 → forced refresh → retry (issue #370)', () => {
160+
test('refreshes the token and retries once on 401, succeeding the second time', async () => {
161+
// First resolve hands back a (stale) token; the forced-refresh resolve
162+
// hands back a different, freshly-minted token.
163+
resolveJiraOauthTokenMock
164+
.mockResolvedValueOnce({
165+
accessToken: 'stale_token',
166+
scope: 'write:jira-work',
167+
siteUrl: 'https://acme.atlassian.net',
168+
oauthSecretArn: 'arn:aws:secretsmanager:us-east-1:111:secret:x',
169+
})
170+
.mockResolvedValueOnce({
171+
accessToken: 'fresh_token',
172+
scope: 'write:jira-work',
173+
siteUrl: 'https://acme.atlassian.net',
174+
oauthSecretArn: 'arn:aws:secretsmanager:us-east-1:111:secret:x',
175+
});
176+
const fetchMock = jest
177+
.fn()
178+
.mockResolvedValueOnce(mockResponse(401)) // first POST: stale token rejected
179+
.mockResolvedValueOnce(mockResponse(201)); // retry: fresh token accepted
180+
global.fetch = fetchMock as unknown as typeof fetch;
181+
182+
const ok = await postIssueComment(CTX, 'ENG-42', 'hello');
183+
184+
expect(ok).toBe(true);
185+
// Two POSTs, and the second carried the refreshed bearer token.
186+
expect(fetchMock).toHaveBeenCalledTimes(2);
187+
const firstHeaders = (fetchMock.mock.calls[0][1] as RequestInit).headers as Record<string, string>;
188+
const secondHeaders = (fetchMock.mock.calls[1][1] as RequestInit).headers as Record<string, string>;
189+
expect(firstHeaders.Authorization).toBe('Bearer stale_token');
190+
expect(secondHeaders.Authorization).toBe('Bearer fresh_token');
191+
// The forced-refresh resolve must pass forceRefresh: true.
192+
expect(resolveJiraOauthTokenMock).toHaveBeenCalledTimes(2);
193+
expect(resolveJiraOauthTokenMock.mock.calls[0][2]).toEqual({ forceRefresh: false });
194+
expect(resolveJiraOauthTokenMock.mock.calls[1][2]).toEqual({ forceRefresh: true });
195+
});
196+
197+
test('also retries on 403 (insufficient-permission style auth rejection)', async () => {
198+
resolveJiraOauthTokenMock
199+
.mockResolvedValueOnce({ accessToken: 'stale_token', scope: '', siteUrl: '', oauthSecretArn: 'x' })
200+
.mockResolvedValueOnce({ accessToken: 'fresh_token', scope: '', siteUrl: '', oauthSecretArn: 'x' });
201+
const fetchMock = jest
202+
.fn()
203+
.mockResolvedValueOnce(mockResponse(403))
204+
.mockResolvedValueOnce(mockResponse(201));
205+
global.fetch = fetchMock as unknown as typeof fetch;
206+
207+
const ok = await postIssueComment(CTX, 'ENG-42', 'hello');
208+
209+
expect(ok).toBe(true);
210+
expect(fetchMock).toHaveBeenCalledTimes(2);
211+
});
212+
213+
test('returns false without a second POST when the refresh yields the same token', async () => {
214+
// Both resolves return the identical access token (e.g. refresh-token
215+
// revoked, so the resolver couldn't rotate). Retrying would only 401
216+
// again, so we skip it.
217+
const fetchMock = jest.fn().mockResolvedValue(mockResponse(401));
218+
global.fetch = fetchMock as unknown as typeof fetch;
219+
220+
const ok = await postIssueComment(CTX, 'ENG-42', 'hello');
221+
222+
expect(ok).toBe(false);
223+
expect(fetchMock).toHaveBeenCalledTimes(1); // no retry with an unchanged token
224+
expect(resolveJiraOauthTokenMock).toHaveBeenCalledTimes(2);
225+
});
226+
227+
test('returns false when the forced refresh cannot resolve a token', async () => {
228+
resolveJiraOauthTokenMock
229+
.mockResolvedValueOnce({ accessToken: 'stale_token', scope: '', siteUrl: '', oauthSecretArn: 'x' })
230+
.mockResolvedValueOnce(null); // refresh-token revoked → resolver gives up
231+
const fetchMock = jest.fn().mockResolvedValueOnce(mockResponse(401));
232+
global.fetch = fetchMock as unknown as typeof fetch;
233+
234+
const ok = await postIssueComment(CTX, 'ENG-42', 'hello');
235+
236+
expect(ok).toBe(false);
237+
expect(fetchMock).toHaveBeenCalledTimes(1);
238+
});
239+
240+
test('does not retry a second time if the refreshed token is also rejected', async () => {
241+
resolveJiraOauthTokenMock
242+
.mockResolvedValueOnce({ accessToken: 'stale_token', scope: '', siteUrl: '', oauthSecretArn: 'x' })
243+
.mockResolvedValueOnce({ accessToken: 'fresh_token', scope: '', siteUrl: '', oauthSecretArn: 'x' });
244+
const fetchMock = jest
245+
.fn()
246+
.mockResolvedValueOnce(mockResponse(401)) // stale rejected
247+
.mockResolvedValueOnce(mockResponse(401)); // fresh also rejected → give up
248+
global.fetch = fetchMock as unknown as typeof fetch;
249+
250+
const ok = await postIssueComment(CTX, 'ENG-42', 'hello');
251+
252+
expect(ok).toBe(false);
253+
// Exactly two POSTs — the retry is bounded at one attempt.
254+
expect(fetchMock).toHaveBeenCalledTimes(2);
255+
expect(resolveJiraOauthTokenMock).toHaveBeenCalledTimes(2);
256+
});
257+
});

0 commit comments

Comments
 (0)