169 changes: 163 additions & 6 deletions convex/vt.ts
@@ -566,9 +566,51 @@ export const pollPendingScans = internalAction({
)

if (!aiResult) {
- // No Code Insight - trigger a rescan to get it
+ // No Code Insight - check AV engine stats as fallback
const stats = vtResult.data.attributes.last_analysis_stats
let status: string | null = null
let source = 'engines'

if (stats) {
if (stats.malicious > 0) {
status = 'malicious'
} else if (stats.suspicious > 0) {
status = 'suspicious'
} else if (stats.harmless > 0 || stats.undetected > 0) {
// No detections and some harmless/undetected engines = clean
status = 'clean'
}
Comment on lines +579 to +582
Inconsistency with fetchResults "clean" determination

The PR description states this logic "matches the existing logic in fetchResults function," but it does not. The fetchResults function only marks a file clean when harmless > 0:

// fetchResults (line 303) — existing reference implementation
} else if (stats.harmless > 0) {
  status = 'clean'
}

The new code additionally accepts undetected > 0 as sufficient for "clean":

} else if (stats.harmless > 0 || stats.undetected > 0) {

In VirusTotal, undetected means an engine ran but produced no verdict — it did not classify the file as harmless. A file where every engine returns undetected (e.g., 0 harmless, 64 undetected) would be published as "clean" by the polling functions but remain "pending" via fetchResults. This creates a real inconsistency: the stored vtAnalysis.status in the DB would be "clean", but any fresh call to fetchResults for UI display would return "pending" — which could confuse debugging and monitoring.

This same divergence is also present at the corresponding fallback blocks in backfillPendingScans (~line 795), rescanActiveSkills (~line 924), and backfillActiveSkillsVTCache (~line 1258).

If broadening the clean criteria to include undetected > 0 is intentional, fetchResults should be updated to match.
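To make the divergence concrete, here is a minimal sketch that puts the two rules side by side. `EngineStats`, `Verdict`, `strictVerdict`, and `broadVerdict` are hypothetical names used only for illustration; the real `fetchResults` and polling code are only partially visible in this diff, and `null` here stands in for "no verdict yet" (which this comment says surfaces as "pending").

```typescript
// Hypothetical shape: only the four counters referenced in this diff.
interface EngineStats {
  malicious: number
  suspicious: number
  harmless: number
  undetected: number
}

// null means "no verdict from engines" in this sketch.
type Verdict = 'malicious' | 'suspicious' | 'clean' | null

// Rule quoted from fetchResults: only `harmless` counts toward "clean".
function strictVerdict(stats: EngineStats): Verdict {
  if (stats.malicious > 0) return 'malicious'
  if (stats.suspicious > 0) return 'suspicious'
  if (stats.harmless > 0) return 'clean'
  return null
}

// Rule added in this PR: `undetected` also counts toward "clean".
function broadVerdict(stats: EngineStats): Verdict {
  if (stats.malicious > 0) return 'malicious'
  if (stats.suspicious > 0) return 'suspicious'
  if (stats.harmless > 0 || stats.undetected > 0) return 'clean'
  return null
}

// The case described above: 0 harmless, 64 undetected.
const allUndetected: EngineStats = {
  malicious: 0,
  suspicious: 0,
  harmless: 0,
  undetected: 64,
}

// The two rules disagree on this input.
console.log(strictVerdict(allUndetected), broadVerdict(allUndetected))
```

Whichever rule is intended, routing both `fetchResults` and the polling functions through one function would keep the stored status and the displayed status from drifting apart.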


}
Comment on lines +574 to +583
Duplicated stats-to-status logic across 4 functions

The same 10-line block for deriving status from last_analysis_stats is copy-pasted identically into pollPendingScans, backfillPendingScans, rescanActiveSkills, and backfillActiveSkillsVTCache. Extracting it into a small helper would reduce duplication and make any future threshold changes (e.g., updating the undetected logic) a single-line fix:

function statusFromEngineStats(
  stats: VTFileResponse['data']['attributes']['last_analysis_stats'],
): string | null {
  if (!stats) return null
  if (stats.malicious > 0) return 'malicious'
  if (stats.suspicious > 0) return 'suspicious'
  if (stats.harmless > 0 || stats.undetected > 0) return 'clean'
  return null
}

This also makes the divergence from fetchResults easier to spot and reason about.
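Going one step further, the helper could also build the `vtAnalysis` payload that three of the four call sites assemble by hand. This is a rough sketch under assumptions: `EngineStats`, `EngineVtAnalysis`, and `engineFallbackAnalysis` are hypothetical names, and the payload fields (`status`, `source`, `checkedAt`) are taken from the mutation calls visible in this diff rather than from the actual schema.

```typescript
// Hypothetical shape: only the four counters referenced in this diff.
interface EngineStats {
  malicious: number
  suspicious: number
  harmless: number
  undetected: number
}

type EngineVerdict = 'malicious' | 'suspicious' | 'clean'

// Mirrors the object literal passed as `vtAnalysis` in the diff.
interface EngineVtAnalysis {
  status: EngineVerdict
  source: 'engines'
  checkedAt: number
}

// Returns the payload for an engine-only verdict, or null when the
// engine stats are missing or give no verdict.
// `now` is injectable so the function stays deterministic in tests.
function engineFallbackAnalysis(
  stats: EngineStats | null | undefined,
  now: number = Date.now(),
): EngineVtAnalysis | null {
  if (!stats) return null

  let status: EngineVerdict
  if (stats.malicious > 0) status = 'malicious'
  else if (stats.suspicious > 0) status = 'suspicious'
  else if (stats.harmless > 0 || stats.undetected > 0) status = 'clean'
  else return null

  return { status, source: 'engines', checkedAt: now }
}
```

Each call site would then reduce to a null check followed by its own logging and mutation calls, so the threshold lives in exactly one place.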



if (status) {
// We have a verdict from AV engines - update the skill
console.log(
`[vt:pollPendingScans] Hash ${sha256hash} verdict from AV engines: ${status}`,
)

// Cache VT analysis in version
await ctx.runMutation(internal.skills.updateVersionScanResultsInternal, {
versionId,
vtAnalysis: {
status,
source,
checkedAt: Date.now(),
},
})

// VT finalizes moderation visibility for newly published versions.
await ctx.runMutation(internal.skills.approveSkillByHashInternal, {
sha256hash,
scanner: 'vt',
status,
})
updated++
continue
Comment on lines +607 to +608


[P1] Keep polling Code Insight after engine-only fallback

This branch treats AV engine stats as a terminal verdict and continues without requesting reanalysis, which means the hash drops out of the pending poll path once vtAnalysis.status is set to clean/suspicious/malicious. getPendingScanSkillsInternal explicitly skips those final VT statuses (convex/skills.ts lines 1878-1884), so we never pick up a later code_insight result for the same version. In cases where Code Insight arrives later with a stricter verdict than the initial engine stats (for example, engines clean but Code Insight suspicious), the skill remains misclassified until the next daily rescan.
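One possible direction, sketched under assumptions: treat an engine-sourced verdict as provisional and keep the hash eligible for polling for a grace window. `CachedVtAnalysis`, `PROVISIONAL_WINDOW_MS`, and `shouldKeepPolling` are hypothetical; the actual filter in `getPendingScanSkillsInternal` is not shown in this diff, and the six-hour window is an arbitrary placeholder.

```typescript
// Hypothetical view of the cached analysis; field names follow the
// `vtAnalysis` object written in this diff.
interface CachedVtAnalysis {
  status: string
  source?: string
  checkedAt: number
}

// Assumption: how long an engine-only verdict stays provisional.
const PROVISIONAL_WINDOW_MS = 6 * 60 * 60 * 1000

// Decides whether a version should stay in the pending poll set.
function shouldKeepPolling(
  analysis: CachedVtAnalysis | undefined,
  now: number,
): boolean {
  // Never scanned, or explicitly pending: keep polling.
  if (!analysis) return true
  if (analysis.status === 'pending') return true

  // Engine-only verdict: keep polling until the window elapses, so a
  // later Code Insight result can still replace it.
  if (analysis.source === 'engines') {
    return now - analysis.checkedAt < PROVISIONAL_WINDOW_MS
  }

  // Any other final verdict (for example from Code Insight): stop.
  return false
}
```

With a filter along these lines, the engine verdict could still be published immediately, while a stricter Code Insight result would be picked up by the regular poll instead of waiting for the daily rescan.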


}

// No verdict from engines either - trigger a rescan to get Code Insight
console.log(
- `[vt:pollPendingScans] Hash ${sha256hash} has no Code Insight, requesting rescan`,
+ `[vt:pollPendingScans] Hash ${sha256hash} has no Code Insight or engine stats, requesting rescan`,
)
await requestRescan(apiKey, sha256hash)
// Check if we've exceeded max attempts — write stale vtAnalysis so it
@@ -741,7 +783,36 @@ export const backfillPendingScans = internalAction({
)

if (!aiResult) {
// No Code Insight - check AV engine stats as fallback
const stats = vtResult.data.attributes.last_analysis_stats
let status: string | null = null

if (stats) {
if (stats.malicious > 0) {
status = 'malicious'
} else if (stats.suspicious > 0) {
status = 'suspicious'
} else if (stats.harmless > 0 || stats.undetected > 0) {
// No detections and some harmless/undetected engines = clean
status = 'clean'
}
}

if (status) {
// We have a verdict from AV engines - update the skill
console.log(`[vt:backfill] Hash ${sha256hash} verdict from AV engines: ${status}`)
await ctx.runMutation(internal.skills.approveSkillByHashInternal, {
sha256hash,
scanner: 'vt',
status,
})
updated++
continue
}

// No verdict from engines either - trigger a rescan
if (triggerRescans) {
console.log(`[vt:backfill] Hash ${sha256hash} has no Code Insight or engine stats, requesting rescan`)
await requestRescan(apiKey, sha256hash)
rescansRequested++
}
@@ -840,14 +911,67 @@ export const rescanActiveSkills = internalAction({
)

if (!aiResult) {
// No Code Insight - check AV engine stats as fallback
const stats = vtResult.data.attributes.last_analysis_stats
let status: string | null = null
let source = 'engines'

if (stats) {
if (stats.malicious > 0) {
status = 'malicious'
} else if (stats.suspicious > 0) {
status = 'suspicious'
} else if (stats.harmless > 0 || stats.undetected > 0) {
// No detections and some harmless/undetected engines = clean
status = 'clean'
}
}

if (!status) {
// No verdict from engines either - keep as pending
await ctx.runMutation(internal.skills.updateVersionScanResultsInternal, {
versionId,
vtAnalysis: {
status: 'pending',
checkedAt: Date.now(),
},
})
accUnchanged++
continue
}

// We have a verdict from AV engines - continue with normal flow
console.log(`[vt:rescan] ${slug} verdict from AV engines: ${status}`)

await ctx.runMutation(internal.skills.updateVersionScanResultsInternal, {
versionId,
vtAnalysis: {
- status: 'pending',
+ status,
source,
checkedAt: Date.now(),
},
})
- accUnchanged++

if (status === 'malicious' || status === 'suspicious') {
console.warn(`[vt:rescan] ${slug}: verdict changed to ${status}!`)
accFlaggedSkills.push({ slug, status })
await ctx.runMutation(internal.skills.escalateByVtInternal, {
sha256hash,
status,
})
accUpdated++
} else if (wasFlagged && status === 'clean') {
// Verdict improved from suspicious → clean: clear the stale moderation flag
console.log(`[vt:rescan] ${slug}: verdict improved to clean, clearing suspicious flag`)
await ctx.runMutation(internal.skills.approveSkillByHashInternal, {
sha256hash,
scanner: 'vt',
status,
})
accUpdated++
} else {
accUnchanged++
}
continue
}

@@ -1121,8 +1245,41 @@ export const backfillActiveSkillsVTCache = internalAction({
)

if (!aiResult) {
- console.log(`[vt:backfillActive] ${slug}: no Code Insight yet`)
- noResults++
// No Code Insight - check AV engine stats as fallback
const stats = vtResult.data.attributes.last_analysis_stats
let status: string | null = null
let source = 'engines'

if (stats) {
if (stats.malicious > 0) {
status = 'malicious'
} else if (stats.suspicious > 0) {
status = 'suspicious'
} else if (stats.harmless > 0 || stats.undetected > 0) {
// No detections and some harmless/undetected engines = clean
status = 'clean'
}
}

if (!status) {
console.log(`[vt:backfillActive] ${slug}: no Code Insight or engine stats yet`)
noResults++
continue
}

// We have a verdict from AV engines - update the version
console.log(`[vt:backfillActive] ${slug}: updated with ${status} (from AV engines)`)

await ctx.runMutation(internal.skills.updateVersionScanResultsInternal, {
versionId,
sha256hash,
vtAnalysis: {
status,
source,
checkedAt: Date.now(),
},
})
updated++
continue
}
