File: action/src/capture/file-fetcher.ts lines 28-48
Summary: fetchFileContent base64-decodes every file returned by repos.getContent as UTF-8, but GitHub returns encoding='base64' for ALL files under 1MB (text and binary). The encoding !== 'base64' guard only filters files >=1MB, not binaries. Any PR touching a small binary (PNG, font, zip, sqlite, etc.) ends up with mojibake (U+FFFD replacement chars) stored in content_blobs and fed through computeLineDiff. The isBinaryFile() helper in the same file (lines 53-66) exists but is never called.
Fix direction: call isBinaryFile(path) at the top of fetchFileContent and skip decoding when it returns true.
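
A minimal sketch of the guard, not the actual patch. It assumes a simplified response shape (`ContentResponse`), a hypothetical `decodeFileContent` wrapper standing in for the decode step of fetchFileContent, and an extension-based stand-in for isBinaryFile(); the real helper at lines 53-66 and the real return type may differ.

```typescript
// Hypothetical subset of the fields returned by repos.getContent for a file.
interface ContentResponse {
  path: string;
  encoding: string;
  content: string;
}

// Stand-in for the existing isBinaryFile() helper (assumption: extension-based).
const BINARY_EXTENSIONS = new Set([
  '.png', '.jpg', '.gif', '.woff2', '.ttf', '.zip', '.sqlite',
]);

function isBinaryFile(path: string): boolean {
  const dot = path.lastIndexOf('.');
  return dot !== -1 && BINARY_EXTENSIONS.has(path.slice(dot).toLowerCase());
}

// Returns decoded text, or null when the file should be skipped.
function decodeFileContent(res: ContentResponse): string | null {
  // Binary guard first: encoding === 'base64' alone does not imply text.
  if (isBinaryFile(res.path)) return null;
  // Existing guard: payloads that are not base64 cannot be decoded here.
  if (res.encoding !== 'base64') return null;
  return Buffer.from(res.content, 'base64').toString('utf-8');
}

// Usage: the first four bytes of a PNG header, base64-encoded as the API would return them.
const pngContent = Buffer.from([0x89, 0x50, 0x4e, 0x47]).toString('base64');
const textContent = Buffer.from('hello', 'utf-8').toString('base64');

const skipped = decodeFileContent({ path: 'logo.png', encoding: 'base64', content: pngContent });
const decoded = decodeFileContent({ path: 'README.md', encoding: 'base64', content: textContent });

// Without the guard, the invalid byte 0x89 decodes to U+FFFD.
const unguarded = Buffer.from(pngContent, 'base64').toString('utf-8');
```

Returning null for binaries is one possible choice; callers that write to content_blobs or call computeLineDiff would need to handle the skipped case. An extension check also misses binaries without a recognizable extension, so a content sniff (for example, scanning the decoded bytes for NUL) may be worth adding as a second line of defense.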