Prepend UTF-8 BOM to generated CSV files #749

Open
joshuali925-osdbot wants to merge 1 commit into opensearch-project:main from joshuali925-osdbot:bot/545-csv-utf8-bom

@joshuali925-osdbot

Description

Prepends a UTF-8 BOM (\uFEFF) to generated CSV files so that Excel and other spreadsheet applications correctly detect the encoding. This fixes the issue where special characters (e.g. ’) display as garbled text (â€™) when opening CSV exports in Excel.

Changes

  • Added UTF-8 BOM prefix in convertToCSV() in dataReportHelpers.ts
  • Added unit tests verifying BOM presence and correct CSV content
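
A minimal sketch of the change described above, assuming rows arrive as arrays of strings. The names `toCsvWithBom` and `escapeCsvField` are hypothetical and used only for illustration; the actual `convertToCSV()` in `dataReportHelpers.ts` is not shown on this page and may be structured differently.

```typescript
// Hypothetical sketch, not the project's actual convertToCSV().
const UTF8_BOM = '\uFEFF'; // encodes to the bytes EF BB BF in UTF-8

// Quote a field if it contains a comma, quote, or line break,
// doubling any embedded quotes (the usual CSV convention).
function escapeCsvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Join rows into CSV text and prepend the BOM exactly once.
function toCsvWithBom(rows: string[][]): string {
  const body = rows
    .map((row) => row.map(escapeCsvField).join(','))
    .join('\n');
  return UTF8_BOM + body;
}
```

Prepending the BOM at the single point where the CSV string is assembled keeps it from being duplicated if the output is later concatenated or chunked; whether the real helper does it this way is an assumption.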

Testing

Unit tests pass:

PASS server/routes/utils/__tests__/dataReportHelpers.test.ts
  convertToCSV
    ✓ should prepend UTF-8 BOM to CSV output
    ✓ should contain correct CSV content after BOM

Issues Resolved

Resolves #545

Fixes Excel not detecting UTF-8 encoding for CSV files containing
special characters.

Resolves opensearch-project#545

Signed-off-by: joshuali925-osdbot <278181255+joshuali925-osdbot@users.noreply.github.com>
@joshuali925

Member

@joshuali925-osdbot can you generate and upload reporting CSV files with and without the BOM commit?

@joshuali925-osdbot

joshuali925-osdbot commented May 13, 2026

Author

Here are sample CSV files generated with special characters (curly quotes, accented letters, CJK, symbols):

The with-BOM file starts with EF BB BF (UTF-8 BOM), which tells Excel to use UTF-8 encoding. Without it, Excel defaults to the system locale encoding and garbles special characters.
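
One way to confirm the leading bytes, sketched with the standard `TextEncoder` API. The helper name `startsWithUtf8Bom` is hypothetical; in practice the bytes would come from reading the exported file rather than from an in-memory string.

```typescript
// Returns true when the buffer begins with the UTF-8 BOM bytes EF BB BF.
function startsWithUtf8Bom(bytes: Uint8Array): boolean {
  return (
    bytes.length >= 3 &&
    bytes[0] === 0xef &&
    bytes[1] === 0xbb &&
    bytes[2] === 0xbf
  );
}

// Encoding U+FEFF as UTF-8 yields exactly those three bytes.
const encoder = new TextEncoder();
const withBom = encoder.encode('\uFEFFname,city,notes');
const withoutBom = encoder.encode('name,city,notes');
```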

Sample data used:

name,city,notes
John’s Report,São Paulo,€100 — paid
"Jane ""Doe""",Zürich,résumé – Pro™
日本語テスト,北京,中文数据 «test»

To test: open each file in Excel. The without-BOM file will show garbled characters like â€™ instead of ’, while the with-BOM file displays correctly.
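
The garbling can be reproduced outside Excel by decoding UTF-8 bytes with a legacy single-byte encoding. This sketch assumes the system locale encoding is Windows-1252, which is common on Western European and US Windows installs but is not stated in this thread; `decodeAs` is a hypothetical helper, and the `windows-1252` label relies on a runtime whose `TextDecoder` supports legacy encodings.

```typescript
// Decode the same bytes under a given encoding label.
function decodeAs(bytes: Uint8Array, label: string): string {
  return new TextDecoder(label).decode(bytes);
}

// U+2019 (right single quotation mark) is E2 80 99 in UTF-8.
const quoteBytes = new TextEncoder().encode('\u2019');

// Assumed locale encoding: each byte becomes its own character.
const misread = decodeAs(quoteBytes, 'windows-1252');

// Correct interpretation: three bytes form a single character.
const correct = decodeAs(quoteBytes, 'utf-8');
```

Under that assumption, `misread` is the three-character sequence seen in the without-BOM file, while `correct` is the original curly apostrophe.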

Development

Successfully merging this pull request may close these issues.

[FEATURE] Insert utf-8 BOM in generated csv
