BlockNote codeBlock's embedded Shiki significantly increases bundle size #1487

Closed
1 task done
dgd03146 opened this issue Mar 4, 2025 · 5 comments · Fixed by #1519
Labels
bug Something isn't working

Comments


dgd03146 commented Mar 4, 2025

Describe the bug
BlockNote with Shiki is causing a significant bundle size issue. Unnecessary language files (like emacs-lisp.mjs) are being included in the client bundle, increasing the bundle size by approximately 788KB (192KB gzipped).

This is occurring in a Turborepo monorepo environment with pnpm as the package manager, where our web workspace uses Next.js 14.

To Reproduce

  1. Create a Next.js 14 project (in our case, within a Turborepo monorepo)
  2. Install BlockNote: npm install @blocknote/core @blocknote/react @blocknote/mantine
  3. Create a basic BlockNote editor with code block support
  4. Run next build and analyze the bundle
  5. Notice large chunks containing Shiki language files that aren't being used

BlockNote version:

"@blocknote/core": "^0.23.6",
"@blocknote/mantine": "^0.23.6",
"@blocknote/react": "^0.23.6",

Code setup:

// BlockNote.tsx
import { BlockNoteSchema, customizeCodeBlock, defaultBlockSpecs } from '@blocknote/core';
import { BlockNoteView, useCreateBlockNote } from '@blocknote/react';

const customCodeBlock = customizeCodeBlock({
  defaultLanguage: 'typescript',
  supportedLanguages: [
    { id: 'javascript', match: ['javascript', 'js'], name: 'JavaScript' },
    { id: 'typescript', match: ['typescript', 'ts'], name: 'TypeScript' },
    { id: 'html', match: ['html'], name: 'HTML' },
    { id: 'css', match: ['css'], name: 'CSS' },
    { id: 'json', match: ['json'], name: 'JSON' },
    { id: 'markdown', match: ['markdown', 'md'], name: 'Markdown' },
  ],
});

const schema = BlockNoteSchema.create({
  blockSpecs: {
    ...defaultBlockSpecs,
    codeBlock: customCodeBlock,
  },
});

export const BlockNote = ({ initialContent }) => {
  const editor = useCreateBlockNote({
    initialContent,
    schema,
  });

  return (
    <BlockNoteView
      editor={editor}
      editable={false}
      formattingToolbar={false}
    />
  );
};
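
For readers unfamiliar with the shape of supportedLanguages, here is a minimal sketch of how an id/match table like the one above could be used to map a language tag to a configured language. It assumes that match is a list of aliases for id; this is an illustration, not BlockNote's actual implementation, and SupportedLanguage and resolveLanguage are hypothetical names.

```typescript
// Hypothetical type mirroring the entries passed to supportedLanguages above.
interface SupportedLanguage {
  id: string;
  match: string[];
  name: string;
}

const supportedLanguages: SupportedLanguage[] = [
  { id: "javascript", match: ["javascript", "js"], name: "JavaScript" },
  { id: "typescript", match: ["typescript", "ts"], name: "TypeScript" },
  { id: "markdown", match: ["markdown", "md"], name: "Markdown" },
];

// Returns the id whose alias list contains `tag`, or `fallback` when nothing matches.
// ASSUMPTION: aliases are compared case-insensitively after trimming whitespace.
function resolveLanguage(
  tag: string,
  languages: SupportedLanguage[],
  fallback: string
): string {
  const wanted = tag.trim().toLowerCase();
  for (const lang of languages) {
    if (lang.match.indexOf(wanted) !== -1) {
      return lang.id;
    }
  }
  return fallback;
}
```

With this table, a tag that is not configured (for example "emacs-lisp") falls back to the default language instead of requiring its grammar.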

I've tried using webpack externals to exclude unnecessary language files:

// next.config.js
config.externals = [
  ...(config.externals || []),
  function(context, request, callback) {
    if (/node_modules[\/\\]@shikijs[\/\\]langs[\/\\]dist[\/\\](?!javascript|typescript|html|css|json|markdown).+\.mjs$/.test(request)) {
      console.log('Excluded Shiki language file:', request);
      return callback(null, 'commonjs {}');
    }
    callback();
  }
];

But the bundle still includes all Shiki language files:
Image

parsed: static/chunks/b9c69e33.fdaaa1746205dfd1.js (785.91 KB)
gzipped: static/chunks/b9c69e33.fdaaa1746205dfd1.js (192.3 KB)

Misc

  • Node version: 22.11
  • Package manager: pnpm
  • Browser: Chrome
  • Environment: Turborepo monorepo with Next.js 14 in web workspace
  • I'm a sponsor and would appreciate if you could look into this sooner than later 💖

Questions

  1. Is there a way to completely remove or replace Shiki in BlockNote's codeBlock?
  2. Is it possible to isolate just the code block rendering to a server component? This would keep Shiki on the server side and prevent it from being included in the client bundle.
@dgd03146 dgd03146 added the bug Something isn't working label Mar 4, 2025
@nperez0111
Contributor

Hi @dgd03146 I've been meaning to look into this. To be clear, though, are these language definitions actually being sent to the client? Because from my understanding they should be dynamically imported, so while they may exist in the output bundle, they do not actually get downloaded by the client unless that language is actually used.

If it is being downloaded by the client, then I totally agree this is a major issue that we should address sooner than later.
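
The difference between "present in the build output" and "downloaded by the client" can be illustrated with a small sketch. It assumes each language sits behind a lazy loader; in a real bundle each loader would typically be a dynamic import() that the bundler emits as a separate chunk. All names here (makeLoader, loaders, loadLanguage, fetched) are hypothetical and only simulate that behavior.

```typescript
type Grammar = { name: string };
type LazyLoader = () => Promise<Grammar>;

// Records which simulated "chunks" were actually requested at runtime.
const fetched: string[] = [];

// Hypothetical stand-in for `() => import("<grammar module>")`.
function makeLoader(name: string): LazyLoader {
  return () => {
    fetched.push(name);
    return Promise.resolve({ name });
  };
}

// Every entry here would exist as a chunk in the build output...
const loaders: Record<string, LazyLoader> = {
  typescript: makeLoader("typescript"),
  json: makeLoader("json"),
  "emacs-lisp": makeLoader("emacs-lisp"),
};

// ...but a chunk is only requested when its language is actually used.
function loadLanguage(id: string): Promise<Grammar> {
  const loader = loaders[id];
  if (!loader) {
    throw new Error("Unknown language: " + id);
  }
  return loader();
}
```

In this model, calling loadLanguage("typescript") records only "typescript" in fetched; the "emacs-lisp" entry stays unrequested even though it is registered.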

@dgd03146
Author

dgd03146 commented Mar 4, 2025

Hi @nperez0111,

Thanks for looking into this. I've found an interesting discrepancy that suggests these language definitions are indeed included in the client bundle:

  1. Using Next.js bundle analyzer (ANALYZE=true next build), I can see these files in the client bundle:

Image

  2. When I examine the actual bundled code for my BlockNote component, it only shows the 6 languages I specified:

Image

// From the bundled code
let s = (0, i.mT)({
    defaultLanguage: "typescript",
    supportedLanguages: [{
        id: "javascript",
        match: ["javascript", "js"],
        name: "JavaScript"
    }, {
        id: "typescript",
        match: ["typescript", "ts"],
        name: "TypeScript"
    }, {
        id: "html",
        match: ["html"],
        name: "HTML"
    }, {
        id: "css",
        match: ["css"],
        name: "CSS"
    }, {
        id: "json",
        match: ["json"],
        name: "JSON"
    }, {
        id: "markdown",
        match: ["markdown", "md"],
        name: "Markdown"
    }]
})

To understand this discrepancy, I looked at @shikijs/langs/dist/index.mjs and found a complete list of all supported languages:

export const languageNames = [
  "abap",
  "actionscript-3",
  "ada",
  // ... over 200 languages listed
  "yaml",
  "zenscript",
  "zig"
]

I suspect that somewhere in the Shiki or BlockNote codebase, this list is being used to dynamically import language files, causing webpack to include ALL language files in the bundle, regardless of the supportedLanguages configuration I've set.

This explains the discrepancy: my component code correctly specifies only 6 languages, but the internal implementation is likely referencing all possible languages from the complete list.

I've tried various webpack configurations to exclude these files, but none have been successful:

// next.config.js
config.externals = [
  ...(config.externals || []),
  function(context, request, callback) {
    if (/node_modules[\/\\]@shikijs[\/\\]langs[\/\\]dist[\/\\](?!javascript|typescript|html|css|json|markdown).+\.mjs$/.test(request)) {
      console.log('Excluded Shiki language file:', request);
      return callback(null, 'commonjs {}');
    }
    callback();
  }
];
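
One possible reason a rule like this never fires: webpack passes an externals function the request string as written in the import statement, not the resolved file path on disk, so a pattern that begins with node_modules may never match. The sketch below is a standalone predicate that tests the specifier instead. It assumes grammars are requested with specifiers shaped like "@shikijs/langs/<name>", which should be verified by logging request in a real build; isUnwantedShikiLang and LANG_SPECIFIER are hypothetical names, and whether stubbing those requests is safe for the highlighter is not something this sketch establishes.

```typescript
// Languages to keep, taken from the supportedLanguages config in this issue.
const keep = ["javascript", "typescript", "html", "css", "json", "markdown"];

// ASSUMPTION: grammars are requested as "@shikijs/langs/<name>" (optionally
// with a ".mjs" suffix). Log `request` in your own build to confirm the shape.
const LANG_SPECIFIER = /^@shikijs\/langs\/([a-z0-9+#-]+?)(?:\.mjs)?$/i;

// Returns true only for Shiki language specifiers that are NOT in `allowed`.
// Any other request (for example "react") is left alone.
function isUnwantedShikiLang(request: string, allowed: string[]): boolean {
  const m = LANG_SPECIFIER.exec(request);
  if (!m) {
    return false;
  }
  return allowed.indexOf(m[1].toLowerCase()) === -1;
}
```

A predicate like this could be called from inside the externals function in place of the path-based regex, with keep passed as the allowed list.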

The bundle analyzer shows large files like emacs-lisp.mjs (785.91 KB / 192.3 KB gzipped) included in the bundle, which significantly increases the overall bundle size.

Is there a way to modify BlockNote to truly only include the languages specified in supportedLanguages?

I'm happy to help test any potential solutions or provide more information if needed.

@dgd03146
Author

dgd03146 commented Mar 4, 2025

I've examined the actual content of emacs-lisp.mjs and found that it contains a massive JSON object with language definitions:

Image

I noticed this file is quite large (785.91 KB uncompressed / 192.3 KB gzipped). Looking at the bundle analyzer, it appears that all language files are being included in the bundle, even though I've specified only a few languages in my supportedLanguages configuration.

I believe this might be happening because of how the language files are imported. The complete language list in index.mjs combined with a dynamic import pattern could be causing webpack to include all language files in the bundle.

This significantly increases the bundle size for my application, where we only need a handful of languages. For our users on mobile devices or slower connections, this extra code could impact loading performance.

Would it be possible to enhance BlockNote to only include the languages specified in supportedLanguages? I'd be happy to help test any potential solutions or contribute to implementing this improvement if guidance is provided.

I'm genuinely interested in contributing to this project if there's an opportunity. I would love to help make BlockNote even better for everyone. Please let me know if there's a specific approach you'd recommend or if you'd like me to explore some potential solutions.

@nperez0111
Contributor

Thanks for all the information, it would be great to have you contribute to this. But, before we get to resolving the issue I need to confirm what exactly the issue is first.

I understand that it may be bundling all the possible languages in your distribution folder, but my main concern is what the client actually downloads, since that is what would tell me the severity of this issue. If possible, could you please confirm whether the client, when using your application, actually downloads this emacs-lisp file, for example? You can probably search the network panel for a unique string to see if it matches.
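
As a complement to searching the network panel by hand, the check can be sketched as a small helper that looks for a given chunk file among the URLs the page has requested. This is an illustration only; findChunkRequests is a hypothetical name, and the chunk file name used in the note below is the one reported earlier in this thread.

```typescript
// Returns the requested URLs whose file name (ignoring query string and
// fragment) equals `chunkFile`.
function findChunkRequests(resourceUrls: string[], chunkFile: string): string[] {
  return resourceUrls.filter((url) => {
    const path = url.split("?")[0].split("#")[0];
    const fileName = path.slice(path.lastIndexOf("/") + 1);
    return fileName === chunkFile;
  });
}
```

In a browser console, the input list could come from performance.getEntriesByType("resource").map((e) => e.name). An empty result for "b9c69e33.fdaaa1746205dfd1.js" would suggest the chunk was built but not downloaded during that session.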

@dgd03146
Author

dgd03146 commented Mar 7, 2025

Thanks for your response!

I’ve checked, and the emacs-lisp file isn’t being downloaded on the client side; it’s dynamically loaded when needed. However, the issue seems to be that it's still being included in the bundle during the build process, which results in unnecessarily long build times.

Do you have any suggestions on how we can remove or exclude specific language files, like emacs-lisp, from the final bundle? I believe it might be due to how shiki bundles all available languages by default.

Any guidance on how to prevent this file from being bundled would be greatly appreciated.

Thanks in advance for your help!
