
Fetch Highcharts from CDN during docker build #415

Open
danielbecroft opened this issue Sep 4, 2023 · 14 comments

@danielbecroft

I'm currently experimenting with the enhancement/puppeteer branch, trying to move from a Windows installation to running the export server inside a container.

I've started with something basic, based on other issues that have been reported:

FROM node:alpine

WORKDIR /home/highchart-export-server 

ENV ACCEPT_HIGHCHARTS_LICENSE=1
ENV HIGHCHARTS_VERSION=11.1.0
ENV HIGHCHARTS_USE_MAPS=0
ENV HIGHCHARTS_USE_GANTT=0
ENV HIGHCHARTS_CDN=npm


RUN apk update \
    && apk upgrade \
    && apk add --no-cache git patch

RUN npm install highcharts-export-server

EXPOSE 7801

ENTRYPOINT ["node", "node_modules/highcharts-export-server/bin/cli.js", "--enableServer", "1", "--port", "7801", "--logLevel", "3"]

The issue I have is that running the export server on container startup initiates the script download from the CDN. To avoid this in production, and to ensure we have exactly the same container running in each instance, what's the best approach to get this information downloaded and baked into the container image?

I've tried running cli.js directly in the docker build command, to no avail.

Is it possible to pre-fetch the scripts etc. during the build phase, or is it limited to the startup of the server itself?

@jszuminski
Contributor

Thanks for reporting!

If I understand you correctly, you would like an equivalent of the node build.js step that was available in the Phantom-based version of the Export Server.

We got rid of the build.js file and the separate build process, but once all the prioritized issues related to server health are taken care of, we'll consider adding this option back.

One last thing: could you please explain why you need to fetch all the scripts in a separate process? Do you plan to run node ./bin/cli.js --enableServer 1 ... often? If you plan to run it only once, isn't the node build.js unnecessary?

@davidseverwright

I currently start the export server every hour to send some reports, and then stop it again. The files are re-downloaded every time. The frequency of the container starting is irrelevant, though; it's not about the bandwidth.

  • It adds an additional, and unnecessary, point of failure. What if you change or delete those files? Or there's a network issue? The whole point of containerising things is to remove these external variables and get guaranteed consistency.
  • It also requires internet access. The export server shouldn't need internet access, and having to open a hole in the firewall just to repeatedly download the same file is silly. It's also about to become expensive on AWS once they start charging for public IPs.

@jszuminski
Contributor

Absolutely agreed, thanks for your thorough explanation!

We'll definitely add it to our backlog. I'll keep you posted here.

@danielbecroft
Author

Thanks @jakubSzuminski, my thoughts are the same as @davidseverwright's: undesirable firewall changes in a production environment, reproducibility of builds and deployments, and a dependency on an external resource at container startup.

@cvasseng
Contributor

cvasseng commented Feb 7, 2024

Revisiting this: We can't bundle the actual library due to licensing, and because different users need different versions of it (some may only have a license for V9, for instance, or some may want to lock it to a particular version for one reason or another).

That said, I understand your use case specifically, and why this poses a challenge there.

There are a couple of potential solutions we could implement fairly quickly:

  1. We could allow for overriding the CDN URL, so that the files could be hosted somewhere else (e.g. on an internal CDN).

  2. We could add a config option that allows loading the library cache from the filesystem instead of through a CDN, along with a simple bake tool (as part of the CLI, for instance) that performs the current CDN fetch into an arbitrary filesystem location given as input. You could then keep a pre-fetched cache alongside your Dockerfile, extract it into the container during the build, and add a configuration flag to the export server's startup to point it at the extracted cache.

Would either approach be suitable for you?

It's also possible that we could add Highcharts as an optional peer dependency, so that you could install Highcharts itself through npm in your Dockerfile and lock it to the version you require. However, we'd need to do some testing to confirm that approach (which is IMO arguably the best of the three, though it would take some more time to get up and running, provided it's feasible in general).
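
For reference, option 1 would presumably boil down to a one-line config change. A sketch: the highcharts.cdnURL key is the one that appears in the configs later in this thread, and the mirror URL here is purely illustrative:

{
  "highcharts": {
    "version": "11.1.0",
    "cdnURL": "https://highcharts-mirror.internal.example/"
  }
}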

@danielbecroft
Author

Hi @cvasseng, I think either approach would work fine for our scenario (option 2 or the npm peer dependency would be preferred). Our approach for option 2 would be to use a multi-stage build. Something like (untested):

FROM node AS base
ENV ACCEPT_HIGHCHARTS_LICENSE=1
ENV HIGHCHARTS_VERSION=11.1.0
ENV HIGHCHARTS_USE_MAPS=0
ENV HIGHCHARTS_USE_GANTT=0

RUN npm install highcharts-export-server

FROM base AS installer
RUN node node_modules/highcharts-export-server/bin/cli.js --download --cache-dir ./cache

FROM base
COPY --from=installer ./cache ./cache
ENTRYPOINT .....

@noxify

noxify commented Mar 8, 2024

It's also possible that we could add Highcharts as an optional peer dependency, so that you could install Highcharts itself through NPM in your dockerfile and lock that to the version you require.

I have used this approach together with a custom CDN.

In our case, we use the built-in express instance to create a new route to simulate the "CDN".
The endpoint reads the files from node_modules/highcharts.

// package.json
{
  "name": "chart-exporter",
  "type": "module",
  "scripts": {
    "dev": "node --watch src/server.js",
    "format": "prettier --check .",
    "format:fix": "npm run format -- --write",
    "lint": "eslint .",
    "lint:fix": "npm run lint -- --fix"
  },
  "dependencies": {
    "highcharts": "11.4.0",
    "highcharts-export-server": "3.1.1"
  },
  "devDependencies": {
    "@ianvs/prettier-plugin-sort-imports": "4.1.1",
    "@types/node": "^20.11.25",
    "eslint": "8.57.0",
    "eslint-plugin-import": "2.29.1",
    "prettier": "3.2.5"
  }
}
// src/server.js

import { readFileSync } from "fs"
import path from "path"
import exporter from "highcharts-export-server"

// https://github.com/highcharts/node-export-server?tab=readme-ov-file#default-json-config
const config = {
  puppeteer: {
    args: [],
  },
  highcharts: {
    version: "11.3.0",
    cdnURL: "http://localhost:8080/cdn/",
    forceFetch: false,
    coreScripts: ["highcharts"],
    modules: [
      "parallel-coordinates",
      "data",
      "static-scale",
      "broken-axis",
      "item-series",
      "pattern-fill",
      "series-label",
      "no-data-to-display",
    ],
    indicators: [],
    scripts: [],
  },
  export: {
    // your export options
  },
  customCode: {
    allowCodeExecution: false,
    allowFileResources: true,
    customCode: false,
    callback: false,
    resources: false,
    loadConfig: false,
    createConfig: false,
  },
  server: {
    // ... server config
  },
  pool: {
    // ... pool config
  },
  logging: {
    level: 2,
    file: "highcharts-export-server.log",
    dest: "log/",
  },
  ui: {
    enable: true,
    route: "/",
  },
  other: {
    noLogo: true,
  },
}

const main = async () => {
  exporter.server.get("/cdn/:version/:filename", (req, res) => {
    const filePath = path.join(
      path.resolve(),
      "node_modules/highcharts/",
      req.params.filename,
    )

    res.status(200).send(readFileSync(filePath))
  })

  // some modules are inside the `modules` directory
  // haven't found a way to solve this in one route
  exporter.server.get("/cdn/:version/modules/:filename", (req, res) => {
    const filePath = path.join(
      path.resolve(),
      "node_modules/highcharts/modules/",
      req.params.filename,
    )

    res.status(200).send(readFileSync(filePath))
  })

  exporter.setOptions(config, [])

  // we have to start the server before we initialize the pool
  // otherwise the local CDN endpoint isn't available 
  await exporter.startServer(config.server)

  await exporter.initPool(config)
}

void main()
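
On the two-routes comment in the code above: both handlers could presumably be collapsed into a single wildcard route. An untested sketch, assuming Express 4's wildcard syntax (where req.params[0] holds whatever * matched); the traversal guard is an extra precaution, not part of the original:

exporter.server.get("/cdn/:version/*", (req, res) => {
  const base = path.join(path.resolve(), "node_modules/highcharts/")

  // req.params[0] is e.g. "highcharts.js" or "modules/data.js"
  const filePath = path.join(base, req.params[0])

  // refuse requests that would resolve outside node_modules/highcharts
  if (!filePath.startsWith(base)) {
    return res.status(400).end()
  }

  res.status(200).send(readFileSync(filePath))
})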

We use this Dockerfile:


FROM node:20-alpine
ENV ACCEPT_HIGHCHARTS_LICENSE="YES"
ENV HIGHCHARTS_VERSION="11.3.0"
ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser

USER root
WORKDIR /app
COPY . .
RUN rm -rf node_modules/ \
    && rm -rf log/ \
    && rm -rf tmp/
RUN apk add --no-cache chromium nss freetype harfbuzz ca-certificates ttf-freefont dbus
RUN npm ci
RUN mkdir /var/run/dbus/ \
    && chmod -R 777 /var/run/dbus/ 

RUN chgrp -R 0 /app && \
    chmod -R g=u /app

EXPOSE 8080

USER 1000
HEALTHCHECK CMD /bin/true

CMD ["node", "src/server.js"]

@erlichmen

I wrote a small hack to handle this for now: a simple script that I run during docker build.
The script is at ./scripts/prepareCache.js, and in the Dockerfile I run:

RUN node ./scripts/prepareCache.js

const fs = require("fs");

// load checkCache from the export server's internals; the specifier is
// resolved relative to this script file (./scripts/prepareCache.js)
import("../node_modules/highcharts-export-server/lib/cache.js").then(({ checkCache }) => {
  // reuse the same highcharts section of the config that the server reads at startup
  const config = JSON.parse(fs.readFileSync("config.json").toString());
  checkCache(config.highcharts).catch((err) => {
    console.error(err);
  });
});

@bamarch

bamarch commented May 16, 2024

Another workaround is to start the server as part of the Docker build process and wait for the files to be downloaded into the cache.

# build / install / configure the server here

RUN ./scripts/runAndStopServerToPopulateCache.sh

# define ENTRYPOINT or CMD here

The script itself (./scripts/runAndStopServerToPopulateCache.sh):

#!/bin/sh
highcharts-export-server --enableServer "1" &
pid=$!
until test -f /path/to/manifest.json && test -f /path/to/sources.js; do
  sleep 1
done
kill $pid

This has a slight advantage, since the CLI interface is more stable than the JS internals.

@level420
Contributor

level420 commented Jul 31, 2024

@bamarch I've tried your workaround, but the previous Puppeteer run during the image build leaves the Chromium profile locked. See:

Wed Jul 31 2024 16:24:45 GMT+0000 [error] - [browser] Failed to launch a browser instance. 
 Error: Failed to launch the browser process! undefined
[16:16:0731/162445.179268:ERROR:process_singleton_posix.cc(353)] The profile appears 
to be in use by another Chromium process (20) on another computer (buildkitsandbox). 
Chromium has locked the profile so that it doesn't get corrupted. If you are sure no other 
processes are using this profile, you can unlock the profile and relaunch Chromium.

Did you find a solution to this?

@level420
Contributor

level420 commented Aug 1, 2024

The solution is to run @bamarch's script as a different user, not the user that runs the node-export-server. The locked profile is then created for that user, and it no longer affects the user that is active when the container is running. In my Dockerfile I run the script as root, chown the complete node-export-server directory to the node user, and afterwards set the user via USER node.
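
In Dockerfile terms the ordering is roughly the following (a sketch under the assumptions above; the install path is illustrative):

# populate the cache (and create Chromium's profile lock) while still running as root
RUN ./scripts/runAndStopServerToPopulateCache.sh
# hand the installation over to the runtime user
RUN chown -R node:node /home/node-export-server
USER node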

@level420
Contributor

level420 commented Aug 1, 2024

@cvasseng one more obstacle to creating a completely self-contained Docker image is using older Highcharts versions with node-export-server, because older versions do not offer all the modules expected to be available as documented in manifest.json. See:

} else if (Object.keys(manifest.modules || {}).length !== numberOfModules) {

I've successfully integrated @bamarch's script in my Dockerfile, and all the sources needed for that specific old Highcharts version are downloaded during image creation. But when the container starts, the export server re-fetches all the sources again because of the mismatch with the modules available in the cache.

In this situation we'd need a command-line switch that is the opposite of HIGHCHARTS_FORCE_FETCH, e.g. HIGHCHARTS_PREVENT_FETCH or similar, which completely disables fetching and re-fetching.

@level420
Contributor

level420 commented Aug 1, 2024

ATM I'm overriding lib/cache.js with my own modified version, where I brute-force stop the cache update by setting

requestUpdate = false;

before

if (requestUpdate) {
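
Schematically, the modified region of the overridden lib/cache.js then looks like this (the surrounding fetch logic is elided; the forced assignment is the only change):

// ... the logic above has decided whether the cache needs a re-fetch ...
requestUpdate = false; // brute-force override: never re-fetch the baked cache
if (requestUpdate) {
  // ... the CDN fetch would run here; now unreachable ...
}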

@bamarch

bamarch commented Feb 19, 2025

The default CDN has started rate-limiting more strictly (see #633).

Using the approach from noxify above avoids that problem, because it bypasses the default CDN entirely.

So I would recommend the "local CDN backed by the highcharts package from npm" approach.
