tex2svg-page: error handling non-math symbols #16
Comments
Thanks for the report. I've looked into it, and here is what is going on.

Because the number of entities (like &copy;) is so large, MathJax does not define them all up front; the less common ones are loaded from separate files the first time they are needed. In a browser, this loading is asynchronous, so it involves some careful hand-shaking to manage the delay while waiting for the file to arrive, and the code surrounding the call to translate the entity must be aware that the delay can occur (and so must any code surrounding that, and so on).

In node applications, since there is no browser DOM, MathJax uses a light-weight implementation of the browser DOM called the LiteDOM. When MathJax tries to determine which handler to use for a particular document, it asks the handlers it knows about whether they can handle the document. The LiteDOM handler tries to parse the document, and if it can, MathJax uses that handler. But while parsing a document containing &copy;, the entity definition has to be loaded, and the code that checks the handlers is not prepared for that delay, which produces the error you are seeing.

That certainly is a bug, and I will have to think about how best to handle it. In the meantime, one way to work around the problem is to load all the entities up front, so that the definition of &copy; is already available when the document is parsed. To do that, add

require('mathjax-full/js/util/entities/all.js');

to the tex2svg-page script.

This does lead to a second issue, however: the output of the program will no longer have the entity, but will have the actual unicode character instead. This is because, just as in a browser, the LiteDOM translates entities to characters while it is parsing the file. This is necessary because you could have something like […] in your file, and the […] and the […].

If you want to have entities in the output rather than the characters, you can add

function toEntity(c) {
  return '&#x' + c.charCodeAt(0).toString(16).toUpperCase() + ';';
}
const LiteParser = require('mathjax-full/js/adaptors/lite/Parser.js').LiteParser;
LiteParser.prototype.protectHTML = function (text) {
  return text.replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/[^\u0000-\u007E]/g, toEntity);
}

just after the require() calls in the script.

To get named entities, you can do something like this:

const entityName = {
  0xA9: 'copy'
};
function toEntity(c) {
  const n = c.charCodeAt(0);
  return '&' + (entityName[n] || '#x' + n.toString(16).toUpperCase()) + ';';
}
const LiteParser = require('mathjax-full/js/adaptors/lite/Parser.js').LiteParser;
LiteParser.prototype.protectHTML = function (text) {
  return text.replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/[^\u0000-\u007E]/g, toEntity);
}

where you list the entities that you want to turn back into named entities. It would also be possible to generate the entityName table from MathJax's own entity list:

const entities = require('mathjax-full/js/util/Entities.js').entities;
const entityName = {};
Object.keys(entities).forEach((name) => entityName[entities[name].codePointAt(0)] = name);

rather than giving the list yourself.

If you want to include (unencoded) unicode in your document, then you might need to adjust the regex in the last replace() call.

In any case, I think you can get the results you want this way.
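As an illustration of the regex adjustment mentioned in this comment, here is a minimal stand-alone sketch of the same escaping logic with no MathJax dependency. The names makeProtectHTML and keep are hypothetical and not part of MathJax; the sketch also assumes you want characters outside the Basic Multilingual Plane encoded as a single entity, which is why it uses codePointAt and the u flag rather than charCodeAt.

```javascript
// Sketch only: makeProtectHTML and keep are hypothetical names, not MathJax API.
// entityName maps code points to entity names; keep lists characters that
// should stay unencoded in the output.
function makeProtectHTML(entityName = {}, keep = '') {
  // Encode one character (possibly a surrogate pair) as a named or hex entity.
  function toEntity(c) {
    const n = c.codePointAt(0);
    return '&' + (entityName[n] || '#x' + n.toString(16).toUpperCase()) + ';';
  }
  return function protectHTML(text) {
    return text.replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
      // The u flag makes the character class match whole code points, so a
      // character outside the BMP is encoded once, not as two surrogates.
      .replace(/[^\u0000-\u007E]/gu, (c) => keep.includes(c) ? c : toEntity(c));
  };
}

// Encode everything non-ASCII except "é"; use a name for the copyright sign.
const protect = makeProtectHTML({0xA9: 'copy'}, 'é');
console.log(protect('© 2018 <café> & 𝒜'));
// prints "&copy; 2018 &lt;café&gt; &amp; &#x1D49C;"
```

If this behaves as you want, the returned function could presumably be assigned to LiteParser.prototype.protectHTML in the same way as the override in this comment, though that combination is untested here.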
Thanks for the detailed response! I am happy to allow unicode characters in the output, so the replacement code is not necessary for me. However, I have tried the suggestion of adding the require line to tex2svg-page, and I still get an error.

tex2svg-page modified code

#! /usr/bin/env -S node -r esm
/*************************************************************************
*
* direct/tex2svg-page
*
* Uses MathJax v3 to convert all TeX in an HTML document.
*
* ----------------------------------------------------------------------
*
* Copyright (c) 2018 The MathJax Consortium
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
//
// Load the packages needed for MathJax
//
const mathjax = require('mathjax-full/js/mathjax.js').mathjax;
const TeX = require('mathjax-full/js/input/tex.js').TeX;
const SVG = require('mathjax-full/js/output/svg.js').SVG;
const liteAdaptor = require('mathjax-full/js/adaptors/liteAdaptor.js').liteAdaptor;
const RegisterHTMLHandler = require('mathjax-full/js/handlers/html.js').RegisterHTMLHandler;
const AllPackages = require('mathjax-full/js/input/tex/AllPackages.js').AllPackages;
require('mathjax-full/js/util/asyncLoad/node.js');
//
// Get the command-line arguments
//
var argv = require('yargs')
  .demand(1).strict()
  .usage('$0 [options] file.html > converted.html')
  .options({
    em: {
      default: 16,
      describe: 'em-size in pixels'
    },
    ex: {
      default: 8,
      describe: 'ex-size in pixels'
    },
    packages: {
      default: AllPackages.sort().join(', '),
      describe: 'the packages to use, e.g. "base, ams"'
    },
    fontCache: {
      default: 'global',
      describe: 'cache type: local, global, none'
    }
  })
  .argv;
//
// Read the HTML file
//
const htmlfile = require('fs').readFileSync(argv._[0], 'utf8');
//
// Create DOM adaptor and register it for HTML documents
//
const adaptor = liteAdaptor({fontSize: argv.em});
RegisterHTMLHandler(adaptor);
//
// Create input and output jax and a document using them on the content from the HTML file
//
const tex = new TeX({packages: argv.packages.split(/\s*,\s*/)});
const svg = new SVG({fontCache: argv.fontCache, exFactor: argv.ex / argv.em});
const html = mathjax.document(htmlfile, {InputJax: tex, OutputJax: svg});
//
// Typeset the document
//
html.render();
//
// Output the resulting HTML
//
console.log(adaptor.outerHTML(adaptor.root(html.document)));

The error message is: […]

Unfortunately, I am not too familiar with the MathJax code. Is something more than the require needed?
Sorry, my fault. I copied the wrong line. It should have been

require('mathjax-full/js/util/entities/all.js');

I will change it in the original message, in case anyone else looks for the solution here.
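For readers wondering why loading every entity up front sidesteps the error, the on-demand loading described earlier in the thread can be illustrated with a small stand-alone sketch. None of the names below (table, extra, loadExtra, translate, decode, withRetries) are MathJax's; they are hypothetical stand-ins, and the real implementation differs in detail. The point is only that a caller which is unaware of the delay sees the "retry" error directly, while preloading the table means the error is never raised.

```javascript
// Generic illustration of load-on-demand with retries; not MathJax's API.
const table = {amp: '&', lt: '<', gt: '>'};  // small built-in table
const extra = {copy: '\u00A9'};              // stands in for a file loaded later

let loading = null;  // promise for the pending load, once one has started
let loaded = false;  // true once the extra definitions have arrived

// Pretend to fetch the rest of the table asynchronously.
function loadExtra() {
  return new Promise((resolve) => setTimeout(() => {
    Object.assign(table, extra);
    loaded = true;
    resolve();
  }, 0));
}

// Translate one entity name; if it is not known yet, start the load and
// throw an error that carries the promise to wait for.
function translate(name) {
  if (name in table) return table[name];
  if (loaded) return '&' + name + ';';  // unknown even after loading
  if (!loading) loading = loadExtra();
  const err = new Error('retry: entity table not loaded yet');
  err.retry = loading;
  throw err;
}

// Replace every named entity in the text.
function decode(text) {
  return text.replace(/&(\w+);/g, (match, name) => translate(name));
}

// A caller that knows about the delay: wait for the load, then run again.
async function withRetries(fn) {
  for (;;) {
    try {
      return fn();
    } catch (err) {
      if (!err.retry) throw err;
      await err.retry;
    }
  }
}

withRetries(() => decode('&copy; &lt;')).then((s) => console.log(s));
// prints "© <"
```

Calling decode('&copy;') directly, without withRetries, throws the retry error on the first attempt. Copying extra into table before the first call corresponds to the workaround of requiring entities/all.js up front.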
Wonderful! This worked great! One little annoyance that was easy to work around is that […]. For my purposes, this issue is solved. I will leave it open since, as I understand it, it does expose a bug, but feel free to close it once it is no longer useful. Thank you very much for your help!
Thanks for confirming that it worked for you. Good luck with your project.
Running tex2svg-page on the following html file succeeds: […]

However, once you add the copyright symbol anywhere in the page, it fails with the following error: […]

This behavior doesn't change even if the symbol is placed inside a footer tag and the footer tag is added to ignore in the MathJax configuration. If my understanding here is correct, this suggests two possible issues: […]

My use case is pre-rendering math for a static website. I'm specifically looking for a script that will output an html file with math rendered in the same way as if the MathJax javascript were included on the page, and tex2svg-page seemed like a perfect fit here. I'd really appreciate any guidance you might have on these issues.