Commit c71d9f8

markup: add --citeproc to pandoc converter
Adds the citeproc filter to the pandoc converter if pandoc >= 2.11 is available.

There are several PRs for this feature already. However, I think simply adding `--citeproc` is the cleanest way to enable it, with the option to flesh it out later, e.g., in gohugoio#7529.

Some PRs and issues attempt to add more config options to Hugo that indirectly configure pandoc, but I think configuring Pandoc via Pandoc itself is simpler, as it is already possible with two YAML blocks, one for Hugo and one for Pandoc:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    ...
    Document content with @citation!

There are other useful options, e.g., gohugoio#4800 attempts to use `nocite`, which works out of the box with this PR:

    ---
    title: This is the Hugo YAML block
    ---
    ---
    bibliography: assets/pandoc-yaml-block-bibliography.bib
    nocite: |
      @*
    ...
    Document content with no citations but a full bibliography:
    ## Bibliography

Other useful options are `csl: ...` and `link-citations: true`, which set the path to a custom CSL file and create HTML links between the references and the bibliography.

The following issues and PRs are related:

- Add support for parsing citations and Jupyter notebooks via Pandoc and/or Goldmark extension gohugoio#6101
  Bundles multiple requests; this PR tackles citation parsing.
- WIP: Bibliography with Pandoc gohugoio#4800
  Passes the frontmatter to Pandoc and still uses `--filter pandoc-citeproc` instead of `--citeproc`.
- Allow configuring Pandoc gohugoio#7529
  That PR is much more extensive and might eventually supersede this PR, but I think --bibliography and --citeproc should be independent options (--bibliography should be optional and citeproc can always be specified).
- Pandoc - allow citeproc extension to be invoked, with bibliography. gohugoio#8610
  Similar to gohugoio#7529, gohugoio#8610 adds a new config option to Hugo.
  I think passing --citeproc and letting the users decide on the metadata they want to pass to pandoc is better, albeit uglier.
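
The version gate mentioned at the top of this message ("if pandoc >= 2.11 is available") can be sketched in isolation as follows. This is a minimal illustration, not the code from this commit: `atLeast` and `pandocArgs` are hypothetical helper names, and the sketch assumes the version string has already been extracted from `pandoc --version`. It compares the components numerically, because a plain string comparison would rank "2.9" above "2.11".

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// atLeast reports whether a dotted version string such as "2.11.4" is greater
// than or equal to wantMajor.wantMinor. Hypothetical helper, not part of Hugo.
func atLeast(version string, wantMajor, wantMinor int) (bool, error) {
	parts := strings.SplitN(strings.TrimSpace(version), ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected version %q", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("bad major version in %q: %w", version, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false, fmt.Errorf("bad minor version in %q: %w", version, err)
	}
	return major > wantMajor || (major == wantMajor && minor >= wantMinor), nil
}

// pandocArgs returns the default arguments and adds --citeproc only when the
// given pandoc version is at least 2.11. Hypothetical helper for illustration.
func pandocArgs(version string) []string {
	args := []string{"--mathjax"}
	if ok, err := atLeast(version, 2, 11); err == nil && ok {
		args = append(args, "--citeproc")
	}
	return args
}

func main() {
	for _, v := range []string{"2.9.2", "2.11", "3.1.2"} {
		fmt.Println(v, pandocArgs(v))
	}
}
```

An unparseable version is treated as "no citation support" here, so older or unusual pandoc builds still get the plain `--mathjax` invocation.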
1 parent 69f7c73 commit c71d9f8

4 files changed, +168 -11 lines changed

docs/content/en/content-management/formats.md

+6 -4
@@ -48,7 +48,7 @@ Hugo passes reasonable default arguments to these external helpers by default:
 
 - `asciidoctor`: `--no-header-footer -`
 - `rst2html`: `--leave-comments --initial-header-level=2`
-- `pandoc`: `--mathjax --citeproc`
+- `pandoc`: `--mathjax` and, for pandoc >= 2.11, `--citeproc`
 
 {{% warning "Performance of External Helpers" %}}
 Because additional formats are external commands, generation performance will rely heavily on the performance of the external tool you are using. As this feature is still in its infancy, feedback is welcome.
@@ -137,7 +137,8 @@ This will render in your HTML as:
 ```
 You will have to [add MathJax](https://www.mathjax.org/#gettingstarted) to your template to properly render the math.
 
-Additionally, Pandoc enables [citations](https://pandoc.org/MANUAL.html#extension-citations) using, e.g., [BibTeX files](https://en.wikibooks.org/wiki/LaTeX/Bibliography_Management#BibTeX):
+For **Pandoc >= 2.11**, you can use [citations](https://pandoc.org/MANUAL.html#extension-citations).
+One way is to employ [BibTeX files](https://en.wikibooks.org/wiki/LaTeX/Bibliography_Management#BibTeX) to cite:
 
 ```
 ---
@@ -149,9 +150,10 @@ bibliography: assets/bibliography.bib
 This is a citation: @Doe2022
 ```
 
-Note that Hugo will **not** pass its metadata YAML block to Pandoc; however, it will pass the **second** meta data block, denoted with `---` and `...` to Pandoc. Thus, all pandoc settings should go there.
+Note that Hugo will **not** pass its metadata YAML block to Pandoc; however, it will pass the **second** meta data block, denoted with `---` and `...` to Pandoc.
+Thus, all Pandoc settings should go there.
 
-You can also add all elements from a bibliography file (without citing them first) using:
+You can also add all elements from a bibliography file (without citing them explicitly) using:
 
 ```
 ---
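
The two-block layout documented above can be illustrated with a small sketch: the first YAML block is consumed by Hugo, and everything after it, including the second block that ends in `...`, is what the external converter receives. This is a simplified stand-in under stated assumptions, not Hugo's actual front matter parser. `splitFrontMatter` is a hypothetical name, and only `---` delimiters are handled, whereas Hugo also accepts TOML and JSON front matter.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontMatter separates a leading YAML block delimited by "---" lines
// from the rest of the document. Hypothetical helper for illustration only.
func splitFrontMatter(src string) (frontMatter, rest string) {
	lines := strings.Split(src, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return "", src // no front matter: everything is content
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimSpace(lines[i]) == "---" {
			return strings.Join(lines[1:i], "\n"), strings.Join(lines[i+1:], "\n")
		}
	}
	return "", src // no closing delimiter: treat everything as content
}

func main() {
	doc := "---\n" +
		"title: This is the Hugo YAML block\n" +
		"---\n" +
		"---\n" +
		"bibliography: assets/bibliography.bib\n" +
		"...\n" +
		"This is a citation: @Doe2022\n"

	hugoMeta, forPandoc := splitFrontMatter(doc)
	fmt.Printf("Hugo metadata:\n%s\n\nPassed to pandoc:\n%s", hugoMeta, forPandoc)
}
```

Because the second block survives the split untouched, settings such as `bibliography`, `nocite`, `csl`, and `link-citations` reach pandoc without any additional Hugo configuration.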

markup/pandoc/convert.go

+56 -3
@@ -15,12 +15,17 @@
 package pandoc
 
 import (
+	"bytes"
+	"fmt"
+	"strings"
+	"sync"
+
+	"github.com/gohugoio/hugo/common/collections"
 	"github.com/gohugoio/hugo/common/hexec"
 	"github.com/gohugoio/hugo/htesting"
 	"github.com/gohugoio/hugo/identity"
-	"github.com/gohugoio/hugo/markup/internal"
-
 	"github.com/gohugoio/hugo/markup/converter"
+	"github.com/gohugoio/hugo/markup/internal"
 )
 
 // Provider is the package entry point.
@@ -64,7 +69,10 @@ func (c *pandocConverter) getPandocContent(src []byte, ctx converter.DocumentCon
 			" Leaving pandoc content unrendered.")
 		return src, nil
 	}
-	args := []string{"--mathjax", "--citeproc"}
+	args := []string{"--mathjax"}
+	if supportsCitations(c.cfg) {
+		args = append(args, "--citeproc")
+	}
 	return internal.ExternallyRenderContent(c.cfg, ctx, src, binaryName, args)
 }
 
@@ -77,6 +85,51 @@ func getPandocBinaryName() string {
 	return ""
 }
 
+var versionOnce sync.Once
+var cachedVersion string
+var cachedVersionErr error
+
+// getPandocVersion runs "pandoc --version" once and caches the reported version.
+func getPandocVersion(cfg converter.ProviderConfig) (string, error) {
+	versionOnce.Do(func() {
+		var out bytes.Buffer
+		argsv := collections.StringSliceToInterfaceSlice([]string{"--version"})
+		argsv = append(argsv, hexec.WithStdout(&out))
+		cmd, err := cfg.Exec.New(pandocBinary, argsv...)
+		if err == nil {
+			err = cmd.Run()
+		}
+		if err != nil {
+			cfg.Logger.Errorf("%s --version: %v", pandocBinary, err)
+			cachedVersionErr = err
+			return
+		}
+
+		// The output starts with "pandoc <version>", e.g. "pandoc 2.11.4".
+		if fields := strings.Fields(out.String()); len(fields) > 1 {
+			cachedVersion = fields[1]
+		}
+	})
+	return cachedVersion, cachedVersionErr
+}
+
+// supportsCitations returns true for pandoc versions >= 2.11, which include citeproc.
+func supportsCitations(cfg converter.ProviderConfig) bool {
+	var major, minor int
+	pandocVersion, err := getPandocVersion(cfg)
+	if err == nil { // compare numerically: as strings, "2.9" >= "2.11"
+		_, err = fmt.Sscanf(pandocVersion, "%d.%d", &major, &minor)
+	}
+	supportsCitations := err == nil && (major > 2 || (major == 2 && minor >= 11))
+	if htesting.SupportsAll() {
+		if !supportsCitations {
+			panic(fmt.Sprintf("pandoc %s does not support citations", pandocVersion))
+		}
+		return true
+	}
+	return supportsCitations
+}
+
 // Supports returns whether Pandoc is installed on this computer.
 func Supports() bool {
 	hasBin := getPandocBinaryName() != ""

markup/pandoc/convert_test.go

+100 -4
@@ -25,18 +25,114 @@ import (
 	qt "github.com/frankban/quicktest"
 )
 
-func TestConvert(t *testing.T) {
+func setupTestConverter(t *testing.T) (*qt.C, converter.Converter, converter.ProviderConfig) {
 	if !Supports() {
 		t.Skip("pandoc not installed")
 	}
 	c := qt.New(t)
 	sc := security.DefaultConfig
 	sc.Exec.Allow = security.NewWhitelist("pandoc")
-	p, err := Provider.New(converter.ProviderConfig{Exec: hexec.New(sc), Logger: loggers.NewErrorLogger()})
+	cfg := converter.ProviderConfig{Exec: hexec.New(sc), Logger: loggers.NewErrorLogger()}
+	p, err := Provider.New(cfg)
 	c.Assert(err, qt.IsNil)
 	conv, err := p.New(converter.DocumentContext{})
 	c.Assert(err, qt.IsNil)
-	b, err := conv.Convert(converter.RenderContext{Src: []byte("testContent")})
+	return c, conv, cfg
+}
+
+func TestConvert(t *testing.T) {
+	c, conv, _ := setupTestConverter(t)
+	output, err := conv.Convert(converter.RenderContext{Src: []byte("testContent")})
 	c.Assert(err, qt.IsNil)
-	c.Assert(string(b.Bytes()), qt.Equals, "<p>testContent</p>\n")
+	c.Assert(string(output.Bytes()), qt.Equals, "<p>testContent</p>\n")
+}
+
+func runCiteprocTest(t *testing.T, content string, expected string) {
+	c, conv, cfg := setupTestConverter(t)
+	if !supportsCitations(cfg) {
+		t.Skip("pandoc does not support citations")
+	}
+	output, err := conv.Convert(converter.RenderContext{Src: []byte(content)})
+	c.Assert(err, qt.IsNil)
+	c.Assert(string(output.Bytes()), qt.Equals, expected)
+}
+
+func TestCiteprocWithHugoMeta(t *testing.T) {
+	content := `
+---
+title: Test
+published: 2022-05-30
+---
+testContent
+`
+	expected := "<p>testContent</p>\n"
+	runCiteprocTest(t, content, expected)
+}
+
+func TestCiteprocWithPandocMeta(t *testing.T) {
+	content := `
+---
+---
+---
+...
+testContent
+`
+	expected := "<p>testContent</p>\n"
+	runCiteprocTest(t, content, expected)
+}
+
+func TestCiteprocWithBibliography(t *testing.T) {
+	content := `
+---
+---
+---
+bibliography: testdata/bibliography.bib
+...
+testContent
+`
+	expected := "<p>testContent</p>\n"
+	runCiteprocTest(t, content, expected)
+}
+
+func TestCiteprocWithExplicitCitation(t *testing.T) {
+	content := `
+---
+---
+---
+bibliography: testdata/bibliography.bib
+...
+@Doe2022
+`
+	expected := `<p><span class="citation" data-cites="Doe2022">Doe and Mustermann
+(2022)</span></p>
+<div id="refs" class="references csl-bib-body hanging-indent"
+role="doc-bibliography">
+<div id="ref-Doe2022" class="csl-entry" role="doc-biblioentry">
+Doe, Jane, and Max Mustermann. 2022. <span>“A Treatise on Hugo
+Tests.”</span> <em>Hugo Websites</em>.
+</div>
+</div>
+`
+	runCiteprocTest(t, content, expected)
+}
+
+func TestCiteprocWithNocite(t *testing.T) {
+	content := `
+---
+---
+---
+bibliography: testdata/bibliography.bib
+nocite: |
+  @*
+...
+`
+	expected := `<div id="refs" class="references csl-bib-body hanging-indent"
+role="doc-bibliography">
+<div id="ref-Doe2022" class="csl-entry" role="doc-biblioentry">
+Doe, Jane, and Max Mustermann. 2022. <span>“A Treatise on Hugo
+Tests.”</span> <em>Hugo Websites</em>.
+</div>
+</div>
+`
+	runCiteprocTest(t, content, expected)
 }

+6
@@ -0,0 +1,6 @@
+@article{Doe2022,
+  author = "Jane Doe and Max Mustermann",
+  title = "A Treatise on Hugo Tests",
+  journal = "Hugo Websites",
+  year = "2022",
+}
