Switch text layout to line-level shaping for accurate glyph positioning #75

@SimonCropp

Problem

TextRenderer.LayoutParagraphWithWidth (both Skia and ImageSharp) measures text at the word level: each run's text is split via SplitIntoWords into individual word-or-whitespace fragments, and each is measured independently with font.MeasureText(word). Fragment widths are summed to get the line width, and fragments are rendered sequentially at currentX += fragment.Width.

This loses cross-word shaping — kerning pairs and bearing adjustments that a shaper would apply to "abc def" as a single run are gone once it becomes ["abc", " ", "def"]. Word measurement also diverges further when the source document stores adjacent characters as separate runs (e.g. cover-letters/02's contact line has 9 runs: "+91 915 5894669", " ", "|", " ", "www.interestingsite.com", " ", "|", " ", "manasi@example.com").
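The effect can be reproduced with a toy model. The sketch below is written in Python purely for brevity and does not use Skia or ImageSharp; the advance and kerning tables are hypothetical values chosen for illustration, not measurements from any real font. It only shows that summing isolated fragment widths drops the adjustments that sit on fragment boundaries.

```python
# Toy font model: every value here is hypothetical, chosen only for illustration.
ADVANCE = {"a": 10.0, "b": 10.0, " ": 5.0, "|": 4.0}
# Kerning adjustments between adjacent glyph pairs (negative values tighten).
KERNING = {("b", " "): -1.0, (" ", "|"): -2.0}


def shaped_width(text: str) -> float:
    """Width of text measured as one run, including kerning between neighbours."""
    width = sum(ADVANCE[ch] for ch in text)
    width += sum(KERNING.get(pair, 0.0) for pair in zip(text, text[1:]))
    return width


def fragment_sum_width(fragments: list[str]) -> float:
    """Width obtained by measuring each fragment alone and summing the results."""
    return sum(shaped_width(fragment) for fragment in fragments)


line = "ab |"
fragments = ["ab", " ", "|"]
# Both kerning pairs straddle a fragment boundary, so the per-fragment sum misses them.
drift = fragment_sum_width(fragments) - shaped_width(line)
print(drift)  # 3.0
```

In this model the whole line measures 26.0 while the fragments sum to 29.0, so every boundary that would have been tightened contributes to the drift.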

Symptom

Visible extra whitespace before specific characters — most obvious around | separators. Measured pixel width of the contact-info line in cover-letters/02:

  • Expected: 616px total, ~14px gap before each |
  • Skia render: 633px total, 27px gap before the first |, 18px before the second

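The drift is non-uniform: it accumulates at fragment boundaries where shaping would normally tighten positions. Here the two gaps before the | separators carry roughly 13px and 4px of excess, which together account for the full 17px overshoot (633px versus 616px).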

Proper fix

Switch the layout pipeline to shape entire lines:

  • Skia: use SKShaper (via SkiaSharp.HarfBuzz) to shape the full line's glyphs and pull per-character advances. Keep fragment identity (for wrapping, styling, underline etc.) but source their widths from the shaped glyph advances instead of font.MeasureText on isolated words.
  • ImageSharp: TextMeasurer.MeasureAdvance already handles shaping when called on the full string. Pre-measure each run's full text in context, then split the measured result into word slices using glyph clusters/char indices.

Line wrapping needs to still operate on word boundaries, but each word's width comes from its position inside the shaped line, not an isolated measurement.
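A minimal sketch of that slicing step, in Python for brevity rather than in the renderer's own C#. It assumes the shaper yields one advance per character of the line, which is a simplification: real shapers report glyph clusters, and a ligature or cluster spanning several characters would need extra handling. The names `split_into_words` and `fragment_widths` are hypothetical stand-ins, not the project's methods, and the advances in the usage lines are made-up values.

```python
import re


def split_into_words(text: str) -> list[str]:
    """Split into alternating word and whitespace fragments (stand-in for SplitIntoWords)."""
    return re.findall(r"\s+|\S+", text)


def fragment_widths(text: str, advances: list[float]) -> list[tuple[str, float]]:
    """Give each fragment the sum of the shaped advances of its characters.

    advances[i] is assumed to be the advance of text[i] as reported by shaping
    the whole line, so cross-word kerning is already folded in.
    """
    if len(advances) != len(text):
        raise ValueError("expected one advance per character")
    result = []
    start = 0
    for fragment in split_into_words(text):
        end = start + len(fragment)
        result.append((fragment, sum(advances[start:end])))
        start = end
    return result


# Hypothetical shaped advances for the line "ab |".
widths = fragment_widths("ab |", [10.0, 9.0, 3.0, 4.0])
print(widths)  # [('ab', 19.0), (' ', 3.0), ('|', 4.0)]
```

Fragments keep their identity for wrapping and styling, and their widths now sum to the shaped line width by construction.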

Why this is a standalone task

An earlier attempt to retrofit shaping via post-layout fragment merge caused 158 regressions out of 280 scenarios. Every existing results_{skia,imagesharp}#page_*.verified.png was generated against the old word-by-word measurement, so any measurement change — even one that's objectively closer to Word's output — shifts pixel positions enough to fail the ErrorMetric.Absolute threshold on scenarios that were previously passing.

That means this fix must be committed together with fresh baselines for all affected scenarios.

Estimate

  • Skia implementation (swap to SKShaper, rewire LayoutParagraphWithWidth / LayoutParagraph / LayoutParagraphForMeasurement, update RenderFragment to draw via shaped glyphs or sub-strings): ~1 day
  • ImageSharp implementation (mirror of above using TextMeasurer): ~0.5 day
  • Justified-text gap handling (extraSpacePerGap in RenderParagraph) needs re-derivation from shaped advances: ~0.5 day
  • Table cell layout (RenderParagraphInBounds uses the same path), verify and fix: ~0.5 day
  • Re-baselining ~280 scenarios (force-accept, visually spot-check the categories such as cover-letters, resumes and business-plans, then compare error metrics against expected to confirm the new baselines are not worse overall): ~1 day
  • Regression chase for anything that breaks in non-obvious ways (RTL, bullets, soft hyphens, inline images mixed with text): ~0.5–1 day
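For the justified-text item above, one plausible re-derivation is sketched below in Python. It assumes that gaps are the whitespace fragments of the line and that the leftover width is shared equally between them; how RenderParagraph actually counts gaps is not stated here, so treat `extra_space_per_gap` as a hypothetical helper, not the existing code. The widths in the usage lines are made up.

```python
def extra_space_per_gap(available_width: float,
                        fragments: list[tuple[str, float]]) -> float:
    """Extra width to add to each whitespace gap so the line fills available_width.

    fragments holds (text, shaped_width) pairs whose widths come from the
    shaped line, so the natural width already reflects cross-word kerning.
    """
    natural_width = sum(width for _, width in fragments)
    gap_count = sum(1 for text, _ in fragments if text.isspace())
    if gap_count == 0:
        return 0.0  # nothing to stretch on a single-word line
    # Never shrink gaps in this sketch; an overfull line gets no extra space.
    return max(0.0, available_width - natural_width) / gap_count


# Hypothetical shaped widths for a five-fragment line.
line_fragments = [("ab", 19.0), (" ", 3.0), ("cd", 20.0), (" ", 3.0), ("|", 4.0)]
print(extra_space_per_gap(60.0, line_fragments))  # 5.5
```

Because the natural width is taken from shaped advances, the extra space shrinks or grows relative to the old word-by-word numbers, which is one reason justified lines will shift in the baselines.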

Total: ~4 to 4.5 days of focused work, landed as a single commit to keep the baseline refresh atomic.

Affected files (primary)

  • src/Morph.Skia/Rendering/TextRenderer.cs: LayoutParagraphWithWidth, LayoutParagraph, LayoutParagraphForMeasurement, RenderParagraph, RenderFragment, SplitIntoWords
  • src/Morph.ImageSharp/Rendering/TextRenderer.cs: same set of methods
  • src/Morph.Skia/Rendering/RenderContext.cs: font creation (may need shaper handle)
  • src/Morph.ImageSharp/Rendering/RenderContext.cs: MeasureText helper
  • src/Tests/Inputs/**/results_*.verified.png: baselines to refresh
