diff --git a/CHANGELOG.md b/CHANGELOG.md
index 1e63a220..e9a6fa5f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,7 @@ All notable changes to PDFOxide are documented here.
 
 ### Fixed
 
+- **Cross-document font cache now keys Type0/Identity-H fonts by `/ToUnicode` content (extends #595, #597, #598)** — the cross-document cache hardening in #595/#597/#598 folds the `/ToUnicode` *reference* (object id/gen) into the font identity hash and keeps *canonical* subset fonts (`AAAAAA+`) out of the shared cache. This extends that coverage to two cases the reference-based key doesn't reach: a non-canonical subset tag such as `/CIDFont+F1` (emitted by some generators) stays eligible for cross-document sharing, and PDFs produced from a common template reuse the same `/ToUnicode` object number — so two genuinely different fonts that merely share a `/BaseFont` name produced an identical key. Processed in one long-lived process, a later document was then served an earlier font's parsed `FontInfo` and its glyphs decoded through the wrong `/ToUnicode` — a constant-offset cipher (`SUMMARY` → `6800$5<`) or control/PUA characters — though each document extracted correctly in isolation. The identity hash now folds the `/ToUnicode` stream *bytes*, the embedded `/FontFile{,2,3}` bytes, the descendant `/Subtype`, and a stream-form descendant `/CIDToGIDMap`, so same-named-but-different fonts get distinct keys regardless of subset-tag form or object reuse, while genuinely identical fonts still deduplicate across documents (the cache's purpose is preserved). Same bug class as the `/Widths` poisoning fixed in #598.
 - **Watermark annotations rendered as nothing in compliant viewers** — a watermark's `/AP` appearance was serialized as a stream nested *directly* inside the annotation dictionary (`/AP <</N <<…>> stream … endstream>>`). A PDF stream must be an indirect object (ISO 32000-1:2008 §7.3.8); the inline form is invalid, so spec-compliant readers (e.g. MuPDF/PyMuPDF) rejected the annotation with "invalid key in dict" and the watermark never appeared — even though the bytes were present in the file. A shared `hoist_appearance_streams` helper now lifts nested `/N`, `/D`, and `/R` appearance streams (including named-state sub-dictionaries) into freshly allocated indirect objects and replaces the slot with a reference, applied on both the `DocumentBuilder` writer and the existing-page `DocumentEditor::save_page` paths. Verified end-to-end with MuPDF: the watermark now parses and renders on both paths.
 - **Fixed Python type stubs leaking the pyo3 `Py<Self>` receiver as a positional parameter** — methods implemented in Rust with a by-value receiver (`fn page(slf_handle: Py<Self>, …)` — the idiom pyo3 uses to hand a method an owned handle to its own instance) were emitted by the rylai stub generator with that receiver re-exposed *alongside* the injected `self`.
 
diff --git a/src/document.rs b/src/document.rs
index 5ff6dac9..6a4fba16 100644
--- a/src/document.rs
+++ b/src/document.rs
@@ -14570,35 +14570,64 @@ impl PdfDocument {
         Some(h)
     }
 
-    /// Document-aware extension of `font_identity_hash_cheap` that resolves
-    /// `/DescendantFonts` references on Type0 fonts and folds the descendant
-    /// CIDFont's width metrics (`/DW`, `/DW2`, `/W`, `/W2`) into the hash.
-    ///
-    /// Without this, two Type0 fonts whose Type0 dicts have identical inline
-    /// shape (same BaseFont, Encoding, ToUnicode/DescendantFonts refs) but
-    /// whose referenced CIDFonts carry different vertical metrics collide on
-    /// the Layer 5/6 caches — the second document silently inherits the
-    /// first's `w1y` and renders vertical text at the wrong advance. This is
-    /// the same bug class as the ToUnicode-stream poisoning fixed in
-    /// `a327bcd` and the `/Widths` poisoning fixed in #598, applied to the
-    /// descendant CIDFont's horizontal AND vertical width arrays.
-    ///
-    /// Cost: one `load_object` per descendant CIDFont (typically one) on the
-    /// first call; subsequent calls hit `font_id_hash_cache`. The descendant
-    /// load is the same work `FontInfo::from_dict` will do later, so the
-    /// marginal cost when a font actually needs parsing is zero; the only
-    /// new work is on cache *hits* that previously skipped descendant
-    /// resolution entirely. In return we trade off one indirect-ref load per
-    /// unique Type0 font per process for correctness on /W2 + /DW2.
+    /// Document-aware extension of `font_identity_hash_cheap` that folds the
+    /// *content* of a font's document-specific streams — its `/ToUnicode` CMap
+    /// and embedded font program(s) — plus the descendant CIDFont's width
+    /// metrics (`/DW`, `/DW2`, `/W`, `/W2`) and stream-form `/CIDToGIDMap` into
+    /// the identity hash.
+    ///
+    /// Why content, not just references: `font_identity_hash_cheap` folds only
+    /// the *reference* (object id/gen) of `/ToUnicode`, and the global cache is
+    /// skipped only for *canonical* subset fonts (`AAAAAA+`, six uppercase
+    /// letters + `+`; see `is_subset_basefont`). A non-canonical subset tag
+    /// such as `/CIDFont+F1` is therefore still shared cross-document, and
+    /// PDFs emitted from a common template reuse the same `/ToUnicode` object
+    /// number — so two genuinely different fonts that merely share a
+    /// `/BaseFont` name produce an identical cheap hash. Keyed only by that
+    /// hash, the cross-document global cache (Layer 6) served a later document
+    /// the *earlier* font's parsed `FontInfo`, and its glyph→Unicode mapping
+    /// came out as a constant-offset cipher or control/PUA junk (e.g.
+    /// `SUMMARY` → `6800$5<`). Folding the `/ToUnicode` stream bytes — and the
+    /// embedded `/FontFile{,2,3}` bytes — gives such fonts distinct keys so
+    /// they can never collide regardless of subset-tag form or object reuse,
+    /// while genuinely identical fonts still dedup. This completes the
+    /// cross-document hardening from #595/#597/#598 (which folded the
+    /// `/ToUnicode` *reference* and the `/Widths`, and excluded canonical
+    /// `AAAAAA+` subsets), applied to the field that actually decodes text.
+    ///
+    /// Cost: a few extra `load_object` calls (the `/ToUnicode` stream, each
+    /// descendant CIDFont, the `/FontDescriptor`s and their font programs) on
+    /// the first encounter of a font per document; subsequent calls hit
+    /// `font_id_hash_cache`, and the loads themselves are served from the
+    /// object cache that `FontInfo::from_dict` populates anyway. Stream bytes
+    /// are folded *raw* (still encoded) — see `fold_stream_bytes`.
     fn font_identity_hash_with_descendants(&self, font_obj: &Object) -> u64 {
         use std::hash::{Hash, Hasher};
         // Seed with the cheap inline hash so existing identity coverage is
-        // preserved bit-for-bit when there are no descendants to fold in.
+        // preserved bit-for-bit when there are no streams/descendants to fold.
         let base = Self::font_identity_hash_cheap(font_obj);
         let mut hasher = std::collections::hash_map::DefaultHasher::new();
         base.hash(&mut hasher);
 
         if let Some(d) = font_obj.as_dict() {
+            // /ToUnicode stream BYTES — the decisive discriminator. The cheap
+            // hash folds only this stream's reference; folding its content is
+            // what stops same-named, differently-mapped fonts from colliding
+            // across documents when the cheap key matches (#595).
+            if let Some(to_unicode) = d.get("ToUnicode") {
+                17u8.hash(&mut hasher);
+                self.fold_stream_bytes(to_unicode, &mut hasher);
+            }
+
+            // Simple fonts (Type1/TrueType) carry their embedded program on the
+            // top-level /FontDescriptor. Two subset fonts that share a
+            // /BaseFont name but embed different glyph programs must not alias.
+            if let Some(fd) = d.get("FontDescriptor") {
+                if let Some(fd_obj) = self.resolve_indirect_for_hash(fd) {
+                    self.fold_font_program(&fd_obj, 18, &mut hasher);
+                }
+            }
+
             if let Some(Object::Array(arr)) = d.get("DescendantFonts") {
                 // Domain separator for the descendant section.
                 11u8.hash(&mut hasher);
@@ -14646,6 +14675,35 @@ impl PdfDocument {
                         16u8.hash(&mut hasher);
                         Self::hash_pdf_object_deterministic(csi, &mut hasher);
                     }
+                    // Descendant /Subtype: CIDFontType0 (CFF) and CIDFontType2
+                    // (TrueType) are not interchangeable even with identical
+                    // name + metrics; the top-level Subtype is `Type0` for both.
+                    if let Some(st) = dd.get("Subtype") {
+                        19u8.hash(&mut hasher);
+                        Self::hash_pdf_object_deterministic(st, &mut hasher);
+                    }
+                    // Embedded CIDFont program lives on the descendant's
+                    // /FontDescriptor (/FontFile2 for TrueType, /FontFile3 for
+                    // CFF). Folded under a distinct section so it cannot alias
+                    // a simple font's top-level program.
+                    if let Some(fd) = dd.get("FontDescriptor") {
+                        if let Some(fd_obj) = self.resolve_indirect_for_hash(fd) {
+                            self.fold_font_program(&fd_obj, 20, &mut hasher);
+                        }
+                    }
+                    // Descendant /CIDToGIDMap: the *stream* form remaps
+                    // CID→glyph (§9.7.4.3), so two otherwise-identical embedded
+                    // CIDFontType2 fonts with different maps select different
+                    // glyphs and must not alias. The `/Identity` name — and an
+                    // absent entry, which defaults to Identity — fold nothing,
+                    // so the common path's key is unchanged (and an explicit
+                    // `/Identity` still dedups with an absent one).
+                    if let Some(c2g) = dd.get("CIDToGIDMap") {
+                        if !matches!(c2g, Object::Name(_)) {
+                            21u8.hash(&mut hasher);
+                            self.fold_stream_bytes(c2g, &mut hasher);
+                        }
+                    }
                 }
             }
         }
@@ -14653,6 +14711,74 @@ impl PdfDocument {
         hasher.finish()
     }
 
+    /// Resolve a single level of indirection for hashing: returns the
+    /// referenced object, the object itself when already inline, or `None`
+    /// when a reference cannot be loaded (cycle/missing). Used only to reach a
+    /// `/FontDescriptor` dict — it never re-enters the font dict, so it cannot
+    /// loop.
+    fn resolve_indirect_for_hash(&self, obj: &Object) -> Option<Object> {
+        match obj {
+            Object::Reference(r) => self.load_object(*r).ok(),
+            other => Some(other.clone()),
+        }
+    }
+
+    /// Fold the *raw* bytes of a (possibly indirectly-referenced) stream into
+    /// the hash. Folds nothing when the object is absent, unreadable, or not a
+    /// stream.
+    ///
+    /// Raw (still-encoded) bytes are deliberate. They are a sufficient
+    /// discriminator: for a given filter chain, different decoded content
+    /// yields different encoded bytes, so streams encoded the same way never
+    /// produce a *false* dedup (two different fonts sharing a key); the stream
+    /// dictionary itself is not folded. It also avoids inflating large font
+    /// programs on the cache-key path. The only cost is a *missed* dedup when
+    /// the same content is stored under two different filters (e.g. raw vs.
+    /// FlateDecode): harmless, and not a pattern one producer emits in a corpus.
+    fn fold_stream_bytes<H: std::hash::Hasher>(&self, obj: &Object, hasher: &mut H) {
+        use std::hash::Hash;
+        let owned;
+        let stream: &Object = match obj {
+            Object::Stream { .. } => obj,
+            Object::Reference(r) => match self.load_object(*r) {
+                Ok(o) => {
+                    owned = o;
+                    &owned
+                },
+                Err(_) => return,
+            },
+            _ => return,
+        };
+        if let Object::Stream { data, .. } = stream {
+            (data.len() as u64).hash(hasher);
+            data.as_ref().hash(hasher);
+        }
+    }
+
+    /// Fold any embedded font program (`/FontFile`, `/FontFile2`,
+    /// `/FontFile3`) reachable from a `/FontDescriptor` dict into the hash,
+    /// namespaced by `section` so a simple font's program and a descendant
+    /// CIDFont's program cannot alias each other.
+    fn fold_font_program<H: std::hash::Hasher>(
+        &self,
+        descriptor: &Object,
+        section: u8,
+        hasher: &mut H,
+    ) {
+        use std::hash::Hash;
+        let dict = match descriptor.as_dict() {
+            Some(d) => d,
+            None => return,
+        };
+        for (variant, key) in ["FontFile", "FontFile2", "FontFile3"].iter().enumerate() {
+            if let Some(ff) = dict.get(*key) {
+                section.hash(hasher);
+                (variant as u8).hash(hasher);
+                self.fold_stream_bytes(ff, hasher);
+            }
+        }
+    }
+
     /// Hash a PDF `Object` deterministically. Used by the descendant-aware
     /// font identity hash to fold raw width-array content into the key.
     ///
diff --git a/tests/test_font_cache_cross_document.rs b/tests/test_font_cache_cross_document.rs
new file mode 100644
index 00000000..a72caf31
--- /dev/null
+++ b/tests/test_font_cache_cross_document.rs
@@ -0,0 +1,250 @@
+//! Cross-document font-cache collision regression (completes #595, #597, #598).
+//!
+//! The process-global font cache (`fonts::global_cache`) is keyed by a font
+//! *identity hash*. The #595 hardening folds the `/ToUnicode` *reference*
+//! (object id/gen) into that hash and keeps *canonical* subset fonts
+//! (`AAAAAA+`) out of the cache. A non-canonical subset tag such as
+//! `/CIDFont+F1` falls outside that exclusion, and template-emitted PDFs reuse
+//! the same `/ToUnicode` object number, so the reference-keyed hash can still
+//! match for two genuinely different fonts — the later document is then served
+//! the earlier font's parsed `FontInfo`, and its glyphs decode through the
+//! wrong `/ToUnicode` and come out garbled. Folding the stream's bytes (not
+//! just its reference) distinguishes them and closes this case.
+//!
+//! Both PDFs here are built in memory (per the repo's no-binary-fixtures
+//! convention) and are byte-for-byte identical except for the CID→Unicode
+//! mapping: same `/BaseFont` (`/CIDFont+F1`, the non-canonical subset tag some
+//! real generators emit), same object numbers, same width metrics — only the
+//! `/ToUnicode` stream and the matching content-stream CIDs differ. That is the
+//! exact shape that triggered the leak.
+//!
+//! Oracle: correct text contains the header `SUMMARY`; a font decoded through
+//! another document's `/ToUnicode` does not.
+
+use pdf_oxide::document::PdfDocument;
+use pdf_oxide::fonts::global_cache::{clear_global_font_cache, global_font_cache_stats};
+use std::sync::Mutex;
+
+/// Serializes the tests in this binary: all of them assert against the
+/// process-global cache, so they must not run concurrently.
+static CACHE_LOCK: Mutex<()> = Mutex::new(());
+
+/// Lines rendered on the single page. The content is fabricated and trivial;
+/// only the presence of `SUMMARY` matters to the oracle.
+const LINES: &[&str] = &[
+    "SUMMARY",
+    "Synthetic document for the font-cache regression.",
+    "Text is recoverable only via the ToUnicode CMap.",
+];
+
+/// Build a minimal non-embedded Type0/Identity-H PDF in memory.
+///
+/// Every document shares one object layout and `/BaseFont` name, so their cheap
+/// identity hashes collide. `cid_base` shifts the (otherwise sequential) glyph
+/// indices, mirroring a real subset font whose CIDs are arbitrary indices
+/// unrelated to Unicode and recoverable only through `/ToUnicode`. Two
+/// documents built with different `cid_base` therefore carry byte-different
+/// `/ToUnicode` streams and content-stream CIDs while remaining identical in
+/// every field the pre-fix key looked at.
+fn build_type0_pdf(cid_base: u16, cid_to_gid: Option<&[u8]>) -> Vec<u8> {
+    // Distinct characters in first-appearance order; CID = cid_base + index.
+    let mut chars: Vec<char> = Vec::new();
+    for ch in LINES.iter().flat_map(|l| l.chars()) {
+        if !chars.contains(&ch) {
+            chars.push(ch);
+        }
+    }
+    let cid = |ch: char| -> u16 {
+        let idx = chars.iter().position(|&c| c == ch).unwrap() as u16;
+        cid_base + idx
+    };
+
+    // Content stream: 2-byte CIDs, one `Tj` per line.
+    let mut content = String::from("BT\n/F1 13 Tf\n15 TL\n40 770 Td\n");
+    for line in LINES {
+        let hex: String = line.chars().map(|ch| format!("{:04X}", cid(ch))).collect();
+        content.push_str(&format!("<{hex}> Tj\nT*\n"));
+    }
+    content.push_str("ET");
+
+    // ToUnicode CMap: maps each CID assigned above back to its Unicode character.
+    let bfchar: String = chars
+        .iter()
+        .map(|&ch| format!("<{:04X}> <{:04X}>", cid(ch), ch as u32))
+        .collect::<Vec<_>>()
+        .join("\n");
+    let cmap = format!(
+        "/CIDInit /ProcSet findresource begin\n12 dict begin\nbegincmap\n\
+         /CIDSystemInfo <</Registry (Adobe) /Ordering (UCS) /Supplement 0>> def\n\
+         /CMapName /Adobe-Identity-UCS def\n/CMapType 2 def\n\
+         1 begincodespacerange\n<0000> <FFFF>\nendcodespacerange\n\
+         {} beginbfchar\n{}\nendbfchar\n\
+         endcmap\nCMapName currentdict /CMap defineresource pop\nend\nend",
+        chars.len(),
+        bfchar
+    );
+
+    // /CIDToGIDMap defaults to the `/Identity` name; `cid_to_gid` switches it to
+    // the stream form (object 9) so a test can vary its bytes.
+    let cid_to_gid_entry = if cid_to_gid.is_some() {
+        "/CIDToGIDMap 9 0 R"
+    } else {
+        "/CIDToGIDMap /Identity"
+    };
+
+    let mut objs: Vec<Vec<u8>> = vec![
+        b"<< /Type /Catalog /Pages 2 0 R >>".to_vec(),
+        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>".to_vec(),
+        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] \
+          /Resources << /Font << /F1 5 0 R >> >> /Contents 4 0 R >>"
+            .to_vec(),
+        format!("<< /Length {} >>\nstream\n{content}\nendstream", content.len()).into_bytes(),
+        b"<< /Type /Font /Subtype /Type0 /BaseFont /CIDFont+F1 /Encoding /Identity-H \
+          /DescendantFonts [6 0 R] /ToUnicode 8 0 R >>"
+            .to_vec(),
+        format!(
+            "<< /Type /Font /Subtype /CIDFontType2 /BaseFont /CIDFont+F1 \
+             /CIDSystemInfo << /Registry (Adobe) /Ordering (Identity) /Supplement 0 >> \
+             /FontDescriptor 7 0 R /DW 500 {cid_to_gid_entry} >>"
+        )
+        .into_bytes(),
+        // No /FontFile* — non-embedded, like the real garbled documents.
+        b"<< /Type /FontDescriptor /FontName /CIDFont+F1 /Flags 4 \
+          /FontBBox [0 -200 1000 900] /ItalicAngle 0 /Ascent 800 /Descent -200 \
+          /CapHeight 700 /StemV 80 /MissingWidth 500 >>"
+            .to_vec(),
+        format!("<< /Length {} >>\nstream\n{cmap}\nendstream", cmap.len()).into_bytes(),
+    ];
+    if let Some(map) = cid_to_gid {
+        let mut obj = format!("<< /Length {} >>\nstream\n", map.len()).into_bytes();
+        obj.extend_from_slice(map);
+        obj.extend_from_slice(b"\nendstream");
+        objs.push(obj);
+    }
+
+    // Assemble with a byte-accurate xref table.
+    let mut out: Vec<u8> = b"%PDF-1.7\n".to_vec();
+    let mut offsets = Vec::with_capacity(objs.len());
+    for (i, body) in objs.iter().enumerate() {
+        offsets.push(out.len());
+        out.extend_from_slice(format!("{} 0 obj\n", i + 1).as_bytes());
+        out.extend_from_slice(body);
+        out.extend_from_slice(b"\nendobj\n");
+    }
+    let xref_off = out.len();
+    let size = objs.len() + 1;
+    out.extend_from_slice(format!("xref\n0 {size}\n0000000000 65535 f \n").as_bytes());
+    for off in &offsets {
+        out.extend_from_slice(format!("{off:010} 00000 n \n").as_bytes());
+    }
+    out.extend_from_slice(
+        format!("trailer\n<< /Size {size} /Root 1 0 R >>\nstartxref\n{xref_off}\n%%EOF").as_bytes(),
+    );
+    out
+}
+
+fn extract_first_page(bytes: Vec<u8>) -> String {
+    let doc = PdfDocument::from_bytes(bytes).expect("parse synthetic PDF");
+    doc.extract_text(0).expect("extract page 0")
+}
+
+/// Several documents that share a `/BaseFont` name but map glyphs differently
+/// must each decode through their own `/ToUnicode`, even when processed
+/// back-to-back in one process without clearing the cache between them.
+#[test]
+fn distinct_tounicode_fonts_do_not_collide_across_documents() {
+    let _guard = CACHE_LOCK.lock().unwrap_or_else(|e| e.into_inner());
+    clear_global_font_cache();
+
+    // Distinct CID bases ⇒ distinct ToUnicode streams. The first document
+    // primes the global cache; before the fix, every later one inherited its
+    // mapping. The bases are arbitrary; they only need to be mutually distinct.
+    let bases = [3u16, 1000, 2000, 40000];
+    let mut garbled = Vec::new();
+    for base in bases {
+        let text = extract_first_page(build_type0_pdf(base, None));
+        if !text.contains("SUMMARY") {
+            let preview: String = text.chars().take(48).collect();
+            garbled.push(format!("cid_base={base}: {preview:?}"));
+        }
+    }
+
+    assert!(
+        garbled.is_empty(),
+        "{} of {} same-named fonts decoded through another document's ToUnicode \
+         (cross-document font-cache collision):\n  {}",
+        garbled.len(),
+        bases.len(),
+        garbled.join("\n  ")
+    );
+}
+
+/// The precise key must not regress the dedup the global cache exists for:
+/// a byte-identical font (different document) is a cache *hit* with no new
+/// entry, while a font with a different `/ToUnicode` gets its own entry.
+#[test]
+fn identical_fonts_dedup_while_distinct_fonts_get_separate_entries() {
+    let _guard = CACHE_LOCK.lock().unwrap_or_else(|e| e.into_inner());
+    clear_global_font_cache();
+    assert_eq!(global_font_cache_stats().0, 0, "cache should be empty after clear");
+
+    // Each document defines exactly one cross-document-shareable Type0 font, so
+    // the cache grows by one entry per *distinct* font.
+    assert!(extract_first_page(build_type0_pdf(3, None)).contains("SUMMARY"));
+    let after_first = global_font_cache_stats().0;
+    assert_eq!(after_first, 1, "first document inserts exactly one font");
+
+    // Same bytes, brand-new PdfDocument: must hit the global cache, not reinsert.
+    assert!(extract_first_page(build_type0_pdf(3, None)).contains("SUMMARY"));
+    assert_eq!(
+        global_font_cache_stats().0,
+        after_first,
+        "an identical font must hit the global cache rather than re-insert"
+    );
+
+    // Different ToUnicode: must get its own entry (the absence of which was the
+    // collision bug) and decode correctly.
+    assert!(extract_first_page(build_type0_pdf(2000, None)).contains("SUMMARY"));
+    assert_eq!(
+        global_font_cache_stats().0,
+        after_first + 1,
+        "a font with a different ToUnicode must not alias the cached one"
+    );
+}
+
+/// A *stream*-form `/CIDToGIDMap` remaps CID→glyph (ISO 32000-1 §9.7.4.3), so
+/// two embedded CIDFontType2 fonts identical in name, `/ToUnicode`, and metrics
+/// but differing in that stream are not interchangeable and must get separate
+/// cache entries. (PR #733 review: the `/Identity` name, the default, still
+/// folds nothing — that case is covered by the tests above.)
+#[test]
+fn stream_cid_to_gid_map_distinguishes_otherwise_identical_fonts() {
+    let _guard = CACHE_LOCK.lock().unwrap_or_else(|e| e.into_inner());
+    clear_global_font_cache();
+
+    // Same cid_base ⇒ identical /ToUnicode and content; the ONLY difference is
+    // the /CIDToGIDMap stream. Sized to cover every CID used (2 bytes/CID); every
+    // byte is `i % 0x40` (below 0x40), which keeps it well-formed map data.
+    let distinct = LINES
+        .iter()
+        .flat_map(|l| l.chars())
+        .fold(Vec::new(), |mut v, c| {
+            if !v.contains(&c) {
+                v.push(c);
+            }
+            v
+        });
+    let len = 2 * (0x21 + distinct.len());
+    let map_a: Vec<u8> = (0..len).map(|i| (i % 0x40) as u8).collect();
+    let mut map_b = map_a.clone();
+    *map_b.last_mut().unwrap() ^= 0x01; // differ by a single byte
+
+    assert!(extract_first_page(build_type0_pdf(0x21, Some(&map_a))).contains("SUMMARY"));
+    let after_a = global_font_cache_stats().0;
+    assert!(extract_first_page(build_type0_pdf(0x21, Some(&map_b))).contains("SUMMARY"));
+    assert_eq!(
+        global_font_cache_stats().0,
+        after_a + 1,
+        "fonts differing only in a stream /CIDToGIDMap must not alias in the cache"
+    );
+}