Skip to content

why macro hiding is broken

Ryan Culpepper edited this page Sep 20, 2019 · 8 revisions

The seek problem

This page describes the main problem that macro hiding tries to solve. The description is organized as a sequence of increasingly difficult situations that macro hiding must deal with.

(In the expansion trees below, I write -> for a single macro step and => to mean "expands to" (but note it might be partial expansion if there is a stop list). I write _ for terms I don't care to name. The trees omit contextual information like environments and stop lists and lift targets and so on.)

Normal expansion

Suppose eH is a use of an opaque macro whose expansion looks like the following:

                      eS => _
                      -------
                        ...
                     ---------
    eH -> eH'        eH' => _
    --------------------------
              eH => _

By "opaque macro" I mean that we want to hide the expansion of eH, but we want to show the expansions of the visible subterms of eH. In particular, when we get to eS, we need to check

  • whether eS is a visible subterm of eH, and if so
  • where in eH it occurred (so we can update that part of eH with subsequent expansion steps)

That is the seek problem.

At this point, one solution is just to remember eH, and go looking for eS as a subterm in eH and calculate the path during the search. (If eH is large then repeatedly searching through it might be slow, though.)

Macro hiding is intensional, not extensional

We only want to show sub-expansions that originate from terms in eH, not expansions from terms that "look the same", even for notions of "the same" finer than S-expression equality.

On the other hand, we do want to track the changes that terms in eH undergo during macro expansion so we don't lose them. The "intension" of a visible term is not a syntax object but an evolutionary chain of syntax objects.

Scopes (or marks and renames, originally)

Consider the case where eH's macro wraps a sub-term in a lambda (or other binding form), such as in the following expansion tree:

                                                              eS => _
                                                             ---------
                                                                ...
                                                            -----------
                      rename(xs to xs^s; body to body^s)    body^s => _
                      -------------------------------------------------
                                    (lambda xs . body) => _
                                    -----------------------
                                              ...
                                           ---------
    eH -> eH'                              eH' => _
    ---------------------------------------------------
                            eH => _

If eS occurs under a lambda with respect to eH', it cannot possibly be identical to a subterm of eH, because lambda adds a scope to its body. But we still want to show it, if it originated as a subterm of eH.

We could keep the same representation and update subterm that got the scope added. But, it's also possible that the term evades the new scope (if the expression is duplicated, for example). So it would be best to keep the old term around too. A hash mapping subterms to paths would be one solution; when a scope is applied, we just add the new subterms to the table.

Pre-expansion and re-expansion

Consider a macro (or primitive form) that calls local-expand. Its expansion tree would look something like the following:

    eH -> _                        eS => _
    -------                        -------
      ...                            ...
   ---------                       -------
   e1 => e1'      e2 = f(e1')      e2 => _
   ----------------------------------------
                    e0 => _

Since in a way the "out of the ordinary" part is the first expansion and not the second, I'll label the first one "pre-expansion".

Pre-expansion happens all the time in the macro expander. Macros can do it by calling local-expand and then using the result (or parts thereof) in their own expansion. Modules head-expand forms in pass 1 to uncover definitions, requires, and provides; in pass 2 they go back to expand the expressions. Internal definition contexts behave similarly, but they transform the pre-expansion results into lets/letrecs/etc before re-expanding.

It's not obvious how macro hiding should interact with pre-expansion. Here are a few possibilities:

  1. macro hiding is disabled for the whole pre-expansion tree

This ensures that it is feasible to show the transformation from e1' to e2 (whether for a macro or a primitive form), but it also means you'll see lots of defines turn into define-values when you expand a module, for example.

  1. if anything is hidden in pre-expansion, carry over the "hiding state" to the second tree (whatever that means)

This generally makes it infeasible to show the f transformation as a step. It also means that "hiding state" must flow down (rootward) the tree too, rather than only up (leafward).

IIRC, the macro stepper currently does (2) if the e0 expansion is in hidden mode (either because e0 is an opaque macro or because we're already in hidden mode when we get to e0 and e0 is not a visible subterm) and (1) in visible mode.

The macro stepper currently represents the "hiding state" (really, the visible subterms) in two different ways. Going "up" (leafward) the tree it is represented as a hash mapping syntax objects to paths. Going "down" (rootward) the tree, the reasons for needing multiple terms at a particular path no longer applies, and the mapping to paths must be updated on the way down as (visible) primitive syntactic forms place expansion results in contexts. So the downward hiding state essentially coincides with the "visible" current term (which the macro stepper must compute anyway).

Adding "dye packs" (syntax-protect)

This is the change that broke macro hiding, and it did so because it invalidated macro stepper's visible-terms representations.

Consider the following macro:

(define-syntax (m1 stx)
  (syntax-case stx ()
    [(m1 x e)
     #`(lambda (x) #,(syntax-protect #'(f e)))]))

(Below, I'll write <e> for the term e wrapped in a dye pack, and e^s for e with the scope s added.)

Consider the following partial expansion of (m1 eS):

(m1 e) -> (lambda (x) <(f e)>) -> (#%plain-lambda (x^s) <(f^s e^s)>^s)

Suppose we want to hide m1. Then e should be considered a visible subterm. If we try to track the visible subterms eagerly, though, we have to update e to e^s, but the renaming the macro stepper receives from #%plain-lambda is <(f e)> to <(f^s e^s)>^s.

(If scope propagation and arming/disarming interact with intensional equality in just the right way---and I think they do---then just fully disarming the before and after renaming terms might "work" some of the time. But I tried it and didn't see an improvement.)

Since we can't eagerly update the goal (the visible terms) with the renaming, we need to somehow until the macro expander disarms the armed terms again and then go back and interpret the renaming.

The steps to a solution are

  1. Collect all of the relevant actions performed by the macro expander. Many are already collected, like adding/toggling scopes. But arm and disarm steps currently are not gathered.

  2. Devise a new representation of visible terms. See algebra for macro hiding for a solution sketch.

  3. Change the macro stepper to use the new representation. The entirety of the macro stepper depends on this, so this will be a lot of work.