Map dereferencing grammar feature #1

Open
maetl opened this issue Jun 7, 2021 · 0 comments

From the original design docs:

Map substitutions are bidirectional. The direction of match to target can be changed between left to right and right to left by flipping the < and > applicators.

Example:

calyx.grammar({
  start: "{@animal} {verb} {@animal>possessive} {appendage}",
  animal: ["Snowball", "Santa’s Little Helper"],
  possessive: {
    "Snowball": "her",
    "Santa’s Little Helper": "his"
  },
  verb: ["chases", "licks", "bites"],
  appendage: ["tail", "paw"]
})

Extension to productions

Grammar rules will need to support objects whose keys and values are both strings, representing a PairedMap or Dictionary production, as an alternative to Choice and WeightedChoice productions.

The branching in build_ast (or whatever currently plays that role) needs to change to check for this new shape and create the production:

  class Rule
    def self.build_ast(productions, registry)
      if productions.first.is_a?(Hash)
        # TODO: test that key is a string

        if productions.first.first.last.is_a?(String)
          # If the value of the first pair is a string then this is a
          # paired mapping production.
          Syntax::PairedMapping.parse(productions.first, registry)
        else
          # Otherwise, we assume this is a weighted choice declaration and
          # convert the hash to an array
          Syntax::WeightedChoices.parse(productions.first.to_a, registry)
        end
      elsif productions.first.is_a?(Enumerable)
        # TODO: this needs to change to support attributed/tagged grammars
        Syntax::WeightedChoices.parse(productions, registry)
      else
        Syntax::Choices.parse(productions, registry)
      end
    end
  end

New syntax nodes

A new AST node is needed for PairedMapping. The lexer should produce it when it hits a < or > symbol in the same position as the current . used for modifier dereferencing.
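
As a rough illustration of the shape this node could take, here is a minimal sketch that splits an expression body such as `@author>preferred` into its parts. The `MapDeref` struct, the `EXPRESSION` pattern and `parse_map_deref` are hypothetical names made up for this example and are not part of the existing parser; a real implementation would hook into the existing expression lexing rather than use a standalone regex.

```ruby
# Hypothetical sketch, not the actual Calyx parser.
# Represents a map dereference such as "@author>preferred".
MapDeref = Struct.new(:memo, :source, :direction, :map)

# Optional "@" memo sigil, a source rule name, an applicator, a map rule name.
EXPRESSION = /\A(@?)(\w+)([<>])(\w+)\z/

def parse_map_deref(body)
  match = EXPRESSION.match(body)
  # Anything without a < or > applicator (e.g. "animal.upcase") is not ours.
  return nil if match.nil?

  MapDeref.new(
    match[1] == "@",
    match[2],
    match[3] == ">" ? :left_to_right : :right_to_left,
    match[4]
  )
end
```

With this sketch, `parse_map_deref("@pronoun<preferred")` yields a node with `direction` set to `:right_to_left`, while an expression using the `.` modifier syntax returns `nil` and can fall through to the current handling.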

The lookup itself is a unary function that applies either lhs to rhs or rhs to lhs key/value lookups. The Ruby version uses a modified radix tree dictionary with custom rules for skipping and concatenating wildcard matches.

This could probably be optimised a lot better, but the question right now is whether to optimise for performance or maintenance. There are a few details of the Ruby prototype that feel overengineered and could be compacted (like having separate node and edge structs and storing data on the edges, which feels more baroque than it needs to be).
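
To make the maintenance end of that trade-off concrete, here is a deliberately naive sketch of the bidirectional wildcard lookup that uses a linear scan instead of a radix tree. It is not the prototype's implementation: `NaivePairedMap` is a hypothetical name, it assumes at most one `%` wildcard per pattern, and it resolves overlapping patterns by trying the ones with the most literal characters first, which is an assumption about the intended precedence.

```ruby
# Hypothetical, simplified stand-in for the radix tree lookup.
class NaivePairedMap
  WILDCARD = "%"

  def initialize(pairs)
    # Store as [key_pattern, value_pattern] arrays so either side can be
    # treated as the match side.
    @pairs = pairs.to_a
  end

  # Left to right: match against keys, produce values.
  def value_for(key)
    lookup(key, 0, 1)
  end

  # Right to left: match against values, produce keys.
  def key_for(value)
    lookup(value, 1, 0)
  end

  private

  def lookup(input, from, to)
    # Try patterns with more literal characters first, so a bare "%"
    # only matches when nothing more specific does.
    ordered = @pairs.sort_by { |pair| -pair[from].delete(WILDCARD).length }
    ordered.each do |pair|
      captured = capture(pair[from], input)
      next if captured.nil?
      # Block form avoids backslash interpretation in the replacement.
      return pair[to].sub(WILDCARD) { captured }
    end
    nil
  end

  # Returns the text matched by the wildcard, "" for an exact literal
  # match, or nil when the pattern does not match the input.
  def capture(pattern, input)
    return (pattern == input ? "" : nil) unless pattern.include?(WILDCARD)

    prefix, _, suffix = pattern.partition(WILDCARD)
    return nil if input.length < prefix.length + suffix.length
    return nil unless input.start_with?(prefix) && input.end_with?(suffix)

    input[prefix.length...(input.length - suffix.length)]
  end
end
```

A scan like this is linear in the number of pairs for every lookup, so it only makes sense if maps stay small, but it may be a useful baseline to compare the tree-based version against.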

Test cases

  describe 'wildcard match' do
    let(:paired_map) do
      Calyx::Syntax::PairedMapping.parse({
        "%y" => "%ies",
        "%s" => "%ses",
        "%" => "%s"
      }, registry)
    end

    specify 'lookup from key to value' do
      expect(paired_map.value_for('ferry')).to eq('ferries')
      expect(paired_map.value_for('bus')).to eq('buses')
      expect(paired_map.value_for('car')).to eq('cars')
    end

    specify 'lookup from value to key' do
      expect(paired_map.key_for('ferries')).to eq('ferry')
      expect(paired_map.key_for('buses')).to eq('bus')
      expect(paired_map.key_for('cars')).to eq('car')
    end
  end

Test case informing patterns and guidance for authors

Contradictory/recursive logic that some people might attempt (how to warn, what expectations to set, etc):

{
  author: {
    "Doris": "she",
    "Philip": "he",
    "Sam": "they"
  },
  start: "NAME: {@author}, PRONOUN: {@author>author}"
}

Need to document the full bidirectional matrix:

{
  author: ["Doris", "Philip", "Sam"],
  pronoun: ["he", "she", "they"],
  preferred: {
    "Doris": "she",
    "Philip": "he",
    "Sam": "they"
  },
  leftToRight: "NAME: {@author}, PRONOUN: {@author>preferred}",
  rightToLeft: "NAME: {@pronoun<preferred}, PRONOUN: {@pronoun}"
}
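
As a starting point for documenting that matrix, the two directions can be sketched with a plain Ruby Hash for the exact-match case. `PREFERRED` and `dereference` are hypothetical names for this example only, and wildcard patterns are left out.

```ruby
# Hypothetical sketch of the two lookup directions over exact pairs.
PREFERRED = { "Doris" => "she", "Philip" => "he", "Sam" => "they" }

def dereference(map, input, direction)
  case direction
  when :left_to_right then map[input]     # {@author>preferred}: key to value
  when :right_to_left then map.key(input) # {@pronoun<preferred}: value to key
  else raise ArgumentError, "unknown direction: #{direction}"
  end
end

# Print every cell of the matrix in both directions.
PREFERRED.each_key do |name|
  puts "#{name} > #{dereference(PREFERRED, name, :left_to_right)}"
end
PREFERRED.each_value do |pronoun|
  puts "#{pronoun} < #{dereference(PREFERRED, pronoun, :right_to_left)}"
end
```

One thing the guidance for authors may need to cover: `Hash#key` returns only the first key with a given value, so if two names share a pronoun the right to left direction is ambiguous in this sketch, and the real implementation would need a defined behaviour for that case.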