Changes in response to S. Farrell comments

Clarifying language on rules, collision checking and well-formed documents, incl. required ordering.
kjd · Apr 28, 2016 · e7aa58f
1 parent 9838ed5
commit e7aa58f
Showing 1 changed file with 52 additions and 27 deletions.
diff --git a/draft-ietf-lager-specification.xml b/draft-ietf-lager-specification.xml
@@ -166,7 +166,8 @@
 
         <section title="LGR Format">
 
-            <t>An LGR is expressed as a well-formed XML Document <xref target="XML"/>.</t>
+            <t>An LGR is expressed as a well-formed XML Document <xref target="XML"/> 
+            that conforms to the schema defined in <xref target="schema"/>.</t>
 
             <t>As XML is case-sensitive, an LGR must be authored with the correct
                 casing. For example, the XML element names must be in lower
@@ -221,6 +222,10 @@
                 contain zero or one "meta" element, exactly one "data" element, and
                 zero or one "rules" element; and these three elements MUST be in that order.</t>
 
+                 <t>Most elements that are direct or nested child elements of the "rules" element
+                     MUST be placed in a specific order relative to other elements for the LGR to be valid.
+                    An LGR that violates these constraints MUST be rejected.</t>
+
                 <t>In the following descriptions, required, non-repeating elements or attributes are
                     generally not called out explicitly, in contrast to "OPTIONAL" ones,
                     or those that "MAY" be repeated. For attributes that take lists as values, the elements MUST be
@@ -443,7 +448,9 @@
                     <t>
                         <figure>
                             <artwork><![CDATA[    <references>
-      <reference id="0">The Unicode Standard, Version 7.0</reference>
+      <reference id="0">The Unicode Consortium. The Unicode Standard, Version 8.0.0, 
+        (Mountain View, CA: The Unicode Consortium, 2015. ISBN 978-1-936213-10-8)
+        http://www.unicode.org/versions/Unicode8.0.0/</reference>
       <reference id="1">Big-5: Computer Chinese Glyph and Character
          Code Mapping Table, Technical Report C-26, 1984</reference>
       <reference id="2" comment="synchronized with Unicode 6.1">
@@ -477,7 +484,7 @@
 
             <t>The code point data is collected within the "data" element. Within this element, a
                 series of "char" and "range" elements describe eligible code points, or ranges of
-                code points, respectively.</t>
+                code points, respectively. Collectively, these are known as the repertoire.</t>
 
             <t>Discrete permissible code points or code point sequences (see
                <xref target="sequences" />) are declared with a "char"
@@ -670,7 +677,9 @@
                    </t>
                     <t>Variant relations are normally not only symmetric, but also transitive.
                       If A is a variant of B and B is a variant of C, then A is also a variant of C. 
-                      As with symmetry, these transitive relations are spelled out explicitly in the LGR.</t>
+                      As with symmetry, these transitive relations are only part of the LGR if
+                      spelled out explicitly. Implementations that require an LGR to be symmetric
+                      and transitive should verify this mechanically.</t>
 
                     <t>All variant mappings are unique. For a given "char" element all "var" elements
                         MUST have a unique combination of "cp", "when" and "not-when" attributes.
@@ -968,15 +977,23 @@
         <section title="Whole Label and Context Evaluation">
 
             <section title="Basic Concepts">
-                <t>The code points in a label sometimes need to satisfy context-based rules, for
-                    example for the label to be considered valid, or to satisfy the context for a
-                    variant mapping (see the description of the "when" attribute in <xref
-                        target="parameterized_context_rule"/>).</t>
+                <t>The "rules" element contains the specification of both context-based and
+                      Whole Label Evaluation (WLE) rules (<xref target="whole_label" />), the character 
+                      classes (<xref target="character_classes" />) that they depend on
+                      and any actions (<xref target="actions"/>) that assign dispositions to labels
+                      based on rules or variant mappings.</t>
+
                 <t>A Whole Label Evaluation rule (WLE) is applied to the whole label. It is used to
-                    validate both original labels and variant labels computed from them using a
-                    permutation over all applicable variant mappings. A conditional context rule is
-                    a specialized form of WLE specific to the context around a single code point or
-                    code point sequence. For example, if a rule is referenced in the "when"
+                    validate both original labels and any variant labels computed from them. </t>
+
+                <t>A conditional context rule does not necessarily
+                    apply to the whole label, but may be specific to the context around a single code 
+                    point or code point sequence. Certain code points in a label sometimes need to 
+                    satisfy context-based rules, for example, for the label to be considered valid, or 
+                    to satisfy the context for a variant mapping (see the description of the "when" 
+                    attribute in <xref target="parameterized_context_rule"/>). </t>
+
+                <t>For example, if a rule is referenced in the "when"
                     attribute of a variant mapping it is used to describe the conditional context
                     under which the particular variant mapping is defined to exist.</t>
 
@@ -999,7 +1016,7 @@
                          all of the constraints defined here are validated by the schema.</t>
             </section>
 
-            <section title="Character Classes">
+            <section title="Character Classes" anchor="character_classes">
                 <t>Character classes are sets of characters that often share a particular property.
                     While they function like sets in every way, even supporting the usual set
                     operators, they are called character classes here in a nod to the use of that
@@ -2071,7 +2088,8 @@
                     </list>
                </t>
                 <t>The number of potential permutations can be very large. In practice, implementations
-                    would use suitable optimizations to avoid having to actually create all permutations.</t>
+                    would use suitable optimizations to avoid having to actually create all permutations 
+                    (see <xref target="collision" />). </t>
 
          <t>In determining the permuted set of variant labels in step (1) above, all eligible 
            partitions into sequences must be evaluated. A label "ab" that matches a sequence "ab"
@@ -2179,19 +2197,25 @@
 
                <t>Because of symmetry and transitivity, all variant mappings form disjoint sets. 
                  In each of these sets, the source and target of each mapping are also variants 
-                 of the sources and targets of all the other mappings. As a consequence, if two labels
-                 have code points at the same position from two different of these variant mapping sets,
-                 the sets of their variant labels are likewise disjoint.</t>
-
-              <t>Instead of generating all permutations, that is, using each variant mapping in each
-                set at a particular code position in the label, it is sufficient to substitute an "index" mapping,
-                in effect identifying the set of variant code points for that position. Such an index mapping 
-                could be, for example, the variant mapping for which the target code point (or sequence) 
-                comes first in some sorting order.</t>
+                 of the sources and targets of all the other mappings. However, members of 
+                 two different sets are never variants of each other.</t>
+
+             <t>If two labels have code points at the same position that are members of two 
+                 different variant mapping sets, no variant label of one can be a
+                 variant label of the other: the sets of their variant labels are likewise disjoint.
+                 Instead of generating all permutations to compare all possible variants, it is
+                 enough to find out whether code points at the same position belong to the
+                 same variant set or not.</t>
+
+              <t>For that, it is sufficient to substitute an "index" mapping that identifies the
+                set. This index mapping could be, for
+                example, the variant mapping for which the target code point (or sequence) 
+                comes first in some sorting order. This index mapping would, in effect, identify
+                the set of variant mappings for that position. </t>
 
               <t>To check collision then means generating a single variant label from the original
-                by substituting the "index" value as the target for mapping from any code
-                point. This results in an "index label". Two labels collide whenever the index labels 
+                by substituting the respective "index" value for each code point. This results in an 
+                "index label". Two labels collide whenever the index labels 
                 for them are the same.</t>
              </section>
 
@@ -2955,7 +2979,7 @@ U+6F27;U+4E7E;U+6F27;U+4E81,U+5E72,U+5E79,U+69A6]]></artwork>
            </t>
         </section>
 
-        <section title="RelaxNG Compact Schema">
+        <section title="RelaxNG Compact Schema" anchor="schema">
             <figure>
                 <artwork><![CDATA[
 <CODE BEGINS>
@@ -3152,7 +3176,8 @@ U+6F27;U+4E7E;U+6F27;U+4E81,U+5E72,U+5E79,U+69A6]]></artwork>
                     <list style="hanging" hangIndent="5">
                         <t hangText="draft-ietf-lager-specification-12">
                               Integrate additional feedback from AD review. Use domain names for the prefixes
-                              in private dispositions to reduce potential conflicts.
+                              in private dispositions to reduce potential conflicts. Add clarifying language on 
+                              ordering, well-formedness, collision checking and rules.
                        </t>
                     </list>
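
Note on the collision-checking hunk (@@ -2179): the "index label" comparison described there can be illustrated with a minimal Python sketch. This is not taken from the draft; it only shows the idea under stated assumptions. The names `build_index`, `index_label` and `collide` are hypothetical. The sketch assumes that variant mappings are given as plain (source, target) pairs of single code points, that symmetry and transitivity are meant to hold, and that the smallest member of each variant set serves as its index (one possible "sorting order"). Sequences and "when"/"not-when" contexts are ignored.

```python
def build_index(mappings):
    """Group code points into disjoint variant sets (union-find) and
    return a dict mapping each code point to its set's index value.

    Assumption: the index is the smallest member of the set.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for src, tgt in mappings:
        a, b = find(src), find(tgt)
        if a != b:
            # Keep the smaller root so the root is always the set minimum.
            if b < a:
                a, b = b, a
            parent[b] = a

    return {cp: find(cp) for cp in list(parent)}


def index_label(label, index):
    """Substitute the index value for each code point of the label.
    Code points without variants stand for themselves."""
    return tuple(index.get(cp, cp) for cp in label)


def collide(label_a, label_b, index):
    """Two labels collide when their index labels are identical."""
    return index_label(label_a, index) == index_label(label_b, index)


# Hypothetical variant data: {a, b, c} and {x, y} are two variant sets.
mappings = [("a", "b"), ("b", "c"), ("x", "y")]
index = build_index(mappings)
```

With this data, `collide("ab", "ca", index)` is true because both labels reduce to the same index label, while `collide("ax", "xa", index)` is false because the code points at each position belong to different variant sets. No permutation of variant labels is generated in either case.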