language stem should respect langMatches semantics #71

@VladimirAlexiev

The following shape:
:SpanishProduct { schema:label [ @es~ ] }
This declares that products must have a label in Spanish or any variant of it (e.g. es-ES or es-AR).

But LanguageStem is defined as a simple prefix match (http://shex.io/shex-semantics/#nodeIn):

s is a LanguageStem and n is a language-tagged string with a language tag l
and fn:starts-with(l, st)

This definition has these defects:

  • It will match "Carro"@ese, where ese is the tag for Ese Ejja, an unrelated language (and I don't think those people have cars ;-)
  • It won't match "Carro"@ES, even though language tags are defined to be case-insensitive.
  • (Also, the definition says st where it should refer to s.)
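
The first two defects can be reproduced with a minimal Python sketch. The function name `prefix_match` is hypothetical; it only mimics the `fn:starts-with(l, s)` test quoted above and is not taken from any ShEx implementation.

```python
def prefix_match(tag: str, stem: str) -> bool:
    """Naive LanguageStem test: plain, case-sensitive prefix comparison."""
    return tag.startswith(stem)


# Intended behaviour: a regional variant of Spanish matches the stem "es".
print(prefix_match("es-AR", "es"))  # True

# Defect 1: "ese" (Ese Ejja) is a different language but shares the prefix.
print(prefix_match("ese", "es"))    # True, a false positive

# Defect 2: tags are case-insensitive, yet the comparison is not.
print(prefix_match("ES", "es"))     # False, a false negative
```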

Instead of a simple prefix match, it should comply with https://www.w3.org/TR/sparql11-query/#func-langMatches semantics. RFC 4647 defines matching for tags composed of subtags (language, script, region, variant, etc.) and specifies that matching is case-insensitive. Assuming s doesn't end in - and that . denotes concatenation, it can be defined e.g. as:
regex (l, "(^".s."$)|(^".s."-)", "i")
Note: a simpler regex would be "^".s."($|-)" but I don't believe the last part of it is valid.
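
As a rough illustration, the proposed regex can be tried out in Python. This is only a sketch under assumptions: the function names are hypothetical, Python's `re` module stands in for whatever regex dialect the spec would use, and the `*` range that langMatches also accepts is not handled. In Python, at least, the simpler `($|-)` form is accepted and agrees with the longer one on these inputs.

```python
import re


def stem_matches(tag: str, stem: str) -> bool:
    """Proposed test: tag equals the stem, or continues with '-' (case-insensitive)."""
    s = re.escape(stem)  # treat the stem as literal text, not as a pattern
    pattern = "(^" + s + "$)|(^" + s + "-)"
    return re.search(pattern, tag, re.IGNORECASE) is not None


def stem_matches_simple(tag: str, stem: str) -> bool:
    """Same test written with the shorter '($|-)' alternation."""
    pattern = "^" + re.escape(stem) + "($|-)"
    return re.search(pattern, tag, re.IGNORECASE) is not None


for tag in ["es", "es-AR", "ES", "Es-es", "ese", "e"]:
    print(tag, stem_matches(tag, "es"), stem_matches_simple(tag, "es"))
```

With the stem `es`, the tags `es`, `es-AR`, `ES` and `Es-es` match, while `ese` and `e` do not.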

Aside: https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry is a bit unreadable. The script https://gist.github.com/VladimirAlexiev/8733439 turns it into a more readable Google Sheet.

TEST: @ericprud gave this example URL. For me, it doesn't load the test on first load (or control-shift-R) but loads it on second refresh (control-R):
http://rawgit.com/shexSpec/shex.js/master/doc/shex-simple.html?schema=%3CS%3E%20%7B%20%3Cp%3E%20%5B%40aa~%5D%20%7D&data=%3Cexact%3E%20%3Cp%3E%20%22exact%22%40aa%20.%0A%3Csub%3E%20%3Cp%3E%20%22sub%22%40aa-ES%20.%0A%3CshouldFail%3E%20%3Cp%3E%20%22shouldFail%22%40aaa-ES%20.%0A&shape-map=%7BFOCUS%20%3Cp%3E%20_%7D%40%3CS%3E
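
For readers who cannot load the demo, the schema, data and shape map packed into that URL can be recovered by decoding its query string. The sketch below uses only Python's standard `urllib.parse`; the variable names are arbitrary.

```python
from urllib.parse import parse_qs, urlsplit

# The test URL from above, split across lines only for readability.
url = (
    "http://rawgit.com/shexSpec/shex.js/master/doc/shex-simple.html"
    "?schema=%3CS%3E%20%7B%20%3Cp%3E%20%5B%40aa~%5D%20%7D"
    "&data=%3Cexact%3E%20%3Cp%3E%20%22exact%22%40aa%20.%0A"
    "%3Csub%3E%20%3Cp%3E%20%22sub%22%40aa-ES%20.%0A"
    "%3CshouldFail%3E%20%3Cp%3E%20%22shouldFail%22%40aaa-ES%20.%0A"
    "&shape-map=%7BFOCUS%20%3Cp%3E%20_%7D%40%3CS%3E"
)

# parse_qs percent-decodes each value and returns a list per key.
params = parse_qs(urlsplit(url).query)

for key in ("schema", "data", "shape-map"):
    print(f"--- {key} ---")
    print(params[key][0])
```

The decoded schema is `<S> { <p> [@aa~] }`, and the data holds three literals tagged `aa`, `aa-ES` and `aaa-ES`; the last one (`<shouldFail>`) is the case a plain prefix match would wrongly accept.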
