Uneven branches in multi-variants #9

Open
zbraniecki opened this issue Jul 19, 2020 · 1 comment

Comments

@zbraniecki
Owner

As per @stasm's request in #6, I encoded the AST in my proposal using Option 2.

It handles multi-variant messages like "Anne published 2 pictures", where Polish needs both a gender selector and a plural selector.

The issue I see with Option 2 is that I'm not sure how to resolve uneven selectors. For example, if we'd like to extend the example to handle both "Anne and John published 2 pictures" and "Anne published 2 pictures", we have to handle the fact that in Polish the set of genders depends on the grammatical number of the subject:

  • For singular, we have masculine, feminine and neuter
  • For plural, we have masculine-personal and non-masculine-personal.

In Fluent's proposal we would handle that via nesting:

# John arrived.
# John and Amy arrived.
key = { PLURAL($userNames) ->
    [one] { GENDER($users) ->
        [masculine] { LIST($userNames) } przyszedł
        [feminine] { LIST($userNames) } przyszła
       *[neuter] { LIST($userNames) } przyszło
    }
   *[other] { GENDER($users) ->
        [masculine-personal] { LIST($userNames) } przyszli
       *[non-masculine-personal] { LIST($userNames) } przyszły
    }
}

As you can see, nesting makes it fairly easy to encode "default" variants and uneven branches.
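To make the nested resolution concrete, here is a minimal Python sketch. The tuple-based representation and the name `resolve_nested` are assumptions made for illustration; they are not part of Fluent or of the proposal. Each level carries its own default key, which is what lets the two branches have different gender sets.

```python
# Hypothetical nested representation: each level is (default_key, variants).
# The verb forms are taken from the example above; the structure is assumed.
NESTED = (
    "other",
    {
        "one": ("neuter", {
            "masculine": "przyszedł",
            "feminine": "przyszła",
            "neuter": "przyszło",
        }),
        "other": ("non-masculine-personal", {
            "masculine-personal": "przyszli",
            "non-masculine-personal": "przyszły",
        }),
    },
)


def resolve_nested(node, keys):
    """Consume one selector value per level; on a miss, use that level's default."""
    for key in keys:
        default_key, variants = node
        node = variants.get(key, variants[default_key])
    return node
```

With this shape, an unknown gender inside the `one` branch falls back to `neuter`, while the same unknown gender inside the `other` branch falls back to `non-masculine-personal`.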

With Option 2, it becomes trickier:

key = { PLURAL($userNames), GENDER($users) ->
    [one, masculine] { LIST($userNames) } przyszedł
    [one, feminine] { LIST($userNames) } przyszła
    [one, neuter] { LIST($userNames) } przyszło
    [other, masculine-personal] { LIST($userNames) } przyszli
    [other, non-masculine-personal] { LIST($userNames) } przyszły
}

We can encode it via a single "default":

key = { PLURAL($userNames), GENDER($users) ->
    [one, masculine] { LIST($userNames) } przyszedł
    [one, feminine] { LIST($userNames) } przyszła
    [one, neuter] { LIST($userNames) } przyszło
    [other, masculine-personal] { LIST($userNames) } przyszli
   *[other, non-masculine-personal] { LIST($userNames) } przyszły
}

This is limiting, because we may resolve the plural perfectly and struggle only with gender, yet still fall back to the one default variant.
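The limitation can be shown with a small Python sketch. The flat dictionary and the function name `resolve_single_default` are hypothetical; they only model "exact match on the full tuple, otherwise the single starred variant".

```python
# Flat variant table keyed by (plural, gender); names here are illustrative.
VARIANTS = {
    ("one", "masculine"): "przyszedł",
    ("one", "feminine"): "przyszła",
    ("one", "neuter"): "przyszło",
    ("other", "masculine-personal"): "przyszli",
    ("other", "non-masculine-personal"): "przyszły",
}

# The single variant marked with * in the example above.
DEFAULT = ("other", "non-masculine-personal")


def resolve_single_default(plural, gender):
    """Return the exact match, or the one global default if there is none."""
    return VARIANTS.get((plural, gender), VARIANTS[DEFAULT])
```

Calling `resolve_single_default("one", "unknown")` yields the plural form "przyszły", even though the plural selector matched `one` exactly and only the gender was unresolved.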

Alternatively, we may have a default per selector:

key = { PLURAL($userNames), GENDER($users) ->
    [one, masculine] { LIST($userNames) } przyszedł
    [one, feminine] { LIST($userNames) } przyszła
    [one, *neuter] { LIST($userNames) } przyszło
    [*other, masculine-personal] { LIST($userNames) } przyszli
    [*other, *non-masculine-personal] { LIST($userNames) } przyszły
}

but that looks clunky.
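The issue does not define how per-selector defaults would be resolved, so the following Python sketch is only one possible reading: selectors are processed left to right, an exact match on a key wins, and otherwise only variants whose key is starred at that position stay in play. The data layout and `resolve_per_selector` are assumptions.

```python
# Each variant key is a tuple of (value, is_default) pairs, one per selector.
VARIANTS = [
    ((("one", False), ("masculine", False)), "przyszedł"),
    ((("one", False), ("feminine", False)), "przyszła"),
    ((("one", False), ("neuter", True)), "przyszło"),
    ((("other", True), ("masculine-personal", False)), "przyszli"),
    ((("other", True), ("non-masculine-personal", True)), "przyszły"),
]


def resolve_per_selector(values):
    """Narrow the candidates one selector at a time, left to right."""
    candidates = VARIANTS
    for i, value in enumerate(values):
        exact = [v for v in candidates if v[0][i][0] == value]
        # No exact match at this position: keep only the starred keys.
        candidates = exact or [v for v in candidates if v[0][i][1]]
    return candidates[0][1] if candidates else None
```

Under this reading, `("one", "unknown")` resolves to "przyszło" and `("few", "masculine-personal")` resolves to "przyszli", so a miss on one selector no longer discards a good match on the other.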

There may be other ways to encode the defaults, such as denoting them separately, but they all seem increasingly clunky to express in a human-readable and consistent way.

I'm opening this issue with three thoughts:

  • Multi-variant message design depends heavily on our decisions about default values.
  • Multi-selector message design depends heavily on our decisions about uneven branches and defaults.
@mihnita

mihnita commented Aug 5, 2020

In MessageFormat, "other" means default.
The trouble is that this is not limited to MessageFormat; it goes all the way down to plural rules and CLDR:
https://github.com/unicode-org/cldr/blob/master/common/supplemental/plurals.xml

You can see that the "other" entries only have examples, no rules.
That's because "if none of the rules apply, then return other".


So if we think of this as a switch:

switch (getPlural(locale, count)) {
   case one: ...
   case few: ...
   ...
   default: ... // this is the same as case "other"
}
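As an illustration of "other" being the rule-less fallback, here is a Python sketch with a simplified version of the Polish cardinal rules. The names `RULES_PL` and `plural_category_pl` are hypothetical, and the sketch assumes a plain numeric input, so it cannot distinguish "1" from "1.0" the way CLDR's visible-fraction-digit operand does.

```python
# Simplified Polish cardinal rules, checked in order.
# i is the integer part of the absolute value; frac says whether a
# fractional part is present. Note that "other" has no rule at all.
RULES_PL = [
    ("one", lambda i, frac: not frac and i == 1),
    ("few", lambda i, frac: not frac
        and 2 <= i % 10 <= 4 and not 12 <= i % 100 <= 14),
    ("many", lambda i, frac: not frac),
]


def plural_category_pl(n):
    """Return the first category whose rule applies, else "other"."""
    n = abs(n)
    i = int(n)
    frac = n != i
    for category, rule in RULES_PL:
        if rule(i, frac):
            return category
    return "other"  # the implicit default: no rule matched
```

Here fractional values such as 1.5 match none of the listed rules and therefore land on "other", which plays the role of the `default:` branch in the switch above.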

I think it is a good thing to have one and only one value as the fallback, and it should be as generic as possible (covering all options).

  1. Using * would mean that translators are the ones moving it around, depending on what their language prefers (some languages default to neuter, some to masculine, etc.).
    That can mess up localization tools and leveraging, and it adds extra complexity for translators.

  2. It does not match the "mental model" a programmer has about "the world":

switch (PLURAL($userNames)) {
    [one] ...
    [few] ...
    *[many] ... // WAT? https://www.destroyallsoftware.com/talks/wat :-)
    [other] ...
}

Which one is the default now? "many", because the * says so, or "other", because CLDR (which is Unicode) says so?
Remember: "other" in CLDR plurals means the same thing as "default" in a programming language's switch statement.


TLDR: I'm trying to make a case for:

key = { PLURAL($userNames), GENDER($users) ->
    [one, masculine] ...
    [one, feminine] ...
    [one, other] ...
    [other, masculine] ...
    [other, other] ... // This is the default, and the only default
}
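One way to read this proposal is that "other" acts as a catch-all in every position. The Python sketch below assumes first-match-in-declaration-order semantics, which the comment above does not specify; `resolve_with_other` and the placeholder labels standing in for the elided message bodies are hypothetical.

```python
# Variants in declaration order. The message bodies are elided ("...") in
# the example above, so placeholder labels are used here instead.
VARIANTS = [
    (("one", "masculine"), "one/masculine"),
    (("one", "feminine"), "one/feminine"),
    (("one", "other"), "one/other"),
    (("other", "masculine"), "other/masculine"),
    (("other", "other"), "other/other"),  # the only default
]


def resolve_with_other(values):
    """Return the first variant whose keys all equal the value or are "other"."""
    for keys, message in VARIANTS:
        if all(k == v or k == "other" for k, v in zip(keys, values)):
            return message
    return None  # unreachable while an (other, other) variant exists
```

Because `[other, other]` matches every input, resolution always succeeds, and no key needs a `*` marker.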
