| 1 | +# 7 April 2025 | MessageFormat Working Group Teleconference |
| 2 | + |
| 3 | +Attendees: |
| 4 | + |
| 5 | +- Addison Phillips \- Unicode (APP) \- chair |
| 6 | +- Ujjwal Sharma \- Igalia (USA) |
| 7 | +- Baha Bouali |
| 8 | +- Daniel Gleckler |
| 9 | +- Eemeli Aro \- Mozilla |
| 10 | +- Richard Gibson \- OpenJSF |
| 11 | +- Shane Carr \- Google |
| 12 | +- Tim Chevalier \- Igalia |
| 13 | +- |
| 14 | + |
| 15 | + |
| 16 | +**Scribe:** USA, APP |
| 17 | + |
| 18 | +## Topic: Info Share, Project Planning |
| 19 | + |
| 20 | +APP: Presented to CLDR TC and talked about chartering and rechartering; plans to attend the next ICU TC meeting for the same. |
| 21 | + |
| 22 | +## Topic: PR Review |
| 23 | + |
| 24 | +*Timeboxed review of items ready for merge.* |
| 25 | + |
| 26 | +| PR | Description | Recommendation | |
| 27 | +| ----- | ----- | ----- | |
| 28 | +| \#1067 | Semantic skeletons design | Discuss (but probably premature) | |
| 29 | +| \#1066 | Make the Default Bidi Strategy required and default | Discuss | |
| 30 | +| \#1065 | Draft new charter and goals for v49/v50 and beyond | Discuss, Agenda+ | |
| 31 | +| \#1064 | Rebranding Unicode MessageFormat | Discuss | |
| 32 | +| \#1063 | Fix test tags documentation | Merge | |
| 33 | + |
| 34 | +## Topic: Rechartering and Goals (\#1051) and Rebranding (\#1064) |
| 35 | + |
| 36 | +*We need to set goals for the working group since we’ve partly or wholly disposed of the ones we had. To that end, Addison has drafted new goals/charter. He presented these to CLDR-TC, asking for feedback. Let’s review:* |
| 37 | +[https://github.com/unicode-org/message-format-wg/issues/1051](https://github.com/unicode-org/message-format-wg/issues/1051) |
| 38 | +[https://github.com/unicode-org/message-format-wg/pull/1065](https://github.com/unicode-org/message-format-wg/pull/1065) |
| 39 | +[https://github.com/unicode-org/message-format-wg/blob/aphillips-draft-charter/docs/goals.md](https://github.com/unicode-org/message-format-wg/blob/aphillips-draft-charter/docs/goals.md) |
| 40 | + |
| 41 | +BAH: What is the relationship between Unicode and MessageFormat? How does it interact with Unicode? |
| 42 | + |
| 43 | +APP: The Unicode Consortium is an industry SDO of which the MessageFormat WG is a part. We’re part of the CLDR TC’s world and not directly related to the character encoding standard. We chose to call this format Unicode MessageFormat to distinguish it from ICU MessageFormat. |
| 44 | + |
| 45 | +USA: did you get ahold of Luca? |
| 46 | + |
| 47 | +APP: still pending |
| 48 | + |
| 49 | +USA: \+1 to this change. |
| 50 | +— |
| 51 | +APP: Invite folks to review the rendered goals doc (third link above). Support for \<...\> might just be the wrong shape for a goal since we just want to encourage adoption and having more of them would be a metric and not a goal. |
| 52 | + |
| 53 | +EAO: I left a comment where you introduced a goal to promote adoption by moving every feature in ICU MF to stable. I think we need to qualify that. |
| 54 | + |
| 55 | +APP: No, I haven't changed that yet. Should we put something like “all necessary functions”? |
| 56 | + |
| 57 | +EAO: We can provide a strategy for how to get ICU MF messages ported to Unicode MF and if there are any that are unsupported then we should explicitly say as much. |
| 58 | + |
| 59 | +USA: Supporting EAO’s point. The wording you have doesn’t exactly capture our goal and could lead to unintended consequences. We’re on the same page that some things from ICU MF shouldn’t make the cut, so just spell that out and there will be no misinterpretation. |
| 60 | + |
| 61 | +APP: Fair, will make that change. |
| 62 | + |
| 63 | +EAO: Will we need to refer to something? MF 1.0 for numbers and date times allows microsyntax or skeleton values. |
| 64 | + |
| 65 | +APP: Classical skeletons and picture strings. |
| 66 | + |
| 67 | +EAO: The options we’ll end up with “will support a subset of these features expressible” |
| 68 | + |
| 69 | +APP: It will make it impossible to do some things that you shouldn’t be doing anyway. |
| 70 | + |
| 71 | +EAO: For my libraries I’ve written a parser in the past for supporting these in the Intl formats, and we have support for input strings. But since they’re a subset of the whole, is there a way to express these picture strings in a format that would be acceptable in MF2? |
| 72 | + |
| 73 | +APP: People do all sorts of things with picture strings which are not going to be supported. |
| 74 | + |
| 75 | +USA: In this context, we decided MF formatters would not crash and fail on invalid input for this kind of reason. Warn the user in the translation layer in the package. It is essentially understood that the data you pass might not look specifically like a thing. MF1 =\> UMF: the thing I was doing with a picture string, I have to edit this message. |
| 76 | + |
| 77 | +APP: Fair, we should table the date time discussion for when we discuss this. There is a set of features that have existed in the Java MF space like simple date format since time immemorial that we aren’t providing but people might want that, they might write their own but we won’t be making anyone provide that. We should deliver the basic set from \#48 but we shouldn’t paint (?) ourselves into a corner and have to levitate out of there. Any thoughts? |
| 78 | + |
| 79 | +## Topic: \#1063 |
| 80 | + |
| 81 | +APP: Any objections to this? |
| 82 | +\*No objections raised\* |
| 83 | + |
| 84 | +## Topic: Semantic Skeletons |
| 85 | + |
| 86 | +*Reserving time to discuss the design.* |
| 87 | + |
| 88 | +[https://github.com/unicode-org/message-format-wg/pull/1067](https://github.com/unicode-org/message-format-wg/pull/1067) |
| 89 | + |
| 90 | +## Topic: Percent Formatting (\#956) |
| 91 | + |
| 92 | +*Reserving time to discuss whether to go with \`:percent\` or whether to use \`:unit unit=percent\` and how to handle percents if unscaled.* |
| 93 | + |
| 94 | +APP: We currently have percentage as part of the unit formatter. EAO had to dodge out; his concern was that :unit unit=percent doesn’t scale the number. A :percent function would scale the number. :math was proposed as well. There is no concrete proposal at the moment for how to add that, so that’s the current state. |
| 95 | + |
| 96 | +GLA: Do we know what the concern with the scaling was? Was it just backwards compatibility or that it would be more difficult to do it one way or another? |
| 97 | + |
| 98 | +APP: On the one hand, some existing formatters prefer to do scaling for you, and so people who expect that would like percent formatting to do the scaling for them. The assumption is that 1 implies 100%. The other argument is that for :unit, 1 with unit=percent is 1%. The question is which approach we should take, and we need to decide which works best. |
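
The two behaviours APP describes already exist side by side in ECMA-402, which may be a useful precedent for the discussion. The sketch below is illustrative only: it uses `Intl.NumberFormat` rather than any MessageFormat function, and the helper names `formatScaledPercent` and `formatUnscaledPercent` are hypothetical. It is not a proposal for how `:percent` or `:unit` should behave.

```typescript
// Style "percent" scales its input: 1 means 100%.
function formatScaledPercent(value: number, locale: string = "en-US"): string {
  return new Intl.NumberFormat(locale, { style: "percent" }).format(value);
}

// Style "unit" with unit "percent" only attaches the sign: 1 means 1%.
function formatUnscaledPercent(value: number, locale: string = "en-US"): string {
  // The cast keeps this compiling against older lib typings that predate `unit`.
  const options = { style: "unit", unit: "percent" } as Intl.NumberFormatOptions;
  return new Intl.NumberFormat(locale, options).format(value);
}

console.log(formatScaledPercent(1));    // "100%"
console.log(formatUnscaledPercent(1));  // "1%"
console.log(formatScaledPercent(0.3));  // "30%"
console.log(formatUnscaledPercent(30)); // "30%"
```

The same input produces results that differ by a factor of 100 depending on which path is taken, which is the source of the confusion raised in this topic.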
| 99 | + |
| 100 | +USA: curious why it was decided that, to be more specific, the scaling in the :unit formatter. Is there precedent? My preference would be that two ways to do this would lead to more confusion. If we can provide with/without, but the caveat be that it be quite obvious to the user which is which. Alternative would be to have both and it not be clear, requiring the user to read the docs. In which case better to do one. So with(out) scaling, better to do once and just do that. Math is bad, unless it is general purpose. Fine for the unit value to have an implied scaling because lots of other units have implied scaling. |
| 101 | + |
| 102 | +SFC: I think that percents are a fairly common use case, they have been in ICU and ECMA for a long time, having them in a separate function is motivated. I’m not yet convinced that having unit is required only because it requires a lot of data… We should do the more common thing instead which is percent formatting. |
| 103 | + |
| 104 | +APP: If you choose to implement :unit then we make the assertions but it’s not mandatory. It requires people to do a lot of work in order to get percents. We also have currency which |
| 105 | + |
| 106 | +USA: wanted to express a moderate preference to special case things that are not going to match the most generic unit. Shane noted percent special. Why include things that have a specific path for doing this which should be the recommended path. Why do in unit format. We have limited data for some things. Catch-all formatter that can do all units. Keep unit for generic |
| 107 | + |
| 108 | +GLA: I agree with you except I can see how percent would also be useful as a unit in an optional unit formatter. If you’re doing math type things you would do 0.1 to percent, but if you’re doing more generic things you could simply format it by attaching a percent sign. |
| 109 | + |
| 110 | +APP: For the currency formatter, currencies are also units for historic reasons, not because we concluded that it was a great idea. The second thing is that we can fix the scaling thing by providing an option. If we were to do :math, you would want to do a good job by giving an ergonomic API for generic math operations. |
| 111 | + |
| 112 | +USA: Might have a scale option; if we have a more privileged path and then a generic one, I wouldn’t know which to use if I came to it cold. Might be hard for me to ever learn that, and one would struggle to remember it. Accept some slight ergonomic reduction. Makes the code look less “great” because of lots of different functions, but easier to understand. That way you know this is a percent annotation… this is what it does. Similar to an option for scaling. Now you can read it and tell exactly what it does. Still tricky to communicate the default to them. Doesn't magically solve the problem. The more explicit we can be, the easier in the long run. |
| 113 | + |
| 114 | +APP: I agree and I think this relates to the discussion we had last week about semantic skeletons. They are a small, clearly documented set of options. |
| 115 | + |
| 116 | +GLA: Is there a bias towards percent? |
| 117 | + |
| 118 | +USA: Go back and check. Talk to translators, someone less technical. Had the feeling that percent is fairly universal. Not necessarily English speakers. People know what percent is. If you have %value \== x, for the most part people know what this is. Want to know from someone outside what they would think. |
| 119 | + |
| 120 | +APP: I think people do, and it’s relatively common to say “30% off the price”. Percentages are very common in the real world. From the perspective of a company I work with, I get that they’re very common things. CLDR has per-mille. I wouldn’t want to make a function for that, but a shorthand makes sense, like for currency. The next step would be to make a design doc. I want to lay it down so that once we make a decision it’s well documented. |
| 121 | + |
| 122 | +GLA: If only to point back at it and remember why we came to a certain decision. |
| 123 | + |
| 124 | + |
| 125 | +## Topic: Inflection Support |
| 126 | + |
| 127 | +*Discussion of proposals for inflection support and next steps.* |
| 128 | + |
| 129 | +Baha sent us this proposal: [https://docs.google.com/document/d/1ByapCVm0Fge\_X3oPAi8NHtJl03ZFMj-NjXxgmAgJBaM/edit?usp=sharing](https://docs.google.com/document/d/1ByapCVm0Fge_X3oPAi8NHtJl03ZFMj-NjXxgmAgJBaM/edit?usp=sharing) |
| 130 | + |
| 131 | +APP: Would you like to take us through this? |
| 132 | + |
| 133 | +BAH: I have some questions. After many discussions, we realized that inflections are for Unicode and MessageFormat would only provide the syntax/format. If I want to expand some features, would it be on the Unicode/CLDR side or in MessageFormat? The second point was to thank EAO for their feedback. If you would like me to provide more examples, I’d love to do that. |
| 134 | + |
| 135 | +APP: There is an inflection working group that is working to collect data in this area. Apple in particular has invested a lot of IP in this area. The idea is that you can provide a sentence and it can reinflect the sentence to reflect those rules. One way to think about MessageFormat is as a way for people like translators to manually perform inflections by having selectors and providing the inflected forms in separate patterns. One way we do this at the moment is through pluralization, but it’s not the only kind of inflection; in fact there are more complex kinds of inflection. There would be a synergy between them because we have patterns, but inflection implies fewer patterns and the machine would handle inflection. The organizational issue is how to achieve things. |
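
As a rough illustration of the "selection" model APP describes, where a translator supplies one pattern per grammatical case and a selector picks among them, here is a minimal sketch using ECMA-402 `Intl.PluralRules`. It does not use MessageFormat itself; the `patterns` table, the `{count}` placeholder convention, and the `formatCount` helper are all hypothetical.

```typescript
// Translator-supplied patterns, keyed by plural category.
const patterns: Record<string, string> = {
  one: "You have {count} new message.",
  other: "You have {count} new messages.",
};

// Select a pattern by plural category, then substitute the number.
function formatCount(count: number, locale: string = "en"): string {
  const category = new Intl.PluralRules(locale).select(count);
  // Fall back to "other" when no pattern exists for the selected category.
  const pattern = patterns[category] ?? patterns.other;
  return pattern.replace("{count}", String(count));
}

console.log(formatCount(1)); // "You have 1 new message."
console.log(formatCount(3)); // "You have 3 new messages."
```

An inflection engine would, in principle, let a single pattern cover both cases by inflecting the noun automatically, which is why it implies fewer static patterns.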
| 136 | + |
| 137 | +EAO: One way to think about this is to think of a message as an atom, and a message needs some data regarding how to be formatted. I need more info about inflection and the engine the WG is working on in terms of input and output. Part of the work here is to maybe modify that API so it works well with MessageFormat. The syntax is going to provide a frontend to the inflection engine. It’s going to provide some capability… but what that API looks like is a development question here. |
| 138 | + |
| 139 | +APP: MessageFormat does two things and one of them is pattern selection. Patterns not messages would be what the inflection engine would work on. The question is whether it’s a thing when they’re doing that. |
| 140 | + |
| 141 | +EAO: Also good to recognize that the engine comes from Apple originally. My understanding is that their approach to MessageFormatting is to use inflection over selection. The inflection engine might provide an alternative to this whole mental model. |
| 142 | + |
| 143 | +APP: We need to know more about how the inflection engine would work to be able to go down that path. I would make a distinction, EAO points out how we use selection for things where inflection could reduce the set of static patterns but special cases would still exist. The question is what people would need to know in order to make it work. Would people need to understand some grammar or would it be a somewhat magical box that would accept a string. |
| 144 | + |
| 145 | +BAH: You are …, it seems like the inflection effort would be in Unicode so based on what you said I’d need to work with the folks in Unicode to get any changes in. Since it’s donated by Apple and it’s mainly for Siri, I think it’s huge and it does a lot of important work but I think the feature set should be sufficient. These are my assumptions however. |
| 146 | + |
| 147 | +EAO: When you say Unicode, do you mean the Unicode Inflection group? Because the Inflection WG is the important bit here. |
| 148 | + |
| 149 | +GLA: It’s fair to say at this moment that the Inflection WG’s work will inform the MessageFormat WG’s deliverables. It’ll be up to this group to decide how the inflection engine would integrate with MessageFormat. |
| 150 | + |
| 151 | +APP: We need to understand their expectations, what it does and what the interface is like. We’re both solving the same problem but from different angles maybe. Ours is more geared towards static strings. In a world in which you can compute grammatical matches. Some constrained devices might not be able to do inflection while they can perform number matching. |
| 152 | + |
| 153 | +EAO: Inflection requires locale data, and we need to be able to communicate how to convert the data given by inflection into data that prompts the translator to express that through strings. |
| 154 | + |
| 155 | +GLA: Will this data live in CLDR? |
| 156 | + |
| 157 | +APP: It’ll live somewhere in the Unicode Consortium, I can’t say for sure about CLDR. |
| 158 | + |
| 159 | +BAH: To build on what you said, for the next time am I supposed to have more examples? What should I clarify in future meetings? |
| 160 | + |
| 161 | +EAO: I think having a better idea of how the design of the inflection engine is shaping up. |
| 162 | + |
| 163 | +APP: It’s premature for us to design this already. I believe it’s too late for 48, which is not to say that we shouldn’t start working on it. But we should understand the things EAO mentioned earlier in order to design what the interaction is like. |
| 164 | + |
| 165 | +## Topic: Issue review |
| 166 | + |
| 167 | +[https://github.com/unicode-org/message-format-wg/issues](https://github.com/unicode-org/message-format-wg/issues) |
| 168 | + |
| 169 | +Currently we have 34 open (was 34 last time). |
| 170 | + |
| 171 | +* 22 are tagged for 48 |
| 172 | +* 3 are tagged “Future” |
| 173 | +* 13 are `Preview-Feedback` |
| 174 | +* 2 are tagged Feedback |
| 175 | +* 1 is `resolve-candidate` and proposed for close. |
| 176 | +* 2 are `Agenda+` and proposed for discussion (see below) |
| 177 | +* 0 are ballots |
| 178 | + |
| 179 | +| Issue | Description | Recommendation | |
| 180 | +| ----- | ----- | ----- | |
| 181 | +| \#1043 | Deployment, development, and maintenance of messageformat.unicode.org | Discuss | |
| 182 | +| \#1051 | Plans for v48 | Discuss | |
| 183 | + |