localization support: problem message parser #819

unional · 2023-07-06T18:21:08Z

Request a feature

The maintainers of ArkType will do our best to provide prompt feedback for any feature requests- especially those that are concise and persuasive!

🤷 Motivation

What problem are you having?

The Problem class only exposes the formatted message and not the other properties,
meaning the message cannot be localized into other languages.

Why should we prioritize solving it?

Any application supporting i18n probably needs this support.

💡 Solution

How do you think we should solve the problem?

There are a few ways to do it.

One way is to simplify the Problem class so that it contains only the data, with an overridable problem parser that turns the problem into a message.

There are other ways, such as accepting a problem factory, or exposing the problem data so that applications can create their own message parser using the resulting problem instances, etc.

I think the parser approach is probably the best, as it properly separates the concerns of problem reporting and message parsing.
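
A minimal sketch of the parser approach, under stated assumptions: `ProblemData`, `ProblemParser` and `summarize` are hypothetical names invented for illustration and are not ArkType's API. The reporter only collects data, and the application injects the function that turns each problem into text.

```typescript
// Hypothetical shapes for illustration; not ArkType's actual API.
type ProblemData =
    | { code: "divisor"; path: string[]; divisor: number; actual: number }
    | { code: "missing"; path: string[] }

// The overridable piece: turns one problem into a display string.
type ProblemParser = (problem: ProblemData) => string

const englishParser: ProblemParser = (problem) => {
    const where = problem.path.join(".") || "value"
    switch (problem.code) {
        case "divisor":
            return `${where} must be a multiple of ${problem.divisor} (was ${problem.actual})`
        case "missing":
            return `${where} must be defined`
    }
}

const spanishParser: ProblemParser = (problem) => {
    const where = problem.path.join(".") || "valor"
    switch (problem.code) {
        case "divisor":
            return `${where} debe ser múltiplo de ${problem.divisor} (era ${problem.actual})`
        case "missing":
            return `${where} es obligatorio`
    }
}

// The reporter only collects data; formatting is injected by the caller.
function summarize(problems: ProblemData[], parse: ProblemParser): string {
    return problems.map(parse).join("\n")
}

const problems: ProblemData[] = [
    { code: "divisor", path: ["count"], divisor: 2, actual: 3 },
    { code: "missing", path: ["name"] }
]

console.log(summarize(problems, englishParser))
console.log(summarize(problems, spanishParser))
```

Swapping the parser changes the language without touching the validation logic, which is the separation of concerns argued for above.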


ssalbdivad · 2023-07-06T18:36:17Z

Thanks, this is definitely useful to keep in mind!

The existing Problems class is an array subclass that does have a .summary prop, but every part of the error message is actually customizable. You can also iterate over each problem individually for more fine-grained information.

Note that this API is likely to change in the upcoming release, but we will aim to support the same level of customization.

mustBe describes the primary condition and is the best option to use, because it integrates with union errors to form valid messages automatically. There are a couple of other customization options: writeReason is one level up and adds the rest of the error, and addContext is the most general and writes the whole message, including the path.

Here are a couple of simple examples from our unit tests:

    it("type config", () => {
        const t = type("true", { mustBe: "unfalse" })
        attest(t(false).problems?.summary).snap("Must be unfalse (was false)")
    })
    it("anonymous type config at path", () => {
        const unfalse = type("true", { mustBe: "unfalse" })
        const t = type({ myKey: unfalse })
        attest(t({ myKey: "500" }).problems?.summary).snap(
            "myKey must be unfalse (was '500')"
        )
        // config only applies within myKey
        attest(t({ yourKey: "500" }).problems?.summary).snap(
            "myKey must be defined"
        )
    })
    it("customized builtin problem", () => {
        const types = scope(
            { isEven: "number%2" },
            {
                codes: {
                    divisor: {
                        mustBe: (divisor) => `a multiple of ${divisor}`,
                        writeReason: (mustBe, was) => `${was} is not ${mustBe}!`
                    }
                }
            }
        ).compile()
        attest(types.isEven(3).problems?.summary).snap(
            "3 is not a multiple of 2!"
        )
    })

Here's our internal default config for each problem type so you can see how the different message parts are composed:

const defaultProblemConfig: {
    [code in ProblemCode]: ProblemDefinition<code>
} = {
    divisor: {
        mustBe: (divisor) =>
            divisor === 1 ? `an integer` : `a multiple of ${divisor}`
    },
    class: {
        mustBe: (expected) => {
            const possibleObjectKind = getExactConstructorObjectKind(expected)
            return possibleObjectKind
                ? objectKindDescriptions[possibleObjectKind]
                : `an instance of ${expected.name}`
        },
        writeReason: (mustBe, data) =>
            writeDefaultReason(mustBe, data.className)
    },
    domain: {
        mustBe: (domain) => domainDescriptions[domain],
        writeReason: (mustBe, data) => writeDefaultReason(mustBe, data.domain)
    },
    missing: {
        mustBe: () => "defined",
        writeReason: (mustBe) => writeDefaultReason(mustBe, "")
    },
    extraneous: {
        mustBe: () => "removed",
        writeReason: (mustBe) => writeDefaultReason(mustBe, "")
    },
    bound: {
        mustBe: (bound) =>
            `${Scanner.comparatorDescriptions[bound.comparator]} ${
                bound.limit
            }${bound.units ? ` ${bound.units}` : ""}`,
        writeReason: (mustBe, data) =>
            writeDefaultReason(mustBe, `${data.size}`)
    },
    regex: {
        mustBe: (expression) => `a string matching ${expression}`
    },
    value: {
        mustBe: stringify
    },
    branches: {
        mustBe: (branchProblems) =>
            describeBranches(
                branchProblems.map(
                    (problem) =>
                        `${problem.path} must be ${
                            problem.parts
                                ? describeBranches(
                                      problem.parts.map((part) => part.mustBe)
                                  )
                                : problem.mustBe
                        }`
                )
            ),
        writeReason: (mustBe, data) => `${mustBe} (was ${data})`,
        addContext: (reason, path) =>
            path.length ? `At ${path}, ${reason}` : reason
    },
    multi: {
        mustBe: (problems) => "• " + problems.map((_) => _.mustBe).join("\n• "),
        writeReason: (mustBe, data) => `${data} must be...\n${mustBe}`,
        addContext: (reason, path) =>
            path.length ? `At ${path}, ${reason}` : reason
    },
    custom: {
        mustBe: (mustBe) => mustBe
    },
    cases: {
        mustBe: (cases) => describeBranches(cases)
    }
}

Again, this will likely change soon, so if you do have feedback it would be super useful to get now!

unional · 2023-07-06T19:25:30Z

👍

One concern about i18n support is that the tooling typically behaves in a certain way that does not support element composition, similar to tailwindcss.

e.g., formatjs works the same way as tailwindcss: it parses the source code to extract the messages (in tailwindcss the class name is not composable; you can't do b-${var}, for example).

i.e. it looks at the AST for:

intl.formatMessage({
  defaultMessage: "some message for {name}"
}, {
  name: 'field name'
})

So changing individual elements like mustBe would not work for those tools.

unional · 2023-07-06T19:30:54Z

Another thing about i18n is that element composition generally does not work, even without the AST parsing issue mentioned above.

The reason is that different languages have different grammatical orders (I can't remember the exact term for this, as I'm not a linguist 😛).

So it would be best to only keep the data and let the message parser compose the message however it wants.
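
To illustrate the word-order point, here is a small sketch. It assumes a hypothetical catalog `minTemplates` and a hypothetical helper `fill`; neither belongs to ArkType or formatjs. Each locale gets one whole-sentence template with named placeholders, so the translator decides where the values go. In the German template the verb comes after the value, which a fixed composition like `${path} must be ${mustBe}` cannot express.

```typescript
// Hypothetical message catalog: one whole-sentence template per locale.
const minTemplates = {
    en: "{name} must be at least {min}",
    de: "{name} muss mindestens {min} sein",
    ja: "{name}は{min}以上でなければなりません"
}

// Replace each {key} placeholder with its value; unknown keys are left as-is.
function fill(
    template: string,
    values: Record<string, string | number>
): string {
    return template.replace(/\{(\w+)\}/g, (match: string, key: string) =>
        key in values ? String(values[key]) : match
    )
}

const values = { name: "age", min: 18 }

console.log(fill(minTemplates.en, values))
console.log(fill(minTemplates.de, values))
console.log(fill(minTemplates.ja, values))
```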

ssalbdivad · 2023-07-06T19:31:25Z

Hmm okay, I'll have to go dig into this a bit since I haven't worked with these tools.

Currently the input type for those options is:

export type MustBeWriter<code extends ProblemCode> =
    | string
    | ((source: ProblemSources[code]) => string)

export type ReasonWriter<code extends ProblemCode = ProblemCode> = (
    mustBe: string,
    data: DataWrapper<
        code extends keyof ConstrainedRuleTraversalData
            ? ConstrainedRuleTraversalData[code]
            : unknown
    >
) => string

export type ContextWriter = (reason: string, path: Path) => string

If ReasonWriter and ContextWriter also accepted string, could that be used to configure it the way you hope? It is a bit of a footgun for those who define a mustBe then don't use it in addContext or only define an addContext, which would mean that if the error occurred as part of a union it would still have its original unconfigured mustBe message.

ssalbdivad · 2023-07-06T19:34:44Z

> So it would be best to only keep the data and let the message parser compose the message however it wants.

To be clear, the data is still there. If you iterate over the array of problems, each one will have these props:

code: The kind of problem, e.g. "range"
path: Path at which the problem occurred
data: The exact value associated with the problem
source: More detailed information about the problem based on its code

Could you just use this problems object to create the messages you need?

The reason these other options exist is that writing a good error message for a union is very complex, so we want to give people the option to avoid it by relying on our solutions (or integrating with them). It's unfortunate that wouldn't be compatible with some of the tools you mentioned, but if you're just wanting the raw data describing the nature of the problem, that is already there, so you can do whatever you want with it.

unional · 2023-07-06T20:02:09Z

I think the data and source are private atm: https://github.com/arktypeio/arktype/blob/main/src/traverse/problems.ts#L42-L43

Yes, with that, it would support the "exposing the problem data so that applications can create their own message parser using the resulting problem instances" scenario.

i.e. something like:

export function Component() {
  return <SomeField validate={(v, fields) => {
     const { problems } = someFieldValidator(v)
     if (problems && problems.length > 0) {
       return i18nProblems({ name: 'someField', fields }, problems)
     }
  }}/>
}


function i18nProblems(field: FieldInfo, problems: Problem[]) {
  return problems.map(p => i18nProblem(field, p))
}

function i18nProblem(field: FieldInfo, problem: Problem) {
  switch (problem.code) {
    case 'required':
      return intl.formatMessage(
        { defaultMessage: `{name} is required` },
        field
      )
    case 'min':
      return intl.formatMessage(
        { defaultMessage: `{name} must be more than {min}` },
        { name: field.name, min: problem.data.min }
      )
    // ...
  }
}

ssalbdivad · 2023-07-06T20:05:49Z

@unional Whoops, my bad! Not so useful if you can't access them😅

I'll be sure those are exposed for the next release, and the rest of this gives me good context to consider for some other details of this problem. I'll keep this issue open until then to ensure I follow through with that!

unional · 2023-07-06T20:36:48Z

btw, on a separate note, I'm working on a custom parser for my application,
and I am considering this format:

const name = textParser({
  min: [3, (field, value) => format(...)],
})

ssalbdivad · 2023-07-06T20:42:02Z

@unional Could you link it to me if it's open source or otherwise provide a bit of additional context in the example, e.g. what the input/output types are?

That would help me use it as a reference for the kind of solution that would work well for you.

Dimava · 2023-07-10T18:26:02Z

const username = type(
  [ 'string >= 3', 'throws', ({err, data, path, validatorContext, outerContext: { lang }}) => ({ en: 'No!', fr: 'Noes!', ru: 'Нет!' })[lang] ]
)
username(1, { lang: 'en', keysDefault: 'strict', algorithm: 'clone' }) // { success: false, error: whatever('No!') }

maybe?

unional · 2023-07-10T19:04:15Z

Hi, sorry, I missed that message @ssalbdivad.

> Could you link it to me if it's open source or otherwise provide a bit of additional context in the example, e.g. what the input/output types are?

Please don't worry about it. It's not open source and I'm just trying to figure out how to do it myself too.

TheOrdinaryWow · 2025-01-26T13:10:31Z

Any update?

ssalbdivad · 2025-01-26T17:06:29Z

My current recommendation would be to map over ArkErrors, each of which has context on the individual error, and transform those to localized messages directly.

I'm interested in further improvements here but I don't have a clear idea of the right API for that yet and would need to do more research on popular i18n libraries.
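
As a rough sketch of that recommendation: the `ErrorLike` interface below is an assumed stand-in for the fields an individual error exposes, and `catalog` and `localize` are hypothetical names. Check the ArkErrors documentation for the real properties before relying on any of them.

```typescript
// Assumed minimal shape of one validation error; a stand-in for whatever
// fields the real ArkErrors entries expose.
interface ErrorLike {
    code: string
    path: readonly (string | number)[]
    expected: string
    actual: string
}

type Locale = "en" | "zh"

// Hypothetical catalog keyed by locale, then by error code.
const catalog: Record<Locale, Record<string, (e: ErrorLike) => string>> = {
    en: {
        domain: (e) =>
            `${e.path.join(".")} must be ${e.expected} (was ${e.actual})`,
        required: (e) => `${e.path.join(".")} is required`
    },
    zh: {
        domain: (e) =>
            `${e.path.join(".")} 需为 ${e.expected} (得到 ${e.actual})`,
        required: (e) => `${e.path.join(".")} 为必填项`
    }
}

function localize(errors: readonly ErrorLike[], locale: Locale): string[] {
    return errors.map((e) => {
        const write = catalog[locale][e.code]
        // Fall back to a generic message for codes the catalog does not cover.
        return write ? write(e) : `${e.path.join(".")}: ${e.code}`
    })
}

const sampleErrors: ErrorLike[] = [
    { code: "domain", path: ["user", "age"], expected: "a number", actual: "string" },
    { code: "required", path: ["email"], expected: "", actual: "" }
]

console.log(localize(sampleErrors, "en"))
console.log(localize(sampleErrors, "zh"))
```

The fallback branch matters in practice: a catalog written by hand will lag behind the set of error codes the library can emit.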

TheOrdinaryWow · 2025-01-26T17:49:11Z

> My current recommendation would be to map over ArkErrors, each of which has context on the individual error, and transform those to localized messages directly.
>
> I'm interested in further improvements here but I don't have a clear idea of the right API for that yet and would need to do more research on popular i18n libraries.

My idea is to maintain locales internally like date-fns, which allows users to specify a locale in options.

unional · 2025-01-26T18:03:19Z

> My idea is to maintain locales internally like date-fns, which allows users to specify a locale in options

That likely wouldn't work, because validation is logic and thus dynamic (i.e. it depends on context).

Localization in date-fns can be internalized because it describes a value. When translated, all words related to the value stay together and do not vary "in common form".

I put "common form" in quotes because there are variations, which such an approach wouldn't support.

For example, you can write a number with different characters in Chinese:

1 (one) can be written as 一, or 壹.

TheOrdinaryWow · 2025-01-26T20:40:57Z

> > My idea is to maintain locales internally like date-fns, which allows users to specify a locale in options
>
> That likely wouldn't work, because validation is logic and thus dynamic (i.e. it depends on context).
>
> Localization in date-fns can be internalized because it describes a value. When translated, all words related to the value stay together and do not vary "in common form".
>
> I put "common form" in quotes because there are variations, which such an approach wouldn't support.
>
> For example, you can write a number with different characters in Chinese:
>
> 1 (one) can be written as 一, or 壹.

What about using a combination of template strings and locale-specific error types? e.g.

en: must be a ${type} (was ${type})
zh: 需为 ${type} (得到 ${type})

Or to manipulate it via a function like

(name: string, errors: ArkErrors) => string

TheOrdinaryWow · 2025-01-26T20:47:05Z

> My current recommendation would be to map over ArkErrors, each of which has context on the individual error, and transform those to localized messages directly.
>
> I'm interested in further improvements here but I don't have a clear idea of the right API for that yet and would need to do more research on popular i18n libraries.

The problem now is that in zod, we can easily specify each type's error message, like

z.object({
  email: z
    .string({
      required_error: t("tips.field_required"),
      invalid_type_error: t("invalid_value"),
    })
    .email(t("msgs.invalid_email"))
    .min(1, t("msgs.incorrect_length"))
    .max(64, t("msgs.incorrect_length"))
})

But in ArkType, I did not see an easy approach; managing error messages is quite complicated.

For me, that's a major reason that stopped me from using ArkType. I appreciate that it provides both "text" and "fluent" methods for defining types, but the lack of a way to manage the "options" (or "details") of each type is a bit confusing. From this perspective I think zod did a better job.

In other words, this might not be a "shortcoming" of ArkType; it's that ArkType and zod actually have different designs.

With zod, it's acceptable to either specify error messages in each type's options or manage i18n globally (I'm a bit annoyed by having to set localized error messages in EVERY type even though they are almost the same, so I'm desperately looking for an alternative).

But in ArkType, the type-definition design only allows managing localization globally, since for types specified with the "text" method there is, as far as I can see, no concept of "options". So in this situation, the best solution I can imagine is to manage i18n the way date-fns does, except configured via a global option.

unional · 2025-01-26T22:50:43Z

Similar situation here. This is the main reason we can't use arktype.

What you show is one way to achieve that. The bottom line is who has control of the content, and who has control of the architecture.

The library (arktype) accepting a value or callback for the localization controls the architecture and gives control of the content to the consumer.

If it returns a validation result with all information needed for the consumer to process, it gives both control of the architecture and content to the consumer.

IMO the latter is the better approach.
The localization of error messages is formatting, i.e. transforming data (error state) into some presentable form (localized string).

The presentation doesn't need to be only strings. What if the consumer wants to have different CSS styles for different errors?

This is essentially related to SRP. Don't take additional responsibility when not necessary.
It is beneficial for arktype to not get into the matter of formatting. It is one less thing to maintain, and it is more flexible.
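
A small sketch of why returning data rather than strings keeps the presentation open. `ErrorState`, `Presentation` and `present` are hypothetical names and the CSS class names are made up for illustration. The consumer maps each error state to text plus a style, which a string-only formatter could not offer.

```typescript
// Hypothetical error-state shape returned by a validator.
interface ErrorState {
    code: "required" | "min"
    field: string
}

// A presentation can carry more than text, e.g. a CSS class per error kind.
interface Presentation {
    text: string
    className: string
}

// The consumer owns this mapping, so it controls both wording and styling.
function present(error: ErrorState): Presentation {
    switch (error.code) {
        case "required":
            return {
                text: `${error.field} is required`,
                className: "error-blocking"
            }
        case "min":
            return {
                text: `${error.field} is too short`,
                className: "error-hint"
            }
    }
}

console.log(present({ code: "required", field: "email" }))
```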

ssalbdivad · 2025-01-26T22:58:07Z

@unional If I'm understanding correctly, this is already available.

You can introspect data related to the errors in the final ArkErrors result without needing to customize anything via the builtin error configs and build the localized messages however you want.

unional · 2025-01-26T23:07:36Z

> You can introspect data related to the errors in the final ArkErrors result without needing to customize anything via the builtin error configs and build the localized messages however you want.

Hi there @ssalbdivad 😊

So seems like the needed data is exposed now? Haven't come back to look into arktype since our last conversation. If it is, feel free to close this issue. 🍻

ssalbdivad · 2025-01-26T23:13:52Z

@unional I actually think that data was always exposed, although ArkError does need some documentation: #1273

I will close this for now in favor of this discussion for additional feedback on how we can best support i18n internally.

github-project-automation bot added this to arktypeio Jul 6, 2023

github-project-automation bot moved this to To do in arktypeio Jul 6, 2023

ssalbdivad moved this from To do to In progress in arktypeio Jul 6, 2023

ssalbdivad moved this from In progress to Backlog in arktypeio Dec 12, 2023

ssalbdivad mentioned this issue Jan 9, 2024

Custom error message #895

Closed

ssalbdivad closed this as completed Jan 26, 2025

github-project-automation bot moved this from Backlog to Done (merged or closed) in arktypeio Jan 26, 2025
