Adding localization to scripts #641

Closed
@rpatters1

Description

Automatic Localization Approach

I am contemplating adding the ability to automatically localize scripts into a user's preferred language using the OpenAI API library. To that end, I will introduce the following in RGP Lua (for 0.71):

  • FCControl:SetAutoResizeWidth (which requests auto-sizing of supported controls before the window is running)
  • FCControl:AssureWidthForText (which you can use while the window is running)

Then I am imagining a new Lua library, localization. Each script would include a global table of all of its strings, like this:

localization_base =
{
    ["Hello"] = "Hello",
    ["Goodbye"] = "Goodbye",
}

The library would export a function localize(string) that does the following:

  • Extract the user's preferred 2-character language code from finenv.UI():GetUserLocaleName.
  • Check whether a global table named "localization_" .. language_code exists. If so, look up the input string in that table.
  • If found, return the string.
  • Otherwise, look for localization_base.
  • If found, and if an OpenAI key exists, auto-generate a language table for the user's language. The beauty of this approach is that it only requires a single call to OpenAI to translate all the strings in the script at once.
  • If nothing else works, it returns the input string unchanged.
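
The lookup order above could be sketched roughly as follows. This is a minimal sketch, not the planned implementation: the locale_name field and the auto_translate hook are hypothetical stand-ins for finenv.UI():GetUserLocaleName and the OpenAI call, so that the code runs in any Lua interpreter. It also assumes auto-generation is attempted only when no table exists yet for the language.

```lua
-- Sketch only. Apart from localize, localization_base, and the
-- localization_<code> tables, every name here is hypothetical.
local localization = {}

-- Assumption: in RGP Lua this would come from finenv.UI():GetUserLocaleName.
-- It is a plain field here so the sketch runs outside RGP Lua.
localization.locale_name = "en_US"

-- Hypothetical hook: function(base_table, language_code) -> table or nil.
-- It stands in for the single OpenAI call; nil means no key is available.
localization.auto_translate = nil

function localization.localize(input)
    -- First two characters of the locale name, e.g. "es" from "es_ES".
    local code = localization.locale_name:sub(1, 2):lower()
    local table_name = "localization_" .. code
    local lang_table = _G[table_name]
    if type(lang_table) == "table" and lang_table[input] ~= nil then
        return lang_table[input]
    end
    -- No table for this language yet: try to generate one from the base table.
    local base = _G["localization_base"]
    if lang_table == nil and type(base) == "table" and localization.auto_translate then
        local generated = localization.auto_translate(base, code)
        if type(generated) == "table" then
            _G[table_name] = generated -- cache so translation happens only once
            if generated[input] ~= nil then
                return generated[input]
            end
        end
    end
    -- Nothing else worked: return the input unchanged.
    return input
end

-- Example tables, global so that the _G lookup can find them.
localization_base = { ["Hello"] = "Hello", ["Goodbye"] = "Goodbye" }
localization_es = { ["Hello"] = "Hola", ["Goodbye"] = "Adiós" }

localization.locale_name = "es_ES"
print(localization.localize("Hello"))  -- Hola
print(localization.localize("Cancel")) -- Cancel (not in any table)
```

Storing the generated table in a global means that later calls in the same Lua state find it on the first lookup.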

The script-writer then wraps all hard-coded strings inside calls to localize. It should be fairly simple to write a regex that does this. (Or get ChatGPT to write one for you, which is what I would do.)
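
As a rough illustration of that wrapping step, here is a naive sketch using a Lua pattern rather than a full regex. The helper name wrap_strings is hypothetical. It only handles simple double-quoted literals on one line; it ignores escaped quotes, single-quoted and long strings, and comments, and it would wrap strings that should not be translated (such as table keys or require arguments), so the output would need a manual review.

```lua
-- Hypothetical helper: wraps each simple double-quoted literal in localize().
-- Assumes no escaped quotes inside the literals and no already-wrapped strings.
local function wrap_strings(source)
    -- The extra parentheses drop gsub's second return value (the match count).
    return (source:gsub('("[^"\n]*")', "localize(%1)"))
end

print(wrap_strings('dialog:SetTitle("Hello")'))
```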

Possible Optimizations

Some ways to speed up the script would be

  • Set finenv.RetainLuaState = true so that it would not have to call OpenAI each time you invoked the script.
  • Optionally embed additional languages, e.g.:
localization_es =
{
    ["Hello"] = "Hola",
    ["Goodbye"] = "Adiós",
}

localization_ja =
{
    ["Hello"] = "今日は",
    ["Goodbye"] = "さようなら",
}

Utility Functions

In addition to localize, the library would have a utility function for developers:

generate_localization(language_code)

This function would search the currently running script for all strings, create the localization_ table for the given language, and copy it to the clipboard. The developer could then paste the table into the script to support that language directly. The same function could also generate the localization_base table, in which case it would not call OpenAI.
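
The table-generation half of this could look something like the sketch below. It covers only the formatting step: collecting the strings from the script, calling OpenAI, and copying to the clipboard are left out. The helper name format_localization_table is hypothetical.

```lua
-- Hypothetical helper: renders a table of translations as pasteable Lua source.
local function format_localization_table(language_code, translations)
    -- Sort the keys so the generated text is stable between runs.
    local keys = {}
    for key in pairs(translations) do
        keys[#keys + 1] = key
    end
    table.sort(keys)
    local lines = { "localization_" .. language_code .. " =", "{" }
    for _, key in ipairs(keys) do
        -- %q quotes the string and escapes embedded quotes and backslashes.
        lines[#lines + 1] = string.format("    [%q] = %q,", key, translations[key])
    end
    lines[#lines + 1] = "}"
    return table.concat(lines, "\n")
end

print(format_localization_table("es", { ["Hello"] = "Hola", ["Goodbye"] = "Adios" }))
```

Passing "base" as the code together with a table that maps each string to itself would produce the localization_base table in the same format.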

Issues and Concerns

  • LLM hallucinations. LLMs have gotten quite reliably good at language translation, but there will inevitably be mistakes. We probably need a way for a user to disable auto-translation if a particular language is not well-supported in the LLM.
  • Right-to-left languages. I don't know that anyone has ever tested the PDK Framework dialog box system with right-to-left languages. It probably doesn't work. We need a way to detect that a language is right-to-left and not auto-generate translations for it. Or maybe (at minimum) a way for the user to disable the feature (as above) if it is not generating useful results.
  • Strings overrunning the layout. While auto-sizing the width of individual controls is fairly straightforward, auto-sizing the control layout is beyond the scope of any change I am prepared to make in the PDK Framework. (The current code is fairly opaque.) Dialog boxes will have to be sized and laid out to fit the longest versions of the strings. (My suggestion is to use Spanish to determine the needed layout size. I'm sure there are languages with longer strings, but Spanish strings are quite long, mainly due to the lack of a direct possessive form.)
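
For the first two concerns, one possible guard is sketched below. The function name, the list of right-to-left codes (which is not exhaustive), and the idea of a per-user table of disabled codes are all assumptions for illustration.

```lua
-- Assumption: a small, non-exhaustive set of right-to-left language codes.
local RIGHT_TO_LEFT = { ar = true, he = true, fa = true, ur = true, yi = true, ps = true }

-- Hypothetical guard: returns false when auto-translation should be skipped.
-- disabled_codes is an optional set such as { de = true } chosen by the user.
local function can_auto_translate(locale_name, disabled_codes)
    local code = locale_name:sub(1, 2):lower()
    if RIGHT_TO_LEFT[code] then
        return false
    end
    if disabled_codes and disabled_codes[code] then
        return false
    end
    return true
end

print(can_auto_translate("es_ES"))
print(can_auto_translate("ar_SA"))
```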

I welcome ideas and concerns.

Labels: enhancement (New feature or request)
