Adding localization to scripts #641

Closed
rpatters1 opened this issue Jan 16, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@rpatters1
Collaborator

rpatters1 commented Jan 16, 2024

Automatic Localization Approach

I am contemplating adding the ability to automatically localize scripts into a user's preferred language using the OpenAI API library. To that end, I will introduce the following in RGP Lua (for 0.71):

  • FCControl:SetAutoResizeWidth (which requests auto-sizing of supported controls before the window is running)
  • FCControl:AssureWidthForText (which you can use while the window is running)

Then I am imagining a new Lua library, localization. Each script would include a global table of all of its strings like this:

localization_base =
{
    ["Hello"] = "Hello",
    ["Goodbye"] = "Goodbye",
}

The library would export a function `localize(string)` that did the following:

  • Extract the user's preferred 2-character language code from finenv.UI():GetUserLocaleName.
  • Check to see if a global table exists for `"localization_" .. language_code`. If so, it looks up the input string in that table.
  • If found, return the string.
  • Otherwise, look for localization_base.
  • If found, and if an OpenAI key exists, auto-generate a language table for the user's language. The beauty of this approach is that it only requires a single call to OpenAI to translate all the strings in the script at once.
  • If nothing else works, it returns the input string unchanged.
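
The lookup order above could be sketched roughly as follows. This is a minimal illustration, not the library's actual code. The locale is passed in as a parameter (in RGP Lua it would come from finenv.UI():GetUserLocaleName), and auto_translate is a hypothetical placeholder for the OpenAI request that always fails here. The sketch also assumes that auto-generation is only attempted when no table exists yet for the language.

```lua
-- Sketch only. The string tables are globals, as proposed above.
localization_base = {
    ["Hello"] = "Hello",
    ["Goodbye"] = "Goodbye",
}

localization_es = {
    ["Hello"] = "Hola",
    ["Goodbye"] = "Adios",
}

-- Hypothetical placeholder for the OpenAI request. A real version would
-- translate every string in base_table in one call and return a new table.
local function auto_translate(base_table, language_code)
    return nil -- behaves as if no OpenAI key were configured
end

-- locale_name is assumed to look like "es_ES"; only the first 2 characters are used.
function localize(input, locale_name)
    local language_code = locale_name:sub(1, 2)
    local table_name = "localization_" .. language_code
    local translations = _G[table_name]

    -- 1. Use an existing language table if it contains the string.
    if type(translations) == "table" and translations[input] then
        return translations[input]
    end

    -- 2. No language table yet: try to generate one from localization_base.
    if translations == nil and type(localization_base) == "table" then
        local generated = auto_translate(localization_base, language_code)
        if generated then
            _G[table_name] = generated
            return generated[input] or input
        end
    end

    -- 3. Fall back to the input string unchanged.
    return input
end

print(localize("Hello", "es_ES")) --> Hola
print(localize("Hello", "fr_FR")) --> Hello (no French table, no translation available)
```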

The script-writer then wraps all hard-coded strings inside calls to localize. It should be fairly simple to write a regex that does this. (Or get ChatGPT to write one for you, which is what I would do.)

Possible Optimizations

Some ways to speed up the script would be

  • Set finenv.RetainLuaState = true so that it would not have to call OpenAI each time you invoked the script.
  • Optionally embed additional languages, e.g.:
localization_es =
{
    ["Hello"] = "Hola",
    ["Goodbye"] = "Adios",
}

localization_ja =
{
    ["Hello"] = "今日は",
    ["Goodbye"] = "さようなら",
}
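
The first optimization amounts to caching the generated table in a global so that a retained Lua state never requests it twice. A minimal sketch, assuming a hypothetical request_translation function that stands in for the OpenAI call and simply tags each string with the language code:

```lua
-- Keep the Lua state (and therefore globals) alive between invocations.
-- finenv only exists inside RGP Lua, so the guard lets this sketch run standalone.
if finenv then
    finenv.RetainLuaState = true
end

localization_base = localization_base or {
    ["Hello"] = "Hello",
    ["Goodbye"] = "Goodbye",
}

-- Counts how often the (simulated) translation request is made.
translation_request_count = translation_request_count or 0

-- Hypothetical stand-in for the single OpenAI request.
local function request_translation(base_table, language_code)
    translation_request_count = translation_request_count + 1
    local result = {}
    for key, value in pairs(base_table) do
        result[key] = "[" .. language_code .. "] " .. value
    end
    return result
end

-- Returns the cached table for language_code, generating it on first use only.
function get_language_table(language_code)
    local table_name = "localization_" .. language_code
    if type(_G[table_name]) ~= "table" then
        _G[table_name] = request_translation(localization_base, language_code)
    end
    return _G[table_name]
end

get_language_table("fr")
get_language_table("fr")
print(translation_request_count) --> 1
```

Embedded tables such as localization_es take the same path: because the global already exists, the request is never made for that language.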

Utility Functions

In addition to localize, the library would have a utility function for developers:

generate_localization(language_code)

This function would search the currently running script for all strings, create the localization_ table for the given language code, and copy it to the clipboard. The developer could then paste this into the script and provide direct support for a language that way. This function could also be used to generate the _base table. In that case, it would not call OpenAI.
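
The table-emitting half of that utility might look something like the sketch below. It covers only the formatting step. Finding the strings in the script, calling OpenAI, and copying to the clipboard are left out, and format_localization_table is a hypothetical helper name rather than part of the proposed library.

```lua
-- Formats a table of strings as pasteable Lua source, in the layout used above.
-- Keys are sorted so that the output is stable from run to run.
function format_localization_table(language_code, strings)
    local keys = {}
    for key in pairs(strings) do
        keys[#keys + 1] = key
    end
    table.sort(keys)

    local lines = { "localization_" .. language_code .. " =", "{" }
    for _, key in ipairs(keys) do
        -- %q quotes the string and escapes embedded quotes and backslashes.
        lines[#lines + 1] = string.format("    [%q] = %q,", key, strings[key])
    end
    lines[#lines + 1] = "}"
    return table.concat(lines, "\n")
end

print(format_localization_table("es", {
    ["Hello"] = "Hola",
    ["Goodbye"] = "Adios",
}))
```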

Issues and Concerns

  • LLM hallucinations. LLMs have gotten quite reliably good at language translation, but there will inevitably be mistakes. We probably need a way for a user to disable auto-translation if a particular language is not well-supported in the LLM.
  • Right-to-left languages. I don't know that anyone has ever tested the PDK Framework dialog box system with right-to-left languages. It probably doesn't work. We need a way to detect that a language is right-to-left and not auto-generate translations for it. Or maybe (at minimum) a way for the user to disable the feature (as above) if it is not generating useful results.
  • Strings overrunning the layout. While auto-sizing the width of individual controls is fairly straightforward, auto-sizing the control layout is beyond the scope of any change I am prepared to make in the PDK Framework. (The current code is fairly opaque.) Dialog boxes will have to be sized and laid out to fit the longest versions of the strings. (My suggestion is to use Spanish to determine the needed layout size. I'm sure there are languages with longer strings, but Spanish strings are quite long, mainly due to the lack of a direct possessive form.)
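
One possible shape for the first two safeguards is a single gate that is checked before any auto-translation is attempted. This is only a sketch: the list of right-to-left language codes is deliberately non-exhaustive, and user_disabled is a hypothetical preference that does not exist anywhere yet.

```lua
-- Non-exhaustive set of ISO 639-1 codes for languages normally written right-to-left.
local rtl_languages = {
    ar = true, -- Arabic
    fa = true, -- Persian
    he = true, -- Hebrew
    ur = true, -- Urdu
}

-- Returns true if auto-translation should be attempted for language_code.
-- user_disabled is a hypothetical user preference (true means "never auto-translate").
function should_auto_translate(language_code, user_disabled)
    if user_disabled then
        return false
    end
    if rtl_languages[language_code:lower()] then
        return false
    end
    return true
end

print(should_auto_translate("es", false)) --> true
print(should_auto_translate("he", false)) --> false
print(should_auto_translate("es", true))  --> false
```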

I welcome ideas and concerns.

@rpatters1 rpatters1 added the enhancement New feature or request label Jan 16, 2024
@jwink75
Collaborator

jwink75 commented Jan 16, 2024 via email

@rpatters1
Collaborator Author

rpatters1 commented Jan 16, 2024

LLMs actually do really well with longer text. Shorter text might be more challenging without proper context. I plan to hone the prompt to give it context.

There are 2 potential OpenAI accounts that might exist: the user's and the developer's. Neither is required. Here is the decision tree:

  • Neither has an OpenAI account: the base values are used (i.e., no localization)
  • The Developer has an account but the user does not: the user gets a localization if the developer has included it in the script. Otherwise the user gets base values.
  • The User has an account but the developer does not: the user gets an on-the-fly automatic localization.
  • Both have an account: the user gets an included localization if there is one, otherwise the user gets an on-the-fly automatic localization.

The developer could include a localization from some other source than OpenAI, but this situation would be included in the "developer has OpenAI account" case.
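
Since the developer's account only matters through whether a localization is embedded in the script, the decision tree reduces to two questions. A minimal sketch, with hypothetical function and parameter names:

```lua
-- has_embedded_table: the script ships a localization_ table for the user's language
--                     (however the developer produced it).
-- user_has_openai_key: the user has an OpenAI key configured.
-- Returns "embedded", "auto", or "base".
function choose_localization_source(has_embedded_table, user_has_openai_key)
    if has_embedded_table then
        return "embedded"
    end
    if user_has_openai_key then
        return "auto"
    end
    return "base"
end

print(choose_localization_source(false, false)) --> base
print(choose_localization_source(true, false))  --> embedded
print(choose_localization_source(false, true))  --> auto
print(choose_localization_source(true, true))   --> embedded
```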

@ThistleSifter
Member

Being able to position controls relative to other controls would help with this. E.g., this text box is always 12 units below that static text. I was toying with the idea of adding this to the control mixins a while ago.

@rpatters1
Collaborator Author

I have added a non-overlap function and a horizontal align function. I may add one or two more. It would be a challenge in mixin because to get the proper text width you need the window to be running, but you need to set up control position and size before the opaque auto-layout routines run. There is (currently) no way for Lua to hook into the process at the right moment. It could be added, of course, but I think what I'm working on will accommodate most needs on the horizontal axis. I am not tackling the vertical axis.

@rpatters1
Collaborator Author

This was closed with #649.

3 participants