Support for Chinese Characters #14934
Merged: mergify merged 35 commits into develop from wip/adr/utf8-friendly-getwindowsdocumentspath on Apr 16, 2026
Commits
b7fbc1e powershell version (AdRiley)
485f14b logging (AdRiley)
cd9dd23 fix (AdRiley)
0d99e30 fix (AdRiley)
3e44d2b pretty (AdRiley)
b485036 more logs (AdRiley)
df07290 jvm (AdRiley)
1923370 Switch to wide versions of windows APIs (AdRiley)
5fee9d3 Revert "Switch to wide versions of windows APIs" (AdRiley)
04091fc Revert "jvm" (AdRiley)
4fc3ae2 Try base64 encoding (jdunkerley)
66786c7 Minimal powershell - does it work? (jdunkerley)
dbacec7 Does need the encoding bit. (jdunkerley)
decb222 Reapply "Switch to wide versions of windows APIs" (jdunkerley)
b34ac96 Possibly working with W version... (jdunkerley)
4146ba9 Add unicode tests. (jdunkerley)
b4d2584 Use Unicode function to get command line arguments. (jdunkerley)
b3fdb6d WIP (jdunkerley)
387d0c3 First fully working version! (jdunkerley)
a5d617e Remove some of the diagnostics. (jdunkerley)
6593697 Move diagnostics in Java to Logger. (jdunkerley)
848727f Use Windows API for parsing out command line args. (jdunkerley)
be2429b Last bits of Java tidy up. (jdunkerley)
ee7f518 Remove unused. (jdunkerley)
2009d89 More logging tidy up. (jdunkerley)
e9a95fb More logging tidy up. (jdunkerley)
25895bd Java format. (jdunkerley)
13df435 Java format. (jdunkerley)
6ea59ce Reset last bits. (jdunkerley)
3c9ace9 Prettier. (jdunkerley)
1f8dab8 Remove console.debug statements. (jdunkerley)
8929e6d PR comments (Part 1). (jdunkerley)
5752cab Use unmanaged. (jdunkerley)
d435d83 Drop arg[0] as don't want that in args array. (jdunkerley)
382cceb Arguments at trace. (jdunkerley)
14 changes: 14 additions & 0 deletions
lib/java/os-environment/src/main/java/org/enso/os/environment/Arguments.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| package org.enso.os.environment; | ||
|
|
||
| import org.enso.common.Platform; | ||
|
|
||
| public sealed interface Arguments permits WindowsArguments, LinuxArguments { | ||
| static Arguments getCurrent() { | ||
| return switch (Platform.getOperatingSystem()) { | ||
| case LINUX, MACOS -> LinuxArguments.INSTANCE; | ||
| case WINDOWS -> WindowsArguments.INSTANCE; | ||
| }; | ||
| } | ||
|
|
||
| String[] alterArgs(String[] originalArgs); | ||
| } |
12 changes: 12 additions & 0 deletions
lib/java/os-environment/src/main/java/org/enso/os/environment/LinuxArguments.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| package org.enso.os.environment; | ||
|
|
||
| final class LinuxArguments implements Arguments { | ||
| static final LinuxArguments INSTANCE = new LinuxArguments(); | ||
|
|
||
| private LinuxArguments() {} | ||
|
|
||
| @Override | ||
| public String[] alterArgs(String[] originalArgs) { | ||
| return originalArgs; | ||
| } | ||
| } |
101 changes: 101 additions & 0 deletions
lib/java/os-environment/src/main/java/org/enso/os/environment/WindowsArguments.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,101 @@ | ||
| package org.enso.os.environment; | ||
|
|
||
| import java.util.List; | ||
| import org.enso.common.Platform; | ||
| import org.graalvm.nativeimage.ImageInfo; | ||
| import org.graalvm.nativeimage.StackValue; | ||
| import org.graalvm.nativeimage.c.CContext; | ||
| import org.graalvm.nativeimage.c.function.CFunction; | ||
| import org.graalvm.nativeimage.c.struct.CPointerTo; | ||
| import org.graalvm.nativeimage.c.type.CIntPointer; | ||
| import org.graalvm.nativeimage.c.type.CTypeConversion; | ||
| import org.graalvm.word.PointerBase; | ||
| import org.slf4j.Logger; | ||
| import org.slf4j.LoggerFactory; | ||
|
|
||
| @CContext(WindowsArguments.Directives.class) | ||
| final class WindowsArguments implements Arguments { | ||
| static final WindowsArguments INSTANCE = new WindowsArguments(); | ||
|
|
||
| private WindowsArguments() {} | ||
|
|
||
| @Override | ||
| public String[] alterArgs(String[] originalArgs) { | ||
| if (!ImageInfo.inImageRuntimeCode()) { | ||
| return originalArgs; | ||
| } | ||
|
|
||
| return readCommandLineArgs(); | ||
| } | ||
|
|
||
| private static final Logger LOGGER = LoggerFactory.getLogger(WindowsArguments.class); | ||
|
|
||
| private static final int WCHAR_SIZE = 2; | ||
|
|
||
| private static String[] readCommandLineArgs() { | ||
| var cmd = GetCommandLineW(); | ||
|
|
||
| CIntPointer numOfArgs = StackValue.get(Long.BYTES); | ||
|
|
||
| WCharPointerPointer args = CommandLineToArgvW(cmd, numOfArgs); | ||
| try { | ||
| var numArgs = numOfArgs.read(); | ||
|
|
||
| var results = new String[numArgs - 1]; | ||
| for (var i = 0; i < results.length; i++) { | ||
| var arg = args.read(i + 1); | ||
| results[i] = toJavaString(arg); | ||
| LOGGER.trace("Read command line argument {}: {}", i, results[i]); | ||
| } | ||
|
|
||
| return results; | ||
| } finally { | ||
| LocalFree(args); | ||
| } | ||
| } | ||
|
|
||
| private static String toJavaString(WCharPointer arg) { | ||
| return CTypeConversion.asByteBuffer(arg, wcslen(arg) * WCHAR_SIZE) | ||
| .order(java.nio.ByteOrder.LITTLE_ENDIAN) | ||
| .asCharBuffer() | ||
| .toString(); | ||
| } | ||
|
|
||
| @CPointerTo(nameOfCType = "wchar_t") | ||
| private interface WCharPointer extends PointerBase {} | ||
|
|
||
| @CPointerTo(WCharPointer.class) | ||
| private interface WCharPointerPointer extends PointerBase { | ||
| WCharPointer read(int index); | ||
| } | ||
|
|
||
| @CFunction | ||
| private static native WCharPointer GetCommandLineW(); | ||
|
|
||
| @CFunction | ||
| private static native WCharPointerPointer CommandLineToArgvW( | ||
| WCharPointer cmdLine, CIntPointer numArgsOut); | ||
|
|
||
| @CFunction | ||
| private static native int wcslen(WCharPointer str); | ||
|
|
||
| @CFunction | ||
| private static native void LocalFree(PointerBase p); | ||
|
|
||
| static final class Directives implements CContext.Directives { | ||
| @Override | ||
| public boolean isInConfiguration() { | ||
| return Platform.getOperatingSystem().isWindows(); | ||
| } | ||
|
|
||
| @Override | ||
| public List<String> getHeaderFiles() { | ||
| return List.of("<windows.h>", "<wchar.h>"); | ||
| } | ||
|
|
||
| @Override | ||
| public List<String> getLibraries() { | ||
| return List.of("Kernel32", "Shell32"); | ||
| } | ||
| } | ||
| } |
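The `toJavaString` helper in the diff above reads a null-terminated `wchar_t*` as little-endian 16-bit units. A rough plain-JVM sketch of that decoding step follows. It is not the PR's code: `WCharDecodeSketch` is a hypothetical name, a `byte[]` stands in for native memory, and the local `wcslen` is an illustrative reimplementation rather than the C function.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; a byte[] stands in for the native wchar_t buffer.
final class WCharDecodeSketch {
  // Windows wchar_t is 2 bytes wide.
  private static final int WCHAR_SIZE = 2;

  /** Counts the 16-bit units before the first 0x0000 terminator, like C's wcslen. */
  static int wcslen(byte[] raw) {
    int units = 0;
    while ((units + 1) * WCHAR_SIZE <= raw.length
        && (raw[units * WCHAR_SIZE] != 0 || raw[units * WCHAR_SIZE + 1] != 0)) {
      units++;
    }
    return units;
  }

  /** Views the bytes up to the terminator as little-endian UTF-16 code units. */
  static String toJavaString(byte[] raw) {
    return ByteBuffer.wrap(raw, 0, wcslen(raw) * WCHAR_SIZE)
        .order(ByteOrder.LITTLE_ENDIAN)
        .asCharBuffer()
        .toString();
  }

  public static void main(String[] args) {
    String original = "\u4f7f\u7528\u8005.enso"; // 使用者.enso
    byte[] text = original.getBytes(StandardCharsets.UTF_16LE);
    byte[] raw = new byte[text.length + WCHAR_SIZE]; // trailing zero bytes act as the terminator
    System.arraycopy(text, 0, raw, 0, text.length);
    System.out.println(original.equals(toJavaString(raw))); // true
  }
}
```

Because a Java `char` is itself a UTF-16 code unit, no charset conversion is involved here; the bytes only need to be read in the right byte order.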
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,42 +1,71 @@ | ||
| package org.enso.os.environment.chdir; | ||
|
|
||
| import java.nio.ByteOrder; | ||
| import java.nio.charset.StandardCharsets; | ||
| import java.util.List; | ||
| import org.enso.common.Platform; | ||
| import org.graalvm.nativeimage.UnmanagedMemory; | ||
| import org.graalvm.nativeimage.c.CContext; | ||
| import org.graalvm.nativeimage.c.function.CFunction; | ||
| import org.graalvm.nativeimage.c.type.CCharPointer; | ||
| import org.graalvm.nativeimage.c.type.CTypeConversion; | ||
| import org.graalvm.word.PointerBase; | ||
| import org.slf4j.Logger; | ||
| import org.slf4j.LoggerFactory; | ||
|
|
||
| @CContext(WindowsWorkingDirectory.Directives.class) | ||
| final class WindowsWorkingDirectory extends WorkingDirectory { | ||
|
|
||
| static final WindowsWorkingDirectory INSTANCE = new WindowsWorkingDirectory(); | ||
| private static final Logger LOGGER = LoggerFactory.getLogger(WindowsWorkingDirectory.class); | ||
|
|
||
| // WChars in windows are 2 bytes | ||
| private static final int WCHAR_SIZE = 2; | ||
|
|
||
| // Windows MAX_PATH is 260 characters; extended-length paths can be up to 32,767 wide characters | ||
| private static final int MAX_LENGTH = 32767; | ||
|
|
||
| private static String wcharPtrAsString(PointerBase buffer, int length) { | ||
| return CTypeConversion.asByteBuffer(buffer, length * WCHAR_SIZE) | ||
| .order(ByteOrder.LITTLE_ENDIAN) | ||
| .asCharBuffer() | ||
| .toString(); | ||
| } | ||
|
|
||
| private static PointerBase stringAsWCharPtr(String input) { | ||
| var bytes = input.getBytes(StandardCharsets.UTF_16LE); | ||
| var buffer = UnmanagedMemory.malloc(bytes.length + 2); | ||
| CTypeConversion.asByteBuffer(buffer, bytes.length + 2) | ||
| .order(ByteOrder.LITTLE_ENDIAN) | ||
| .put(bytes) | ||
| .put(new byte[] {0, 0}); | ||
| return buffer; | ||
| } | ||
|
|
||
| @Override | ||
| public String currentWorkingDir() { | ||
| - byte[] buf = new byte[4096]; | ||
| - String path; | ||
| - try (var ptrHolder = CTypeConversion.toCBytes(buf)) { | ||
| - var ptr = ptrHolder.get(); | ||
| - var ret = GetCurrentDirectoryA(4096, ptr); | ||
| - if (ret == 0) { | ||
| - LOGGER.error("GetCurrentDirectory failed with {}", ret); | ||
| + var buffer = UnmanagedMemory.malloc(MAX_LENGTH * WCHAR_SIZE); | ||
| + try { | ||
| + int length = GetCurrentDirectoryW(MAX_LENGTH, buffer); | ||
| + if (length == 0 || length == MAX_LENGTH) { | ||
| + LOGGER.error("GetCurrentDirectory failed with length {}", length); | ||
| return null; | ||
| - } else { | ||
| - path = new String(buf); | ||
| } | ||
| | ||
| + var result = wcharPtrAsString(buffer, length); | ||
| + LOGGER.debug("Current working directory is {}", result); | ||
| + return result; | ||
| + } finally { | ||
| + UnmanagedMemory.free(buffer); | ||
| } | ||
| - return path.trim(); | ||
| } | ||
|
|
||
| @Override | ||
| public boolean changeWorkingDir(String path) { | ||
| path = normalizeSlashes(path); | ||
|
|
||
| - try (var cPath = CTypeConversion.toCString(path)) { | ||
| - var res = SetCurrentDirectoryA(cPath.get()); | ||
| + var buffer = stringAsWCharPtr(path); | ||
| + try { | ||
| + var res = SetCurrentDirectoryW(buffer); | ||
| if (res == 0) { | ||
| LOGGER.error("SetCurrentDirectory to {} failed with {}", path, res); | ||
| return false; | ||
|
|
@@ -45,6 +74,8 @@ public boolean changeWorkingDir(String path) { | |
| } catch (Throwable t) { | ||
| LOGGER.error("Cannot change working directory to " + path + " on Windows", t); | ||
| throw t; | ||
| } finally { | ||
| UnmanagedMemory.free(buffer); | ||
| } | ||
| } | ||
|
|
||
|
|
@@ -53,12 +84,15 @@ public boolean exists(String dir, String file) { | |
| dir = normalizeSlashes(dir); | ||
| file = normalizeSlashes(file); | ||
| var full = dir + Platform.separatorChar() + file; | ||
| - try (var cPath = CTypeConversion.toCString(full)) { | ||
| - var res = PathFileExistsA(cPath.get()); | ||
| + var buffer = stringAsWCharPtr(full); | ||
| + try { | ||
| + var res = PathFileExistsW(buffer); | ||
|
Member (review comment): Yes, let's use
| return res != 0; | ||
| } catch (Throwable t) { | ||
| LOGGER.error("Cannot check if {} exists on Windows", full, t); | ||
| return false; | ||
| } finally { | ||
| UnmanagedMemory.free(buffer); | ||
| } | ||
| } | ||
|
|
||
|
|
@@ -77,23 +111,23 @@ private static String normalizeSlashes(String path) { | |
| * docs</a> | ||
| */ | ||
| @CFunction | ||
| - static native int GetCurrentDirectoryA(int nBufferLength, CCharPointer lpBuffer); | ||
| + static native int GetCurrentDirectoryW(int nBufferLength, PointerBase lpBuffer); | ||
|
|
||
| /** | ||
| * <a | ||
| * href="https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-setcurrentdirectory">Official | ||
| * docs</a> | ||
| */ | ||
| @CFunction | ||
| - static native int SetCurrentDirectoryA(CCharPointer lpPathName); | ||
| + static native int SetCurrentDirectoryW(PointerBase lpPathName); | ||
|
|
||
| /** | ||
| * <a | ||
| * href="https://learn.microsoft.com/en-us/windows/win32/api/shlwapi/nf-shlwapi-pathfileexistsa">Official | ||
| * docs</a> | ||
| */ | ||
| @CFunction | ||
| - static native int PathFileExistsA(CCharPointer pszPath); | ||
| + static native int PathFileExistsW(PointerBase pszPath); | ||
|
|
||
| static final class Directives implements CContext.Directives { | ||
| @Override | ||
|
|
||
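The `stringAsWCharPtr` helper in the diff above hands the `W` functions a UTF-16LE buffer with a two-byte terminator. The layout can be sketched on the plain JVM as below. This is an illustration under assumptions, not the PR's code: `WCharEncodeSketch` and `stringAsWChars` are hypothetical names, and a heap `ByteBuffer` stands in for `UnmanagedMemory.malloc`, so there is nothing to free here, whereas the real buffer must be released as the `finally` blocks in the diff do.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; a heap buffer stands in for unmanaged memory.
final class WCharEncodeSketch {
  // Windows wchar_t is 2 bytes wide, so the terminator is two zero bytes.
  private static final int WCHAR_SIZE = 2;

  /** Encodes the input as UTF-16LE followed by a 0x0000 terminator. */
  static ByteBuffer stringAsWChars(String input) {
    byte[] bytes = input.getBytes(StandardCharsets.UTF_16LE);
    // The byte order setting does not affect single-byte puts; it is kept to mirror the diff.
    ByteBuffer buffer =
        ByteBuffer.allocate(bytes.length + WCHAR_SIZE).order(ByteOrder.LITTLE_ENDIAN);
    buffer.put(bytes).put(new byte[] {0, 0});
    buffer.flip(); // make the written bytes readable from position 0
    return buffer;
  }

  public static void main(String[] args) {
    ByteBuffer encoded = stringAsWChars("\u4f7f\u7528\u8005"); // 使用者
    System.out.println(encoded.remaining()); // 8: three 2-byte units plus the terminator
  }
}
```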
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -66,6 +66,19 @@ public void changeDir() throws IOException { | |
| assertEquals(tmpDirAbs, curDir); | ||
| } | ||
|
|
||
| @Test | ||
| public void changeDirUnicode() throws IOException { | ||
| var tmpDir = TMP_DIR.newFolder().toPath(); | ||
| var subDir = tmpDir.resolve("使用者"); | ||
|
Member (review comment): Nice test!
| var dirCreated = subDir.toFile().mkdir(); | ||
| assertTrue(dirCreated); | ||
| var subDirAbs = subDir.toAbsolutePath().toRealPath().toString(); | ||
| var succeeded = nativeApi.changeWorkingDir(subDirAbs); | ||
| assertTrue(succeeded); | ||
| var curDir = nativeApi.currentWorkingDir(); | ||
| assertEquals(subDirAbs, curDir); | ||
| } | ||
|
|
||
| @Test | ||
| public void changeDir_NonExistingDir() throws IOException { | ||
| var tmpDir = TMP_DIR.newFolder().toPath(); | ||
|
|
||
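The `changeDirUnicode` test above uses a directory named `使用者`. The `A` variants of Win32 functions interpret strings in the active ANSI code page, so a name like this can be lost unless that code page happens to cover it, while the `W` variants take UTF-16. A minimal sketch of the difference, assuming ISO-8859-1 as a stand-in for a narrow code page (the real code page depends on the Windows system locale; `NarrowVsWideSketch` and `roundTrip` are hypothetical names):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; ISO-8859-1 stands in for a narrow ANSI code page.
final class NarrowVsWideSketch {
  /** Encodes the text with the given charset and decodes it again. */
  static String roundTrip(String text, Charset charset) {
    return new String(text.getBytes(charset), charset);
  }

  public static void main(String[] args) {
    String dir = "\u4f7f\u7528\u8005"; // 使用者
    // Unmappable characters are replaced with '?', so the name does not survive.
    System.out.println(dir.equals(roundTrip(dir, StandardCharsets.ISO_8859_1))); // false
    // UTF-16LE can represent every Java char, so the name survives unchanged.
    System.out.println(dir.equals(roundTrip(dir, StandardCharsets.UTF_16LE))); // true
  }
}
```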
What if we run on a Windows installation that doesn't have powershell?
It's been built into Windows since Windows 7, I think, so it should be OK to assume it's there.