@dra27 dra27 commented Feb 14, 2023

While working on this, there are many things which I have landed on in passing, and which are included in this combined PR as follow-up commits. These 22 proposed PRs stand alone, except that 5 requires 4; many of them are single commits.

Bytecode executable launcher enhancements and bug fixes

Relocatable OCaml - in particular PR#runtime-searching - involved a lot of rework to stdlib/header.c (including #13988). There are various other issues which I have addressed while working on Relocatable OCaml, but which are not critical parts of it:

1: Support paths longer than MAX_PATH in the Windows bytecode executable launcher (#216)

The Windows API has a very long-standing limitation of 260 characters (including the NUL terminator) for paths, defined in the MAX_PATH constant. Although there have always been low-level mechanisms for working around this, some support was added in Windows 10 for making this more readily available in higher-level API functions.

This PR doesn't enable this support directly, but it performs the necessary refactoring in stdlib/header.c to switch from using MAX_PATH to a more Posix-like PATH_MAX. The change boils down to using malloc (except that stdlib/header.c doesn't have access to the CRT in the Windows implementation so uses the underlying HeapAlloc directly). The first commit is a mechanical refactoring which makes the second commit slightly simpler.

2: Move caml_search_dll_in_path to dynlink.c (#218)

In the midst of various refactoring related to stdlib/header.c and checking the memory allocation in runtime/startup_byt.c, this small piece of duplication stuck out a little. caml_search_dll_in_path is only used in runtime/dynlink.c and thanks to all the char_os, etc., added in #1200 the two implementations in unix.c and win32.c can be trivially unified with just an OS-specific define for the extension of a DLL.

3: Handle parasitic argc values in header.c (#220)

I've spent far too much of the last months looking at how argv et al work with respect to bytecode launching. In passing, it is possible to exec a program with no arguments (including no argv[0], therefore). Given what header.c does with argv[0], it does seem better to ensure it's actually valid.
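As a rough illustration of the kind of check involved (not the code in this PR; the function name is hypothetical), a launcher can refuse to trust argv[0] unless it is actually present and non-empty:

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 when argv[0] can be used as a program
   name and 0 otherwise. */
int sketch_argv0_is_usable(int argc, char *const argv[])
{
  /* A program can be exec'd with an empty argument vector, in which case
     argc is 0 and argv[0] is NULL. */
  if (argc < 1 || argv == NULL || argv[0] == NULL)
    return 0;
  /* Assumption of this sketch: an empty string is rejected too, since it
     cannot name an executable. */
  return argv[0][0] != '\0';
}
```

A caller would test this before doing anything with argv[0] and fail with a clear message otherwise.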

4: Remove unnecessary Cygwin path workarounds (#221)

The PATH-searching code in both the runtime and stdlib/header.c contains some convoluted and entirely correct-at-the-time code to deal with confusions which arise with the .exe extension in Cygwin.
Cygwin executables do not require the .exe extension to be included for a file to be executable but, in order to allow them to be executed from the Windows Command Processor, they are typically installed with .exe extensions. Originally, Cygwin made it such that stat and the exec functions magically added .exe when needed, but file I/O operations did not. This causes no end of confusion, since both foo and foo.exe can exist. It was all the more painful for OCaml bytecode launching, as ocamlrun needs to read the executable, not exec it.
However, Cygwin 1.5.20 (July 2006) added the transparent_exe option to the CYGWIN environment variable which made open behave in the same way as stat, at least mitigating some of the confusion. Cygwin 1.7.1 (December 2009 and, despite the version number, the first release of Cygwin 1.7) made this behaviour default (and removed the ability to turn it off).
The Cygwin-specific part of the code here never actually triggers today, therefore - the .exe will be automatically added in cygwin_file_exists and the code for adding .exe will never trigger.

5: Share Unix path-search code between runtime and header.c (#222)

Both the runtime and stdlib/header.c contain an implementation of PATH lookup. In the midst of all the various refactorings, and in particular because I want to share the caml_executable_name implementation, this PR removes the duplication by linking stdlib/header.c with libcamlrun.a, but ensuring that the PATH-searching function is in an object of its own, so it becomes the only thing which is linked. The only fractionally nefarious trick is that stdlib/header.c has to implement caml_stat_alloc.
In the interests of simplicity, this change is predicated on the previous PR for removal of the pre-Cygwin 1.7 workaround for .exe lookup.

6: Use caml_executable_name in header.c (#223)

Sys.executable_name was added in OCaml 3.05, initially with the only enhanced support being the Linux-based /proc/self/exe, with Windows support added in OCaml 4.02 and macOS support from 4.05 in #795.
The bytecode executable header in stdlib/header.c has always used the same underlying API function as caml_executable_name for the Windows implementation, but the less-used Unix implementation has always used a PATH-search. This can be a problem - exercised in tests in #14014 - on Cygwin, where the Unix executable header is the default for user-created executables.
This PR uses the same technique as 5 to share caml_executable_name with stdlib/header.c. This has a modest security-hardening effect, as it means that having exec'd a tendered bytecode executable, platforms where caml_executable_name is available definitely read that executable, and cannot be directed by abuse of argv to read something else (the security implication should not be over-blown: using tendered bytecode executables in a permissions-sensitive context is already bad, and if bytecode must be used, -output-complete-exe should be preferred). At present Cygwin doesn't actually implement caml_executable_name, but that is a separate piece of future work to add caml_executable_name across the board, as PR#enable-relative very much requires that too.
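For readers unfamiliar with the /proc/self/exe mechanism mentioned above, here is a minimal sketch of asking the kernel for the executable's path instead of searching PATH for argv[0]. It is not the runtime's caml_executable_name: the function name is hypothetical, only Linux is covered, and none of the runtime's extra validation is attempted.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sketch: returns a malloc'd path for the running executable,
   or NULL if it cannot be determined. */
char *sketch_executable_name(void)
{
  size_t size = 256;
  for (;;) {
    char *buf = malloc(size);
    if (buf == NULL) return NULL;
    ssize_t len = readlink("/proc/self/exe", buf, size);
    if (len < 0) { free(buf); return NULL; }
    if ((size_t)len < size) {
      /* readlink does not NUL-terminate its result. */
      buf[len] = '\0';
      return buf;
    }
    /* The result filled the buffer and may be truncated: grow and retry. */
    free(buf);
    if (size > ((size_t)-1) / 2) return NULL;
    size *= 2;
  }
}
```

Because the answer comes from the kernel, it does not depend on what the parent process chose to put in argv[0].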

7: Fix hand-off of bytecode image from header.c to ocamlrun (#224)

The bytecode executable launcher (stdlib/header.c) is always used on native Windows instead of shebang (#!) lines/scripts. It is also used for executables produced by ocamlc on the Cygwin port.
On both Unix and Windows, there are two different, but related, ways in which this can fail.
The bytecode executable launcher (stdlib/header.c) performs two functions: it has to read the RNTM bytecode section from itself to find out how to locate the interpreter (ocamlrun) and then it has to instruct that interpreter to execute the same image.
On Windows, one can observe this behaviour from any installation of OCaml which is in Path:

C:\Users\DRA>ocamlc.byte.exe -v
The OCaml compiler, version 5.3.1
Standard library directory: C:\Users\DRA\AppData\Local\opam\ocaml-5.3\lib\ocaml

C:\Users\DRA>ocamlc.byte -v
no bytecode file specified

On Unix, the story is a little longer:

$ ./configure --prefix "$PWD/install" --with-target-sh=exe && make -j && make install
$ echo 'print_endline Sys.argv.(0)' > foo.ml
$ PATH="install/bin:$PATH" ocamlc -o foo.byte foo.ml
$ PATH="install/bin:$PATH" ocamlopt -o foo.opt foo.ml
$ ./foo.opt ; ./foo.byte
./foo.opt
./foo.byte
$ ( exec -a bar ./foo.opt ) ; ( exec -a bar ./foo.byte )
bar
bar not found or is not a bytecode executable file

The Unix behaviour arises because stdlib/header.c uses argv[0] to locate itself, and fails to find bar. Windows uses GetModuleFileNameW, rather than the command line. If the Unix version is altered to do the equivalent (as in 6 using caml_executable_name), then we instead see:

$ ./foo.opt ; ./foo.byte
./foo.opt
/home/dra/relocatable/ocaml/foo.byte
$ ( exec -a bar ./foo.opt ) ; ( exec -a bar ./foo.byte )
bar
/home/dra/relocatable/ocaml/foo.byte

which is not quite the same yet. This behaviour arises because having resolved where the bytecode executable is, stdlib/header.c then clobbers argv[0] with it. If we remove this line from stdlib/header.c:

  argv[0] = truename;

then we see something similar to the Windows error:

$ ./foo.opt ; ./foo.byte
./foo.opt
./foo.byte
$ ( exec -a bar ./foo.opt ) ; ( exec -a bar ./foo.byte )
bar
no bytecode file specified

The error in both cases is because while the handoff from stdlib/header.c to the runtime worked, ocamlrun then could not determine where the bytecode image to load is.

On Windows, when running ocamlc.byte (rather than ocamlc.byte.exe), the Command Processor correctly resolves ocamlc.byte to C:\Users\DRA\AppData\Local\opam\ocaml-5.3\bin\ocamlc.byte.exe and runs it (but note that argv[0] will still be ocamlc.byte). stdlib/header.c then loads itself (via GetModuleFileNameW), reads RNTM and hands off to C:\Users\DRA\AppData\Local\opam\ocaml-5.3\bin\ocamlrun.exe with the same command line. ocamlrun's start-up routines will then pass ocamlc.byte (which is argv[0]) to SearchPathW. Because ocamlc.byte contains a ., SearchPathW will not append .exe and fails. ocamlrun then assumes it's being invoked as ocamlrun and doesn't recognise the -v argument, hence the message about no bytecode file specified. There are two even stranger observations about this behaviour:

  1. The Command Processor does not exhibit it, but it is reasonable to assume that it does not use SearchPathW because the search operation of the Command Processor is more complex than that function allows (in particular, SearchPathW doesn't implement the PATHEXT environment variable).
  2. If one specifies an absolute path (i.e. C:\Users\DRA\AppData\Local\opam\relocatable-5.3\bin\ocamlc.byte) then SearchPathW does appear to add the .exe extension. This appears to be undocumented behaviour of that function, bordering on a bug (albeit a useful one).

The Unix behaviour with exec -a is the same fundamental problem - ocamlrun cannot locate the bytecode image based on argv[0].

I am fairly sure that the Windows version of this bug has been seen in the wild - I think it forms the basis of some of the hearsay of "needing to add .exe" (along with Cygwin-based complexity). The Unix manifestation of this is just a wart, creating a difference between bytecode and native code (in passing, shebang scripts can never manipulate argv[0], so in this regard the executable header can obscurely be useful on Unix...).

I spotted the Windows error several years ago randomly at the terminal simply because I ran ocamlc.byte and was surprised by the answer! I strongly think this should be fixed, but the fix itself is involved.

The first two commits are a mildly hairy refactoring of caml_attempt_open. Previously, caml_attempt_open took a parameter name which it passed to caml_search_exe_in_path. This function allocates, and on success the pointer is updated to point to that freshly allocated string (corresponding to the actual file opened). The use of a pointer here (I think) is somewhat confusing, since on success the caller is potentially responsible for freeing the memory addressed by the original value of the pointer. Indeed, prior to #13728, there was a memory leak for -custom executables, since the string returned by caml_executable_name was never freed (this became moot in #13728 since proc_self_exe is now kept). The change therefore is to move the obligation for calling caml_search_exe_in_path to the caller, which allows caml_attempt_open just to perform validation on opening an already-resolved file. In updating this, it looks like CAML_DEBUG_FILE was performing a search in PATH which almost certainly wasn't intended - i.e. caml_attempt_open was used without realising/remembering that it may not open the file specified.

This simpler version of caml_attempt_open makes reasoning about the main bug-fix much easier.

The next commit then delves deep into the mysteries of process execution on Windows. There's a lovely blog post on this - of particular note is the fact it's more than 20 years old. This change is also what led to #13879 and #13921. The key thing is that when calling exec functions in the Microsoft CRT on Windows, open file descriptors (that's C fds, not HANDLEs) are inherited by the child. In other words, if fd 3 is open when exec is called in a Windows program, fd 3 will be open in the exec'd process...

... unless that process happens to be an OCaml bytecode executable! The fix for this obscure piece of behaviour is a fairly simple tweak in stdlib/header.c which - in combination with that blog post - then leads to the main bug fix at hand. CreateProcessW has to be passed a STARTUPINFOW structure, which we duly initialise with default values. The fix is that instead of creating a new STARTUPINFOW structure, we use GetStartupInfoW to retrieve ours and use that instead - i.e. in just the same way that stdlib/header.c passes on the command line it received to ocamlrun it also passes on the start-up information it received. This has other benefits - for example, if a bytecode program was instructed to start hidden via dwFlags, etc., this would now be honoured when ocamlrun is invoked, etc.

How is this related to CRT fds? Because cbReserved2 and lpReserved2 are in fact neither as reserved nor as "must be 0 / NULL" as the documentation implies. cbReserved2 is in fact the minimum size of the buffer pointed to by lpReserved2 (which is permitted to be up to 64KiB in total). The Microsoft CRT, by long-standing convention, sets cbReserved2 but never checks it (this fact has been capitalised on by Cygwin since the dawn of time). If lpReserved2 is non-NULL, the first 4 bytes are an int n indicating the number of CRT fds present. Following that are n unsigned chars containing (opaque) flags used for CRT accounting (e.g. whether the fd is opened in text mode, etc.). Following that are n HANDLE values (which are pointers) for the underlying OS handle. The value of the fd is determined by its position - i.e. stdin first, etc. Any fds which aren't in use will have 0 for the flags value and, more importantly, INVALID_HANDLE_VALUE for the underlying Win32 HANDLE (i.e. if the process is inheriting fd 3, it must have slots for the standard handles as well).
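To make the layout concrete, here is a portable sketch which builds such a block in memory. It is illustrative only and is not the code in stdlib/header.c: the names are hypothetical, intptr_t stands in for HANDLE so that it compiles without windows.h, and it assumes one flag byte per fd.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins so that the sketch compiles without <windows.h>. */
typedef intptr_t sketch_handle;
#define SKETCH_INVALID_HANDLE ((sketch_handle)-1)

/* Size in bytes of a block describing n fds: an int count, n flag bytes
   (assumed to be one byte each) and n handles. */
size_t sketch_fd_block_size(int n)
{
  return sizeof(int) + (size_t)n * (1 + sizeof(sketch_handle));
}

/* Builds a block with n slots in which only slot fd is populated. The
   slots for the standard handles exist but are left unused here.
   Returns a malloc'd buffer, or NULL on error. */
unsigned char *sketch_fd_block(int n, int fd, unsigned char flags,
                               sketch_handle h)
{
  if (n <= 0 || fd < 0 || fd >= n) return NULL;
  unsigned char *block = malloc(sketch_fd_block_size(n));
  if (block == NULL) return NULL;
  unsigned char *flag_area = block + sizeof(int);
  unsigned char *handle_area = flag_area + n;
  sketch_handle invalid = SKETCH_INVALID_HANDLE;
  memcpy(block, &n, sizeof(int));
  for (int i = 0; i < n; i++) {
    flag_area[i] = 0;
    /* The handle array is not necessarily aligned, hence memcpy. */
    memcpy(handle_area + (size_t)i * sizeof(sketch_handle),
           &invalid, sizeof invalid);
  }
  flag_area[fd] = flags;
  memcpy(handle_area + (size_t)fd * sizeof(sketch_handle), &h, sizeof h);
  return block;
}
```

On Windows the resulting pointer and size would be what goes into lpReserved2 and cbReserved2; the flag values themselves are opaque, so this sketch simply takes them as a parameter.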

Now, finally, we can get to the bug fix. The problem we have is that ocamlrun doesn't know where the bytecode image is - but stdlib/header.c absolutely did, in fact it's even opened it in order to read RNTM to be able to invoke the runtime! The fix, fundamentally, is to share this fd with ocamlrun. On Unix, that's a trivial matter of not closing it; on Windows we can either initialise, or - for completeness - update lpReserved2 to pass the fd.

The final requirement is to communicate both the filename and the file descriptor number to ocamlrun. On Unix, this is easy - __OCAML_EXEC_FD is set to fd,filename (e.g. 3,/home/dra/ocaml/foo.byte) and this is decoded with sscanf in startup_byt.c and the environment variable is unset.
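A minimal sketch of decoding the fd,filename form with sscanf follows. It is an illustration under stated assumptions, not the code in startup_byt.c: the function name is hypothetical and the validation rules (non-negative fd, non-empty filename) are choices made for this sketch.

```c
#include <stdio.h>

/* Hypothetical decoder for a value of the form "fd,filename". On success,
   stores the descriptor in *fd and returns a pointer to the filename
   inside value; returns NULL if the value is malformed. */
const char *sketch_decode_exec_fd(const char *value, int *fd)
{
  int consumed = 0;
  if (value == NULL || fd == NULL) return NULL;
  /* %n records how many characters were consumed up to and including the
     comma; it is only written if the comma actually matched. */
  if (sscanf(value, "%d,%n", fd, &consumed) != 1 || consumed == 0)
    return NULL;
  if (*fd < 0 || value[consumed] == '\0') return NULL;
  return value + consumed;
}
```

Using %n avoids copying the filename: the returned pointer aliases the environment value, so a caller would need to duplicate it before unsetting the variable.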

There are two problems with this approach on Windows: firstly, the environment block is smaller, so passing filenames is a potential worry. Secondly, to considerably reduce its size, stdlib/header.c is not a CRT application. However, the Windows API comes to our rescue - while Posix does not have a function to get the filename from a file descriptor, Windows does have GetFinalPathNameByHandleW which we can combine with _get_osfhandle to recover the same filename as GetModuleFileNameW returned for stdlib/header.c. What we lack is the ability to format an integer as a string. However, also unlike Posix, on Windows we have the guarantee of being able to remove __OCAML_EXEC_FD from the environment. The largest fd permitted in the UCRT fits comfortably within a 16-bit integer and we know it'll never be 0 (stdin). The solution for Windows, therefore, is to pass the CRT fd number as a single UCS-2 codepoint corresponding to the fd number in __OCAML_EXEC_FD - with the certainty that this strange variable gets removed from the environment during startup.
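The single-codepoint encoding can be sketched portably as follows, with uint16_t standing in for a Windows wide character. The names are hypothetical, and rejecting the surrogate range is an extra precaution assumed for this sketch rather than something taken from the PR.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes fd as a one-character, NUL-terminated string of 16-bit units.
   Returns 0 on success and -1 if fd cannot be represented. */
int sketch_encode_fd(int fd, uint16_t out[2])
{
  /* 0 would be read as the terminator; 0xD800-0xDFFF are surrogates and
     are not valid stand-alone code points (assumption of this sketch). */
  if (fd <= 0 || fd >= 0xD800) return -1;
  out[0] = (uint16_t)fd;
  out[1] = 0;
  return 0;
}

/* Inverse of sketch_encode_fd: returns the fd, or -1 if malformed. */
int sketch_decode_fd(const uint16_t *value)
{
  if (value == NULL || value[0] == 0 || value[0] >= 0xD800 || value[1] != 0)
    return -1;
  return (int)value[0];
}
```

No integer formatting is needed on either side, which is the point: the encoder is a single store and the decoder a single load.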

Hairy - yes. Stable - also, yes. I checked my old CRTs - this logic goes right back to the dawn of Windows NT - it's never changing. Cygwin also uses it since, as the blog post also notes, you can make lpReserved2 bigger than cbReserved2 and pass whatever you like in lpReserved2, as long as the first four bytes (the fd count) are left low enough (or 0) so that the CRT ignores it. This approach should also be being used for Unix.create_process, where I am all but certain the lack of doing this is responsible for bugs when inheriting the standard handles between processes (because while OCaml-land correctly inherits the HANDLE values, CRT-land has the wrong opaque flags for stdin, stdout and stderr).

Environment variable handling

A key goal of Relocatable OCaml is mitigating the effect that one compiler installation is able to have on another, especially where environment variables are concerned. These three PRs arise from auditing all of our environment variables.

8: Document and clarify handling of set-but-empty environment variables (#225)

Unix and - contrary to generally received wisdom - Windows distinguish an environment variable which is set, but to the empty string, from one which is unset. On Windows, these are incredibly difficult to manipulate. On Unix, they are really quite difficult to manipulate (env -u is not portable, for example).

This PR, for portability and ease, updates the interpretation of various environment variables so that we consistently treat "set but null" as "unset". The changes are easily reviewed commit-by-commit.

9: Harden processing of SOURCE_DATE_EPOCH in ocamldoc (#226)

Previously, running SOURCE_DATE_EPOCH= ocamldoc resulted in an uncaught Failure "float_of_string" exception. The processing of SOURCE_DATE_EPOCH is firstly hardened to cope with parsing errors and then a one-time warning is displayed the first time it's actually used (at present it's only required in Odoc_man). This variable isn't ignored when empty, because it's part of reproducible builds, and the fact that there is a warning/error seems appropriate, just not the error which previously displayed.

10: Overhaul handling of empty components in PATH-like environment variables (#227)

Setting OCAMLLIB to an empty string has the somewhat annoying effect of breaking the distribution:

$ OCAMLLIB= ocaml
File "command line", line 1:
Error: Unbound module Stdlib

Not necessarily an obvious thing to do, but it makes running the compiler in a cleaned environment trickier (e.g. https://github.com/ocaml/opam/blob/89b95a1c50d5df8e5bdcd98395e01aef6e50741c/Makefile#L55).

The first commit in this PR ignores OCAMLRUNPARAM, OCAMLLIB and CAMLLIB if they are set, but to the empty string. For OCAMLRUNPARAM, this has the minor consequence that CAMLRUNPARAM will be queried, for anyone who is still using Caml Light at the same time as OCaml. More importantly, while an empty value for OCAMLLIB also now causes CAMLLIB to be checked, it means that if CAMLLIB is not set at all (as is likely) then the compiler's default value will be used. This is - as with other treatment of empty environment variables - more consistent and is almost certainly what the user intended anyway.
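The lookup order described here can be sketched in C as below. This is a hedged illustration rather than the compiler's or runtime's actual code: all three function names are hypothetical, and compiled_default stands for whatever default directory the compiler was configured with.

```c
#include <stdlib.h>

/* Maps "set but empty" to "unset". */
const char *sketch_nonempty(const char *value)
{
  return (value != NULL && value[0] != '\0') ? value : NULL;
}

/* Returns the first candidate which is set and non-empty, else dflt. */
const char *sketch_first_nonempty(const char *a, const char *b,
                                  const char *dflt)
{
  if (sketch_nonempty(a) != NULL) return a;
  if (sketch_nonempty(b) != NULL) return b;
  return dflt;
}

/* OCAMLLIB, then CAMLLIB, then the built-in default. */
const char *sketch_stdlib_dir(const char *compiled_default)
{
  return sketch_first_nonempty(getenv("OCAMLLIB"), getenv("CAMLLIB"),
                               compiled_default);
}
```

With this shape, OCAMLLIB= ocaml behaves exactly as if OCAMLLIB had never been set.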

The next commit provides similar treatment to CAML_LD_LIBRARY_PATH and OCAMLTOP_INCLUDE_PATH, but in this case it is also to ignore empty components. In particular, at present setting CAML_LD_LIBRARY_PATH to an empty string causes the runtime to add the current directory . to the search path, which is almost certainly never what was actually intended (this also happens because opam at present reverts environment variable changes by setting them to the empty string rather than unsetting them). The current directory can of course be explicitly added to both search paths by specifying ., but these two variables exhibiting the legacy Posix interpretation of PATH seems like an error. As the commit message notes, the handling was also inconsistent between platforms, which made the test-in-prefix tests for this hilarious (and all of which gets deleted with this change).
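Ignoring empty components amounts to something like the following sketch. It is not the runtime's path-splitting code: the function and callback names are hypothetical, and the separator is passed in (':' on Unix, ';' on Windows).

```c
#include <stddef.h>
#include <string.h>

/* Callback invoked for each component; start is not NUL-terminated. */
typedef void (*sketch_visit)(const char *start, size_t length, void *data);

/* Calls visit (if non-NULL) for each non-empty component of path and
   returns the number of such components. Empty components, which the
   legacy Posix interpretation would read as ".", are skipped. */
size_t sketch_each_component(const char *path, char sep,
                             sketch_visit visit, void *data)
{
  size_t count = 0;
  if (path == NULL || sep == '\0') return 0;
  for (;;) {
    const char *end = strchr(path, sep);
    size_t len = (end != NULL) ? (size_t)(end - path) : strlen(path);
    if (len > 0) {
      if (visit != NULL) visit(path, len, data);
      count++;
    }
    if (end == NULL) break;
    path = end + 1;
  }
  return count;
}
```

An empty value therefore yields no search directories at all, while an explicit . is still honoured as an ordinary component.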

Miscellany

11: Add -set-runtime-default to the compilers (#186)

This PR was worked on jointly with @MisterDA and extends the -set-runtime-default option added as part of PR#enable-relative. In an earlier version of Relocatable OCaml, this was the implementation of -set-runtime-default for Relocatable OCaml, but it became clearer to me when integrating it that it was better for this work to build on Relocatable OCaml than the other way around.

The core premise of this PR is to be able to change the default values of the runtime parameters (specified in OCAMLRUNPARAM). This, for example, allows a program to be compiled with a different initial size of the minor heap, or with randomized hashtables enabled by default. These parameters may still be overridden by the user specifying OCAMLRUNPARAM themselves, but it provides a coherent mechanism for an executable to have its own default values. The implementation supports all of the output mechanisms across bytecode and native code (which is tested).

The first commit cleans up a minor mistake in ocaml-multicore/ocaml-multicore#694 (which was one of the many preparatory diff-reduction PRs leading to #10831). Some logic had been lost in the bytecode startup, including the variables print_magic and print_config - however, as these are static to startup_byt, rather than global, they don't need to be put into the heavier-weight struct caml_params. The commit is polished with @MisterDA's fastidious attention to modern C standards.

When Sys.runtime_parameters was added in OCaml 4.03, a note was added in the code that it doesn't include R, since that wasn't processed in the C part of the runtime. The next commit finally addresses this - it is slightly fiddly, given the atomics. The next commit makes the order of the values returned by caml_runtime_parameters consistent.

-set-runtime-default is then added, using an unsurprisingly related implementation to that in PR#enable-relative. Runtime parameters are specified using their letter form, for example -set-runtime-default R turns on randomized hash tables by default. ocamlopt and ocamlc (when linking in "C" mode) implement this by creating a string caml_executable_ocamlrunparam which is then processed by the runtime using exactly the same machinery as for OCAMLRUNPARAM (which is both less code and easier to maintain - the additional work for new options is only updating the command line parser). For tendered bytecode executables, this string is instead put in a new ORUN section (cf. the OSLD section in PR#enable-relative). This new facility is then tested using test-in-prefix to set option R (which is observable in each of the compiled executables).

The last two commits (the horrors of which are entirely my own) extend the process of bytecode startup to allow specifying -set-runtime-default c (unloadable runtime) for tendered bytecode. The issue is that the C memory pool system can only be started up if everything that has been allocated with caml_stat_alloc has been freed with caml_stat_free. OCAMLRUNPARAM is processed very early in start-up, so the memory pool is running before any persistent allocations take place, but by the time ORUN has been read, there're already various buffers in play. The solution I've put in is to free and re-initialise everything if c=1 is not included in OCAMLRUNPARAM but is then found in ORUN. It works, and it's in a section of bytecode start-up which will hopefully sit largely unaltered after my year or so of bashing it around. I separated the implementation because -set-runtime-default c=1 works naturally for all the other linking modes, so we could choose to live with just prohibiting it for tendered bytecode. I'm not sure.

I haven't yet implemented it, but this mechanism could also be elegantly used on Windows to disable the automatic filename globbing (cf. https://github.com/ocaml/ocaml/issues/7473#issuecomment-473067359 and, much more recently, https://discuss.ocaml.org/t/15461). In particular, controlling this option with OCAMLRUNPARAM would be dangerous (or at least dodgy - programs expecting the runtime to have de-globbed shouldn't suddenly see them, etc.) but being able to issue -set-runtime-default noglob or some such and have an executable specify that could be very useful.

12: extern compatibility testing (#217)

During the tortuous process of finalising PR#runtime-searching, I temporarily had need of a version of output_value which would tell me if the value was loadable on a 32-bit system, as opposed to the COMPAT_32 flag which fails if the value isn't loadable on a 32-bit system.
This seems generally useful, so I've separated the implementation, even though the need for it disappeared. In its present form, it's a separate primitive, but I imagine it could be acceptable simply to change caml_output_value to return something other than unit?
This could also be trivially extended to return whether the value contains closures (i.e. just like CLOSURES, but return true if there actually were any) and, although less trivially, likewise for NO_SHARING.

13: coreboot without all the stripping (#215)

#340, in order to reduce the size of the boot artefacts, introduced tools/stripdebug.ml which copies a bytecode image, but removes the DBUG section. In #11149, I (ab)used this script further by also removing the CRCS section as part of making the bootstrap repeatable between Unix and Windows. In #12751, I abused this script still further by also stripping the RNTM section.
Until very recently, PR#enable-relative continued my cycle of artefact abuse and was going to add the OSLD section. But then I stopped. Enough, I said. Surely we can achieve the desired result without all this horrible filtered copying?
Well, funny you should ask.
For my first trick (commit), I adapt a little bit of #13745 and cleave the Meta module into two, despatching the parts only needed by the toplevel to its codebase, and the parts needed by Symtable there. Thanks to #11996, Dynlink already has its own copy of the bits which went to the toplevel. This change has an important consequence - it means that ocamlbytecomp.cma no longer contains any references to the caml_reify_bytecode primitive which is now only used in ocamltoplevel.cma and dynlink.cma.
Next, my handkerchief swiftly polishes bytelink.ml, introducing a link_files iterator which in a flash is transformed to a fold (this same trick is done in PR#enable-relative). This sleight of hand allows the bytecode linker to omit the CRCS section if no module being linked refers to caml_reify_bytecode. The connection here is that bytecode cannot be loaded without calling this primitive, and if bytecode isn't being dynamically loaded then there can't be a need to perform interface consistency checks against the running program, so the CRCS data is unnecessary.
This change is then combined with some build system loveliness, which is all a little easier to express thanks to the refactorings of @shindere over these last few years. In #11149, in order to make the bootstrap repeatable we accepted that boot/ocamlc has to be compiled with a different config.cmo (this is the config_main.cmo and config_boot.cmo part of the build).

  • The previous change ensures that CRCS will not be emitted for ocamlc (because it doesn't use Dynlink)
  • Since we have to link the bytecode image for boot/ocamlc with a different config.cmo, it really doesn't seem that bad to link it at that point without -g. In fact, we have to do that by using -no-g, but the effect is the same. We already have a clear marker with the IN_COREBOOT_CYCLE variable which is how the correct Config module is selected, so that then "removes" DBUG for us.
  • Finally, the -without-runtime option added in #2309 suppresses both the header and the RNTM section from being emitted.
    At which point, promote-cross goes back to just being a straight-forward cp.

Various bug fixes found writing #14014

14: Fix C library options for win32unix (#229)

For mostly historical reasons, the Windows build of the Unix library specifically specified -lws2_32 and -ladvapi32 in its link options. This hasn't been necessary for a long time, as these are both part of the default runtime options (I knew you'd want to know: since 3.11, thanks to the Windows implementation of ocamldebug; if you've keenly looked in the 3.11 tree and wondered why -ladvapi32 is missing from config/Makefile.mingw, then have a look at #12265).
This became a problem when working on #14014 as these two libraries get passed to the partial linker. While ocamlc knows not to pass the main C libraries to the partial linker, it doesn't distinguish the ones which come from .cma files.
This issue is indirectly fixed by 16, but it seemed worth fixing it properly as well. This PR removes those two options from unix.cma and unix.cmxa. However, they are explicitly required for unix.cmxs (because we're not linking a runtime then).

15: Build and install threads.cmxs (#230)

This was originally reported in #7625. ocamlnat performs a little trick if asked to load a .cmxa where it quickly produces a .cmxs file from it.
This doesn't work - almost uniquely - for systhreads, because the systhreads support stubs need to be compiled in either shared or static mode for Windows, in order to know how to access the pthreads functions in the runtime (from winpthreads). While this problem will hopefully disappear, it's strangely inconsistent - there's nothing wrong with loading threads with natdynlink (indeed, in OCaml 5.x, it's less weird than it used to be). This PR correctly compiles and installs threads.cmxs.
I haven't checked, but post #11996, I would imagine that it is possible to load compiler-libs as plugins (as it probably has been for a while) - not so sure about being able to use compiler-libs in ocamlnat, though.

16: Don't pass system libraries to ld -r (#235)

Related to 14, this tweaks the partial linker (-output-complete-obj) to stop system libraries from being passed to the linker at all. The adaptation in Ccomp is straight-forward: we look-up all the -l libraries using OCaml's lookup path (knowing that ld won't have one at all in partial linking mode) and if we can't resolve a library, we don't specify it at all.

17: Use entrypoint flexdll branch (#231)

This PR simply adopts ocaml/flexdll#146, which includes a full explanation of the problem. From our perspective, this allows unix.cmxs to be loadable in ocamlnat in Cygwin which, amusingly, has never worked.

18: Stop documenting bootstrap artefacts (#232)

#11149 added the generation of utils/config_main.ml and utils/config_boot.ml which is done to ensure that if fields are added to utils/config.generated.ml.in then they must be immediately added to utils/config.fixed.ml. The problem is that this caused Config_main and Config_boot to get added to the compiler-libs documentation, and we didn't notice 🫣
This PR fixes both the documenting and also the installing of these files. The easiest approach - given the risk of blowing command line lengths if we muck around with $(wildcard and friends - is simply to build those files in utils/config/ which excludes them from being installed and documented.

19: Use clang-cl for flexdll support objects (#233)

When we build with clang-cl, the flexlink support objects were still compiled with cl which got noticed by the relocation test in #14014. This PR fixes it.

20: Two minor clean-ups in the in-prefix-tests (#234)

One piece of dead code (from a very early version) and one blatantly wrong call in #14014 in an obscure code path.

Bytecode runtime tidying

21: Remove unused caml_cds_file (#219)

This micro-PR addresses a variable which should have been excised with #10831. The variable was replaced with caml_params->cds_file. It was spotted as a result of plumbing the refactoring of caml_attempt_open.

22: Improve error message when dynamic loading is unexpectedly unavailable (#228)

As part of Relocatable OCaml, I have spent too much time thinking about the portability of bytecode images between different runtimes, and also fine-tuning the bytecode startup process. This scenario is not entirely trivial to hit:

$ ./configure --prefix $PWD/install --disable-native-compiler && make -j && make install
$ make distclean
$ ./configure --prefix $PWD/install-static --disable-shared --disable-native-compiler && make -j && make install
$ echo 'print_endline (Unix.getcwd ())' > foo.ml
$ PATH=install/bin:$PATH ocamlc -o foo -I +unix unix.cma foo.ml
$ install-static/bin/ocamlrun ./foo
Fatal error: cannot load shared library dllunixbyt
Reason: dynamic loading not supported on this platform
Aborted (core dumped)

That message is largely correct, but similarly in the toplevel one can get the slightly more confusing:

$ install-static/bin/ocaml
# #directory "install/lib/ocaml/unix";;
# #load "unix.cma";;
Cannot load required shared library dllunixbyt.
Reason: dllunixbyt.so: dynamic loading not supported on this platform.

What is particularly odd here is that it refers to dllunixbyt.so (which has been mangled from -lunixbyt in the invalid unix.cma) but the message fundamentally refers to a DLL which isn't in scope for the installation.

On top of that, there's a lot of machinery which is unnecessarily started up in a static build before any error is displayed. This PR changes those error messages to be, in my opinion, somewhat less unclear, so that at least if one hits them, it's slightly clearer as to why:

$ install-static/bin/ocamlrun ./foo
the file './foo' requires shared libraries to be loaded, which this runtime does not support
$ install-static/bin/ocaml
# #directory "install/lib/ocaml/unix";;
# #load "unix.cma";;
File unix.cma requires a shared library to be loaded, which the runtime executing this toplevel does not support.

The way I have implemented it somewhat unexpectedly requires two bootstraps, but don't let that scare you. The idea is to display the message pre-emptively, before any attempt is made to load anything, which means the toplevel and Dynlink need to know if the runtime supports shared libraries. I've therefore added %shared_libraries to go with all the other runtime properties. That primitive has to be bootstrapped in before the toplevel can use it. For the runtime, I'm taking the presence of a DLLS section to mean that shared loading is required. At present, DLLS is always written even when it's empty, so the bytecode linker needs a trivial change to omit it in that case. However, that change does have to be bootstrapped, or the full distribution can't be built!
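To make the runtime side of that idea concrete, here is a minimal sketch in C, assuming the section names of the image are already available as an array. The function name check_dlls_supported and its signature are invented for illustration and are not the runtime's API; only the DLLS section name and the wording of the message come from the description above.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative only: name and signature are hypothetical, not the runtime's.
   Returns 1 if startup may proceed, or 0 after writing an explanation to
   msg. */
int check_dlls_supported(const char *image_name,
                         const char *const *section_names, size_t n_sections,
                         int supports_shared_libraries,
                         char *msg, size_t msg_len)
{
  size_t i;
  for (i = 0; i < n_sections; i++) {
    /* A DLLS section is taken to mean that shared libraries are required,
       which relies on the linker omitting the section when it is empty. */
    if (strcmp(section_names[i], "DLLS") == 0 && !supports_shared_libraries) {
      snprintf(msg, msg_len,
               "the file '%s' requires shared libraries to be loaded, "
               "which this runtime does not support", image_name);
      return 0;
    }
  }
  return 1;
}
```

The point of the sketch is the ordering: the decision is made from the image's table of contents alone, so no DLL name ever needs to appear in the error.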

23: Dynlink-related tidying following ld.conf changes (#185)

This is a follow-on clean-up from #12599 and related to work on both Relocatable OCaml and around #13745.
caml_dynlink_get_bytecode_sections is used by both the toplevel and dynlink to retrieve information obtained during bytecode startup without requiring the bytecode image to be reloaded. In particular, the table of primitive names and the shared library path remain. The normal runtime does not require these after startup, and they are now freed after the first call to caml_dynlink_get_bytecode_sections. Note that caml_ext_table_free resets the size field to zero, so an incorrect second call to caml_dynlink_get_bytecode_sections simply gets empty lists, rather than use-after-free.
While sorting this out, the actual table used for the primitive names can be simplified and now just uses a single buffer - prior to #13745, the code was a good deal more complex, and the ownership of the strings less clear, so it made sense to strdup the names. It's now perfectly fine to use the buffer for the section itself.
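The single-buffer approach and the "size reset on free" behaviour can be sketched roughly as follows. This is an assumption-laden illustration rather than the runtime's code: struct name_table, name_table_build and name_table_free are hypothetical stand-ins for the runtime's extensible table, and the section is assumed to be a run of consecutive NUL-terminated names.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the runtime's extensible table. */
struct name_table {
  size_t size;
  const char **names;  /* each entry points into the section buffer */
};

/* Index a buffer of consecutive NUL-terminated names without copying any
   of the strings. Returns 0 on success and -1 on error. */
int name_table_build(struct name_table *t, const char *section, size_t len)
{
  size_t count = 0, i;
  const char *p;
  t->size = 0;
  t->names = NULL;
  if (len == 0) return 0;
  if (section[len - 1] != '\0') return -1;  /* last name is unterminated */
  for (i = 0; i < len; i++)
    if (section[i] == '\0') count++;
  t->names = malloc(count * sizeof *t->names);
  if (t->names == NULL) return -1;
  for (p = section; p < section + len; p += strlen(p) + 1)
    t->names[t->size++] = p;
  return 0;
}

/* Free the index and reset the size, so that a later (incorrect) traversal
   sees an empty table rather than dangling pointers. */
void name_table_free(struct name_table *t)
{
  free(t->names);
  t->names = NULL;
  t->size = 0;
}
```

Because the entries borrow from the section buffer, that buffer has to outlive the table, which is the ownership question the strdup calls used to sidestep.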

@dra27 dra27 added the no-change-entry-needed Causes the check for a Changes entry to be skipped for PRs label Feb 14, 2023
@dra27 dra27 force-pushed the backport-trunk branch 4 times, most recently from 37996ce to f0547fc Compare February 18, 2023 10:25
@dra27 dra27 force-pushed the backport-trunk branch 4 times, most recently from f0547fc to 5a10c41 Compare June 15, 2024 18:32
@dra27 dra27 added the relocatable PRs related to the Relocatable Compiler project label Sep 24, 2024
@dra27 dra27 force-pushed the backport-trunk branch 2 times, most recently from a55334c to 06471c0 Compare September 24, 2024 21:41
@dra27 dra27 force-pushed the backport-trunk branch 4 times, most recently from 1592c2b to 61f4691 Compare September 29, 2024 08:47
@dra27 dra27 force-pushed the relocatable-base-trunk branch from 3f50436 to 2521032 Compare September 29, 2024 10:26
@dra27 dra27 force-pushed the backport-trunk branch 9 times, most recently from f2bdd10 to 67259ce Compare October 8, 2024 22:40
dra27 and others added 23 commits September 11, 2025 16:55
POSIX recognises empty components in a PATH-like variable as meaning "."
(the current directory). This is reflected in the processing of
OCAMLTOP_INCLUDE_PATH, CAML_LD_LIBRARY_PATH and ld.conf where either a
blank component or a blank line is interpreted as "."

Somewhat confusingly, this processing is applied inconsistently between
Unix and Windows (it's confusing given that Windows more readily
includes the current working directory by default in PATH searches).

It also has the side-effect that a "Set But Null" environment variable
is interpreted as "." which counter-intuitively makes
CAML_LD_LIBRARY_PATH= ocamlrun add the current working directory to the
search path.

Blank lines and empty components of both OCAMLTOP_INCLUDE_PATH and
CAML_LD_LIBRARY_PATH are now ignored. The current working directory can
still be explicitly included, of course, by adding a "." entry/line
where required.
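A minimal sketch of the new behaviour, assuming a simple in-place splitter: the function name split_path_skip_empty and its signature are invented for illustration and are not the runtime's actual code. Empty components are dropped, so a "Set But Null" value yields no entries at all, while an explicit "." is kept like any other component.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: walks a PATH-like string, replaces each separator
   with NUL in place, and stores a pointer to every non-empty component in
   out[] (at most max entries). Returns the number of entries stored. */
size_t split_path_skip_empty(char *path, char sep, char **out, size_t max)
{
  size_t n = 0;
  char *p = path;
  while (*p != '\0') {
    char *end = strchr(p, sep);
    if (end != NULL) *end = '\0';
    /* An empty component is ignored rather than being read as "." */
    if (*p != '\0' && n < max) out[n++] = p;
    if (end == NULL) break;
    p = end + 1;
  }
  return n;
}
```

With a ':' separator, the input ":a::b:" yields just the two entries "a" and "b", and the empty string yields none.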
Exposes the value of SUPPORT_DYNAMIC_LINKING.
It is possible, especially when using Dynlink, to end up in the
situation where a bytecode runtime which doesn't support dynamic loading
is asked to load support DLLs (e.g. a bytecode image with a DLLS
section, or a cma archive passed to the toplevel/Dynlink which has a
non-empty lib_dllibs list).

Previously, the error message would refer to the name of the first DLL
being loaded and simply state that dynamic loading is not supported. The
confusing part is that typically this would refer to a DLL which is not
on the system.

Now, the bytecode linker only writes DLLS and DLPT when there are
entries to write in them, and the runtime, toplevel and Dynlink provide
a direct explanation that dynamic loading is needed, but is not
available. In particular, the error now refers to the file which is
being loaded (i.e. the bytecode executable or the .cma file) rather than
a .so file which doesn't exist.
In the debug runtime, caml_prim_name_table remains for the lifetime of
the program, as it's used by instrtrace.c, but in normal operation, once
the list of primitives has been handed over to Dynlink, it's no longer
required. In the normal runtime, it's now freed after this handover.

In passing, strings themselves are no longer duplicated, as the code
path is a good deal simpler than it used to be, and the bytecode section
itself can reliably be used as the underlying buffer for
caml_prim_name_table.

caml_shared_libs_path is only kept at all to be handed over to Dynlink.
It, along with the two underlying buffers for CAML_LD_LIBRARY_PATH and
ld.conf are freed after the call.
They are only used by the bytecode runtime and can be made static to
startup_byt.
The setting for R was previously omitted in Sys.runtime_parameters, since
it was only processed directly by the Hashtbl module and not stored in
the runtime. Option R is now processed in caml_parse_ocamlrunparam and
stored to be accessed and updated via new primitives for the Hashtbl
module.

Co-authored-by: David Allsopp <[email protected]>
Consistency - options displayed in alphabetical order with the uppercase
letter appearing before the lowercase letter
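One way to express that ordering is a comparator which compares letters case-insensitively and breaks ties by putting the uppercase letter first. This is only a sketch: compare_option_letters and sort_option_letters are hypothetical names, and the runtime may well simply list the options in a fixed order instead.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical comparator: alphabetical order, with the uppercase form of a
   letter sorting before its lowercase form. */
int compare_option_letters(const void *a, const void *b)
{
  unsigned char x = *(const unsigned char *)a;
  unsigned char y = *(const unsigned char *)b;
  int lx = tolower(x), ly = tolower(y);
  if (lx != ly) return lx - ly;
  /* Same letter: the uppercase code is the smaller one in ASCII. */
  return (int)x - (int)y;
}

/* Sort an array of option letters in place. */
void sort_option_letters(char *letters, size_t n)
{
  qsort(letters, n, sizeof letters[0], compare_option_letters);
}
```

For example, the letters "hbRaHwW" would come out as "abHhRWw".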
When linking an executable, allows default OCAMLRUNPARAM values to be
set. This new OCAMLRUNPARAM string is accessible using the
"caml_executable_ocamlrunparam" symbol or is embedded in a bytecode
section.

Co-authored-by: Antonin Décimo <[email protected]>
In the test run after the prefix has been renamed, the test programs are
compiled with `-set-runtime-default R`, and each test program verifies
that Hashtbl.is_randomized returns the expected value.
In bytecode startup, defer allocating memory until after the bytecode
image has been loaded as far as possible.
The runtime's pooling mode has a slight Catch-22 problem for ocamlrun
when enabled using -set-runtime-default. Opening the bytecode file and
reading the ORUN section requires the memory subsystem.

In this revised version, caml_main in bytecode is particularly careful
to track exactly what will have been allocated prior to reading the ORUN
section and if ORUN requires the system to start pooling mode, the
runtime now takes temporary malloc'd copies of everything which has been
made so far so that it can be safely copied with a caml_stat_alloc
_after_ pooling mode has been enabled.
-lws2_32 and -ladvapi32 are already supplied by default, so they don't
need to be in unix.cma/unix.cmxa. However, they do need to be passed
when building unix.cmxs, and they were previously acquired via
unix.cmxa. Tweak the way LDOPTS is used in Makefile.otherlibs.common
(which now is only used for the unix library) so that it's correctly
passed to both ocamlopt and ocamlmklib.
Crucially, this corrects the flags used for creating a DLL on Windows,
allowing threads.cmxs to be loaded in ocamlnat.
ld -r (certainly in GNU binutils) has an empty search path - co-opt the
MSVC search code and always resolve libraries when partial linking,
except this time _ignore_ the ones which are missing. This seems to fit
the rest of -output-complete-obj, given that the _standard_ C libraries
are also omitted (-lm, -lpthread, etc.)
Fixes loading unix.cmxs in Cygwin64
Config_main and Config_boot are built to ensure in the build that
utils/config.generated.ml.in and utils/config.fixed.ml are kept in sync
(so that the next bootstrap doesn't unexpectedly break). However,
because these files were generated in the utils directory, they were
picked up both by the install recipe and also when generating API
documentation.

It's slightly hairy to remove the wildcards and use filter, because we
can easily end up with command lines which are too long (even on Unix),
so instead these two modules are now generated in utils/config/
Dead code in the Makefile and the less-trodden path in
Test_ld_conf.ensure_dir contained an obviously incorrect function call...
Alternate version of caml_output_value which returns a boolean
indicating if the result was 32-bit compatible and would have succeeded
if Compat_32 had been included in the flags.
All the functions in Meta are now only required by the toplevel; however,
two of them are still quite tangled up with Symtable. Begin the process
of disentanglement by moving those two functions to Symtable and the
remaining ones directly to the bytecode toplevel (Dynlink already has
its own implementations).
Bytecode images (including for -output-obj) now only include the CRCS
section if the image actually needs dynamic loading (which is identified
by the use of the caml_reify_bytecode primitive which is only used by
the toplevel and dynlink).
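The test itself amounts to a scan of the image's primitive list. The bytecode linker is written in OCaml, so the C below is purely an illustration of the rule rather than the actual change; image_needs_crcs is a hypothetical name and only the primitive name caml_reify_bytecode is taken from the description above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: an image is assumed to need the CRCS section only
   if it references the caml_reify_bytecode primitive. */
int image_needs_crcs(const char *const *primitives, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (strcmp(primitives[i], "caml_reify_bytecode") == 0)
      return 1;
  return 0;
}
```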
In order to be repeatable, the coreboot cycle routinely has to relink
both boot artefacts with a fixed configuration. It's not therefore much
more of a stretch to link those artefacts with the required flags to
suppress both the header (and RNTM section, if applicable) and debugging
information.

Combined with the previous change to suppress CRCS when the image
doesn't use dynamic loading, the coreboot cycle can be accomplished
without having to post-process the artefacts with stripdebug.
@dra27 dra27 force-pushed the backport-trunk branch 2 times, most recently from d3c8fc6 to 70c7069 Compare September 11, 2025 17:03
@dra27 dra27 changed the base branch from relocatable-base-trunk to trunk September 13, 2025 13:38
@dra27 dra27 force-pushed the backport-trunk branch 3 times, most recently from 57a0735 to d8552ac Compare September 14, 2025 22:09
Labels

CI: Full matrix Full CI test matrix relocatable PRs related to the Relocatable Compiler project run-crosscompiler-tests

3 participants