Read the blog post at FEX-Emu's Site!
Welcome back to another new FEX-Emu release in the new year! While everyone was out celebrating the holidays, we still managed to get some work done.
So let's get into what we did this last month!
Official WINE WoW64 and Arm64ec package support
This month we have updated our Ubuntu PPA repository to support a fex-emu-wine package. This package provides WoW64 and Arm64ec emulator DLL files
that can be applied directly to an AArch64 build of WINE, allowing you to do x86 and x86-64 emulation inside of WINE directly and removing a ton
of CPU overhead in the emulation! This is relatively fresh, so there will be some teething issues around getting it set up. For example, current upstream
WINE may not integrate directly with these builds yet. Check out our wiki for more
information about getting this hooked up.
Partial support for inline self-modifying code and trap flag
As we work towards supporting more edge-case behaviour of anti-tamper and anti-debugger software, we have spent some time this month implementing support for inline self-modifying code and the trap flag.
In particular, Denuvo uses inline self-modifying code, which is relatively annoying to support, but we can use the fact that it tends to generate
invalid instructions to determine early that a block of code is invalid, thus letting it work. There's more work to do to make this robust,
but this gets a decent number of games running.
The trap flag, on the other hand, is interesting because it is an anti-debugger tactic that some badly behaving launchers use. Debuggers treat the
trap flag differently from how it works when no debugger is running, which lets the application detect the debugger and throw an error.
FEX didn't quite handle this correctly, which was causing these launchers to throw their hands up and stop running.
Note that some of this work is only wired up on the WINE side rather than the FEX-Emu Linux emulation side, so mileage may vary!
JIT bug fixes and performance improvements
As usual, a lot of fixes landed for our JIT, ranging from incorrect backpatching of unaligned atomics, to incorrect instruction handling, to improving
performance of a couple of instructions. Let's break down what we fixed this month.
Fixed backpatching of unaligned atomics with small immediates
ARM's FEAT_LRCPC2 extension added TSO instructions (LDAPUR and STLUR) that take small immediate offsets in the range of -256 to 255. These still have the
regular atomic limitation of ARM where the address needs to be naturally aligned (or within a 16-byte granule!) for the access type. FEX needs to emulate
unaligned memory accesses from x86 by backpatching these instructions to be a DMB plus load or store. We were backpatching the small-offset forms of
these instructions incorrectly. This fix will improve stability of emulation on hardware that supports the new FEAT_LRCPC2 instructions.
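The alignment rule above can be sketched as a pair of predicates. This is only an illustration under the constraints stated in this post (natural alignment or containment in one 16-byte granule, offsets from -256 to 255). The function names are hypothetical and are not FEX's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not FEX's code): true when an access of `size` bytes at
// `addr` is naturally aligned or sits entirely inside one 16-byte granule.
// For power-of-two sizes up to 8, natural alignment already implies the
// granule condition; both are spelled out to mirror the wording above.
inline bool StaysAtomic(uint64_t addr, uint64_t size) {
  assert(size == 1 || size == 2 || size == 4 || size == 8);
  const bool naturally_aligned = (addr % size) == 0;
  const bool within_granule = (addr % 16) + size <= 16;
  return naturally_aligned || within_granule;
}

// Hypothetical helper: true when `offset` fits the small immediate range
// quoted above for the FEAT_LRCPC2 instructions.
inline bool FitsSmallOffset(int64_t offset) {
  return offset >= -256 && offset <= 255;
}
```

An access for which `StaysAtomic` returns false is the kind that would fault on the host and need the backpatch to a DMB plus a plain load or store.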
Fix float to integer overflow behaviour
This is a very important change to how FEX handles converting a float value to an integer when an overflow occurs. While we knew of the problem,
we didn't realize how wide-reaching it was. In particular this fixes The Talos Principle's audio cutting out, Animal Well's
music having chirping artifacts, SOMA not allowing interactions with things in the world, Satisfactory's server crashing, and Metaphor: ReFantazio
looping infinitely before getting in-game!
There are sure to be a bunch of other little problems that this also fixes, because games rely on this behaviour pervasively!
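For readers wondering why overflow matters here, the two architectures disagree on the result. As background that is not spelled out in this post: x86's truncating conversions (such as CVTTSS2SI with a 32-bit destination) return the "integer indefinite" value 0x80000000 for NaN and out-of-range inputs, while AArch64's FCVTZS saturates and turns NaN into 0. The sketch below models both behaviours in portable C++. The function names are made up for illustration, and this is not how FEX's JIT implements the fix.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Models the x86 result: NaN or out-of-range input gives 0x80000000.
inline int32_t ConvertLikeX86(float value) {
  // 2^31 is exactly representable as a float; anything at or above it overflows.
  if (std::isnan(value) || value >= 2147483648.0f || value < -2147483648.0f) {
    return std::numeric_limits<int32_t>::min(); // "integer indefinite"
  }
  assert(std::isfinite(value));
  return static_cast<int32_t>(value); // in range, truncates toward zero
}

// Models the AArch64 result: saturate on overflow, NaN becomes 0.
inline int32_t ConvertLikeArm64(float value) {
  if (std::isnan(value)) return 0;
  if (value >= 2147483648.0f) return std::numeric_limits<int32_t>::max();
  if (value < -2147483648.0f) return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(value);
}
```

The two models agree for in-range values and for negative overflow, and differ for positive overflow and NaN, which is where a guest program expecting x86 results would see something else if the host conversion were used unmodified.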
Fix ModRM decoding of 3DNow! instructions
While 3DNow! isn't used in any recent games, to the point that AMD has removed the instruction set from Zen CPU cores, older games still use this
extension if possible. Turns out we had a gap in our testing infrastructure for when a 3DNow! instruction used the SIB encoding form of the
instruction. This would result in crashes and misinterpreted instructions. This will fix some older 32-bit games using 3DNow! and of course we
added new unit tests to our testing infrastructure to make sure it keeps working.
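To illustrate why the SIB form is easy to get wrong: a 3DNow! instruction is encoded as the bytes 0F 0F, then the ModRM operand (with optional SIB and displacement), and only then a trailing byte that selects the actual operation. The decoder therefore has to compute the exact operand length before it knows which instruction it is looking at. The following is a minimal sketch for 32-bit and 64-bit addressing only. The function names are hypothetical and this is not FEX's frontend decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes occupied by ModRM + optional SIB + displacement.
// `bytes` points at the ModRM byte. 16-bit addressing is not handled.
inline size_t ModRMOperandLength(const uint8_t* bytes) {
  assert(bytes != nullptr);
  const uint8_t mod = bytes[0] >> 6;
  const uint8_t rm = bytes[0] & 0b111;

  size_t length = 1;               // the ModRM byte itself
  if (mod == 0b11) return length;  // register operand: no SIB, no displacement

  const bool has_sib = (rm == 0b100);
  if (has_sib) length += 1;

  if (mod == 0b01) {
    length += 1;                   // disp8
  } else if (mod == 0b10) {
    length += 4;                   // disp32
  } else {
    // mod == 00: disp32 only when there is no base register.
    const bool no_base_modrm = (rm == 0b101);
    const bool no_base_sib = has_sib && ((bytes[1] & 0b111) == 0b101);
    if (no_base_modrm || no_base_sib) length += 4;
  }
  return length;
}

// Returns the trailing 3DNow! opcode byte; `after_0f0f` points just past 0F 0F.
inline uint8_t ThreeDNowOpcode(const uint8_t* after_0f0f) {
  return after_0f0f[ModRMOperandLength(after_0f0f)];
}
```

For the byte sequence 04 88 9E (ModRM selecting a SIB byte, no displacement), the operand is two bytes long and the operation byte is 9E. Skipping the SIB byte would instead treat 88 as the operation, which is the kind of misinterpretation described above.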
Fixes H0F3A table decoding
This fix doesn't affect any known applications, but because x86 compilers aggressively pad instructions with redundant prefixes, this could crop up
anywhere without us noticing. When decoding the H0F3A instruction table, FEX was incorrectly applying the REX.W prefix to instructions that ignore
the prefix. Out of all the instructions in the table, only three actually care about the prefix while the others always ignore it. If this padding
occurred, FEX would treat the instruction as unknown and crash. This has now been resolved, which should keep us from ever hitting the issue.
Generate 80-bit SVE loadstores when necessary
For all the users that have SVE-capable hardware (there aren't a lot of you!), we have added a new optimization that converts two loads or stores
into a single 80-bit masked load/store instruction. While this isn't going to be a huge improvement because it only occurs with x87 code, it's
another little optimization in the list of things that SVE improves for x86 emulation.
Increase minimum kernel requirement from 5.0 to 5.15
We're moving into the future with some changes that require increasing our minimum kernel version. Because we were allowing such an old version of
the Linux kernel, we were hitting some heartburn in some codepaths. In order to make this easier, we are moving up the minimum kernel requirement to
an LTS release of the kernel released back in 2021 already! We don't expect this to cause too many problems, since this is the kernel supported by
Ubuntu 22.04.
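If you want to check whether a host meets the new minimum, a comparison against the kernel release string is enough. The sketch below uses the POSIX `uname` call. The helper names are made up for illustration, and this is not the check that FEXLoader itself performs.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/utsname.h>

// Hypothetical helper: parses "major.minor" from a release string such as
// "5.15.0-91-generic" and compares it against the required minimum.
inline bool ReleaseMeetsMinimum(const char* release, int min_major, int min_minor) {
  assert(release != nullptr);
  int major = 0;
  int minor = 0;
  if (std::sscanf(release, "%d.%d", &major, &minor) != 2) {
    return false; // unparseable release string: treat as unsupported
  }
  return major > min_major || (major == min_major && minor >= min_minor);
}

// Hypothetical helper: checks the running kernel against the 5.15 minimum.
inline bool HostKernelIsSupported() {
  struct utsname info {};
  if (uname(&info) != 0) {
    return false;
  }
  return ReleaseMeetsMinimum(info.release, 5, 15);
}
```

From a shell, `uname -r` prints the same release string that this helper parses.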
Drop official support for ArchLinux
Due to a clarification from the ArchLinux team this last month, they are no longer allowing packages in the AUR that don't support x86-64. Because of
this change, and because FEX only supports running on an AArch64 host, they have removed our official packages from the AUR. There's nothing that we
can do about this besides dropping support for ArchLinux.
Raw Changes
FEX Release FEX-2501
-
ArchHelpers
-
Arm64
-
Fixes LDAPUR and STLUR backpatching (1e827ec)
-
ConstProp
-
fix 32-bit masking behaviour (c902b88)
-
Context
-
Constify GPRs passed to ReconstructCompactedEFLAGS (a86c922)
-
External
-
Update bundled libfmt (7e257cc)
-
FEXCore
-
Emulate EFLAGS.TF (e88c92d)
-
Override x87 precision control when necessary (8111b7c)
-
Don't WaitForEmptyJobQueue if CodeObjectCacheService isn't used (5a4691f)
-
FEXLoader
-
Increase minimum kernel requirement from 5.0 to 5.15 (6bc7a83)
-
Enable early logs output to stderr (e32c538)
-
Frontend
-
Fix ModRM handling with 3DNow! (15a1a0f)
-
GdbServer
-
Fixes encoding of hex (735a4f9)
-
Support 32-bit context definitions (072cf4c)
-
Implement support for $vKill (46fb858)
-
IR
-
Change convention from number of elements to elementsize (a6c67ca)
-
Passes
-
Adds missing comment that clang-format keeps complaining about locally (b03b02d)
-
InstCountCI
-
Adds more LRCPC2 tests that are missed (cd6722f)
-
Implement support for TSO and LRCPC and add hot block that could be optimized (9fb69ed)
-
InstructionCountCI
-
add some hot blocks from Factorio (e44d1f1)
-
Linux
-
Fixes typo in removing RESOLVE_IN_ROOT flag (e55b5d0)
-
FaultSafeUserMemAccess
-
Break out fault safe handler (57178ab)
-
LinuxEmulation
-
Don't use clone3 for fork (71187d3)
-
LinuxSyscalls
-
Log unhandled clone3 fork flags (c3261b4)
-
Ensure CSIGNAL is merged back in to flags for clone2 (c7fb95a)
-
Fixes exit syscall (bdae4f6)
-
OpcodeDispatcher
-
Fixes FEX's H0F3A table handling of REX.W (90b1ac4)
-
Minor division improvement (04e785e)
-
ThreadManager
-
Add some sanity asserts (d8ef702)
-
Threads
-
Fix memory leak in joinable() (f906c6a)
-
Thunks
-
gen
-
Add support for compiling against clang 19 (7b2fc37)
-
Utils
-
FileLoading
-
Fix LoadFileImpl (527752c)
-
Windows
-
Only deinit the thread CRT when destroying the current thread (d2bac45)
-
Track RWX regions in mapped images (27ededf)
-
Misc
-
Just a few things picked up from static analysis (8913c59)
-
Support a merged RootFS (and a bunch of related fixes) (2d66bc2)
-
Fix float->int conversion overflow behaviour (d2f86e4)
-
Library Forwarding: Allow reading standard library headers from a development x86 rootfs (d66cd16)
-
Support inline self modifying code (656477e)
-
Generate SVE for 80bit load/stores when possible (8427731)
-
docs
-
Remove Arch from the release process. (f8b6edf)
-
unittests
-
Adds a 3DNow! ModRM SIB encoding test (3abe6c1)
-
ASM
-
Fix incorrect instruction form test (b391fe6)
-
Adds missing MMX PADDQ test (fc1b500)
-
gvisor
-
Disable memfd tests (8bee101)
-
x87StackOptimizationPass
-
Minor opt to f80 fchs and fabs (f51812a)