Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modules break with compiler optimisation greater than -O0 #1

Open
frno7 opened this issue Sep 5, 2019 · 11 comments
Open

Modules break with compiler optimisation greater than -O0 #1

frno7 opened this issue Sep 5, 2019 · 11 comments
Labels
bug Something isn't working

Comments

@frno7
Copy link
Owner

frno7 commented Sep 5, 2019

# FIXME: -O0 -> -O2

It is presently unclear why IOP modules don’t quite work with the -O2 compiler optimisation option.

@frno7 frno7 added the bug Something isn't working label Sep 5, 2019
@uyjulian
Copy link

There are some additional workarounds for GCC >=5.3.0 in ps2sdk: https://github.com/ps2dev/ps2sdk/blob/master/iop/Rules.make#L33

Additionally, you can use -Q --help=optimizers for ex. -O2 to see which flags are enabled by -O2. You can then do a binary search to see which optimization breaks it, and disable that optimization (e.g. -fno-whatever)

@frno7
Copy link
Owner Author

frno7 commented Jan 15, 2021

Oh, that’s great, @uyjulian! I fear that 100+ (or so) reboots are needed to find the error so anything that can cut that number down is very much welcome. A lot of stuff could have happened between GCC 5.3 and the presently used 10.2 too, but with a bit of guidance and luck, one might find it quickly.

@rickgaiser
Copy link

Offtopic, but at ps2dev / ps2sdk we've recently migrated the EE compiler to gcc 10.2. There's some interesting patches that might be usefull for you as well (one of them is even yours ;):
https://github.com/ps2dev/gcc/tree/ee-v10.2.0
https://github.com/ps2dev/binutils-gdb/tree/ee-v2.35.1

Note that we found a bug in sqrt.s in binutils. It's claimed to be a standard MIPSII instruction, but it's not. This was also something that worked with -O0, becouse then the function sqrtf() is called. But any optimization level higher would cause gcc to replace it with a faulty sqrt.s instruction.

@frno7
Copy link
Owner Author

frno7 commented Jan 15, 2021

Nice, the short loop bug fix by frno7 has a bit of background in the post MIPS: Fix GCC noreorder for undefined R5900 short loops to the GCC patches mailing list. Some more work is needed to merge it. I’ve made a special r5900check tool that can find all occurrences of the short loop bug in general ELF files, including hand-coded assembly (of which the Linux kernel has some).

I’ve also made a write-up of the R5900 short loop erratum on the wiki, including porting the comment from the depracted GCC branch to GAS mainline (where it belongs), for handy reference in case anybody needs it explained. Likewise with the peculiarities of the R5900 floating point unit (FPU), etc. When I started out I found it difficult get information on this kind of stuff.

GCC and Binutils have a couple of more patches that are somewhat more relevant for Linux, like the new -mfix-r5900 option.

Have you submitted the SQRT fix to Binutils?

Incidentally, I fixed another instruction bug in m68k/Bintuils a week ago, in commit 3b288c8e2ed35 m68k: Require m68020up rather than m68000up for CHK.L instruction. :)

By the way, how’s it going with the R5900 MMI auto-vectorisation?

@uyjulian
Copy link

I fear that 100+ (or so) reboots are needed to find the error so anything that can cut that number down is very much welcome.

If the issue occurs with -O1, you can binary search using the enabled flags from -O1. Otherwise, you can remove -O1 flags from -O2.

Binary search is faster than going through each of them one by one. Best case scenario is that you only have to try 7 times (for a total of 94 flags to search through).

@frno7
Copy link
Owner Author

frno7 commented Jan 17, 2021

Yes, I agree that’s a good plan, @uyjulian. :)

@rickgaiser
Copy link

Have you submitted the SQRT fix to Binutils?

Not yet, but I plan to.

By the way, how’s it going with the R5900 MMI auto-vectorisation?

Not much further than last time we spoke. I think I'll finish the MMI support by disabling MIPS-MSA. Rewriting both MIPS-MSA and R5900-MMI into a more generic form is too much for me to do alone, so I will not be submitting this to gcc.

Back on topic:
I've tried a simple hello world irx module using gcc9 (from @uyjulian ) and the ps2dev environment (code here). It also breaks when using anything other than -O0. The issue is in the macro's used for the import tables. It's a mixed bag of C structs with inline assembly:
https://github.com/ps2dev/ps2sdk/blob/4a2d422a1824cd2374a41726884d9b470015aa2c/iop/kernel/include/irx.h#L55-L72
gcc will reorder these, causing the import tables to become invalid.

But iopmod does not seem to be using this. I've taken a closer look at how you're creating the import tables, and wow; it's an amazing piece of art. I can learn a thing or 2 from there. But as for the import tables, you seem to be using assembly only:
https://github.com/frno7/iopmod/blob/master/include/iopmod/module-symbol.S
So I don't think gcc is reordering those.

@uyjulian
Copy link

I've tried a simple hello world irx module using gcc9 (from @uyjulian ) and the ps2dev environment (code here). It also breaks when using anything other than -O0.

Which specific -f optimization breaks it? If it's -ftoplevel-reorder, I think that was already disabled when compiled imports.o and exports.o, so it may be something else.

@frno7
Copy link
Owner Author

frno7 commented Jan 21, 2021

@rickgaiser, I’m glad you found it inspirational! I made an effort to put as much logic as possible into the tooling, so that writing the actual IOP modules becomes effortless. This means that a module function declaration such as

id_(0) int printk(const char *fmt, ...) __attribute__((format(printf, 1, 2)));

serves quadruple-duty: the id_ macro is used to produce both import and export directives, automatically, and the rest of the declaration is used by the C compiler as usual. The standard IOP modules are declared in the same way, for example sifcmd. The fourth duty is to display exports and imports, symbolically, in the iopmod-info tool, used to inspect IRX module files.

I wrote a small C language lexer in tool/lexc.c, which is actually complete, and a simple ELF parser in tool/elf.c for the tooling.

The GNU linker option --gc-sections ensures that only the relevant pieces of code are linked into the final module, which saves space without any manual effort. There is also a special automatic _module_init

enum module_init_status _module_init(int argc, char *argv[])
{
int err = loadcore_register_library(_module_export);
if (err < 0)
return MODULE_EXIT;
enum module_init_status status = _module_start(argc, argv);
if (status == MODULE_EXIT)
loadcore_release_library(_module_export);
return status;
}

which sets up the exports automatically, so modules don’t do that themselves. It then calls _module_start, which is declared with module_init in the module, as in for example

module_init(irqrelay_init);

Modules that don’t initialise anything don’t need to declare a module_init at all, so that simplification works too. The module/printk.c module for example only exports a function that prints in the Linux kernel log, similar to the kernel’s own printk.

The module/irqrelay.c module on the other hand sets up an RPC service, for relaying interrupts, so it needs the initialisation, but since it doesn’t export any functions its header file

// SPDX-License-Identifier: GPL-2.0
MODULE_ID(irqrelay, 0x0100);

is correspondingly simple, having only to declare the identification and the version of the module.

@frno7
Copy link
Owner Author

frno7 commented Jan 21, 2021

@rickgaiser, one more thing: The Linux kernel itself also understands IRX file module export and import directives, which means that the Linux kernel automatically prelinks any IOP modules from ROM and from files, as necessary, without manual effort, when an IOP module is requested with iop_module_request:

/**
 * iop_module_request - link requested IOP module unless it is already linked
 * @name: name of requested module
 * @version: requested version in BCD, where major must match with a least
 * 	the same minor
 * @arg: module arguments or %NULL
 *
 * Module library dependencies are resolved and prelinked as necessary. Module
 * files are handled as firmware by the IOP module linker.
 *
 * IOP module link requests are only permitted if the major versions match
 * and the version is at least of the same minor as the requested version.
 *
 * Context: mutex
 * Return: 0 on success, otherwise a negative error number
 */
int iop_module_request(const char *name, int version, const char *arg)

https://github.com/frno7/linux/blob/00a55c7e3a39f56e0e703a02dda5f52075b8f46a/drivers/ps2/iop-module-request.c#L856-L872

@frno7
Copy link
Owner Author

frno7 commented Jan 23, 2021

@rickgaiser, for the fun of it I ported your Hello world application in commit 149740e. It should work like your own, and with PS2DEV. Here’s what the iopmod-info tool says about it:

% tool/iopmod-info module/hello.irx
iopmod id addr		0xb0
iopmod id name		hello
iopmod id version	0x0100
iopmod entry addr	0x0
iopmod unknown addr	0xd0
iopmod text size	172
iopmod data size	0
iopmod bss size		0
iopmod version		0x0100
iopmod name		hello
import library	0x0102	stdio
import  0	 4	printf	stdio_printf

Notice that printf is actually an (undeclared but linkable) alias for stdio_printf, which in IOPMOD terms is considered to be its proper name by the declaration:

id_(4) int stdio_printf(const char *format, ...)
alias_(printf);

The disassembly with mipsr5900el-unknown-linux-gnu-objdump -dr module/hello.irx reveals _module_init that would do the exports, had there been anything to export, and its invocation of _module_start:

00000000 <_module_init>:
   0:	27bdffe8 	addiu	sp,sp,-24
   4:	afbf0014 	sw	ra,20(sp)
   8:	afa40018 	sw	a0,24(sp)
   c:	afa5001c 	sw	a1,28(sp)
  10:	8fa5001c 	lw	a1,28(sp)
  14:	8fa40018 	lw	a0,24(sp)
  18:	3c020000 	lui	v0,0x0
			18: R_MIPS_HI16	_module_start
  1c:	24420038 	addiu	v0,v0,56
			1c: R_MIPS_LO16	_module_start
  20:	0040f809 	jalr	v0
  24:	00000000 	nop
  28:	8fbf0014 	lw	ra,20(sp)
  2c:	27bd0018 	addiu	sp,sp,24
  30:	03e00008 	jr	ra
  34:	00000000 	nop

00000038 <_module_start>:
  38:	27bdffe8 	addiu	sp,sp,-24
  3c:	afbf0014 	sw	ra,20(sp)
  40:	afa40018 	sw	a0,24(sp)
  44:	afa5001c 	sw	a1,28(sp)
  48:	3c020000 	lui	v0,0x0
			48: R_MIPS_HI16	.rodata
  4c:	244400c0 	addiu	a0,v0,192
			4c: R_MIPS_LO16	.rodata
  50:	3c020000 	lui	v0,0x0
			50: R_MIPS_HI16	printf
  54:	2442009c 	addiu	v0,v0,156
			54: R_MIPS_LO16	printf
  58:	0040f809 	jalr	v0
  5c:	00000000 	nop
  60:	24020001 	li	v0,1
  64:	8fbf0014 	lw	ra,20(sp)
  68:	27bd0018 	addiu	sp,sp,24
  6c:	03e00008 	jr	ra
  70:	00000000 	nop
  74:	afa40000 	sw	a0,0(sp)
  78:	afa50004 	sw	a1,4(sp)
  7c:	00001025 	move	v0,zero
  80:	03e00008 	jr	ra
  84:	00000000 	nop
  88:	41e00000 	0x41e00000
  8c:	00000000 	nop
  90:	00000102 	srl	zero,zero,0x4
  94:	69647473 	0x69647473
  98:	0000006f 	0x6f

0000009c <printf>:
  9c:	03e00008 	jr	ra
  a0:	24000004 	li	zero,4
	...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants