Skip to content

Rust fails to optimize away useless unwrap check #57166

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
upsuper opened this issue Dec 28, 2018 · 7 comments
Open

Rust fails to optimize away useless unwrap check #57166

upsuper opened this issue Dec 28, 2018 · 7 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@upsuper
Copy link
Contributor

upsuper commented Dec 28, 2018

See the following code:

pub struct ListNode {
    next: Option<Box<ListNode>>
}

pub fn foo(mut head: Box<ListNode>) -> Box<ListNode> {
    let mut cur = &mut head;
    while let Some(next) = std::mem::replace(&mut cur.next, None) {
        cur.next = Some(next);
        cur = cur.next.as_mut().unwrap();
    }
    head
}

Apparently the unwrap should never panic given the assignment in the previous line. However, based on the assembly output, that check isn't optimized away.

(The while let statement seems to be generating useless function calls for drop as well, btw.)

@Centril Centril added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Dec 28, 2018
@RustyYato
Copy link
Contributor

RustyYato commented Jul 15, 2019

Minimal example

This no-op

pub fn foo(x: String) -> String {
    Some(x).unwrap()
}

generates

core::ptr::real_drop_in_place:
	movq	%rdi, %rax
	movq	(%rdi), %rdi
	testq	%rdi, %rdi
	je	.LBB0_2
	movq	8(%rax), %rsi
	testq	%rsi, %rsi
	je	.LBB0_2
	movl	$1, %edx
	jmpq	*__rust_dealloc@GOTPCREL(%rip)

.LBB0_2:
	retq

playground::foo:
	pushq	%rbx
	subq	$32, %rsp
	movups	(%rsi), %xmm0
	movaps	%xmm0, (%rsp)
	movq	16(%rsi), %rcx
	movq	%rcx, 16(%rsp)
	cmpq	$0, (%rsp)
	je	.LBB1_1
	movq	%rdi, %rax
	movq	16(%rsp), %rcx
	movq	%rcx, 16(%rdi)
	movaps	(%rsp), %xmm0
	movups	%xmm0, (%rdi)
	addq	$32, %rsp
	popq	%rbx
	retq

.LBB1_1:
	leaq	.Lanon.9b27569f3e5c884c6d6bffb1a8af3c8b.2(%rip), %rdi
	callq	*core::panicking::panic@GOTPCREL(%rip)
	ud2
	movq	%rax, %rbx
	movq	%rsp, %rdi
	callq	core::ptr::real_drop_in_place
	movq	%rbx, %rdi
	callq	_Unwind_Resume@PLT
	ud2

.Lanon.9b27569f3e5c884c6d6bffb1a8af3c8b.0:
	.ascii	"called `Option::unwrap()` on a `None` value"

.Lanon.9b27569f3e5c884c6d6bffb1a8af3c8b.1:
	.ascii	"src/libcore/option.rs"

.Lanon.9b27569f3e5c884c6d6bffb1a8af3c8b.2:
	.quad	.Lanon.9b27569f3e5c884c6d6bffb1a8af3c8b.0
	.asciz	"+\000\000\000\000\000\000"
	.quad	.Lanon.9b27569f3e5c884c6d6bffb1a8af3c8b.1
	.asciz	"\025\000\000\000\000\000\000\000[\001\000\000\025\000\000"

On playground for the latest nightly (2019-07-14 83e4eed)

@upsuper
Copy link
Contributor Author

upsuper commented Jul 16, 2019

That reminds me that this issue may be from the enum size optimization. When we don't have a separate field for the enum tag (in case like Option<Box<_>>), compiler doesn't make use of non-null guarantee when optimizing it.

To prove that, an even simpler testcase is:

use std::num::NonZeroUsize;

pub fn foo(x: NonZeroUsize) -> NonZeroUsize {
    Some(x).unwrap()
}

which produces the following assembly:

playground::foo:
	pushq	%rax
	testq	%rdi, %rdi
	je	.LBB0_1
	movq	%rdi, %rax
	popq	%rcx
	retq

.LBB0_1:
	leaq	.Lanon.c21f24ad05c622049a6df4006ef31e03.2(%rip), %rdi
	callq	*core::panicking::panic@GOTPCREL(%rip)
	ud2

.Lanon.c21f24ad05c622049a6df4006ef31e03.0:
	.ascii	"called `Option::unwrap()` on a `None` value"

.Lanon.c21f24ad05c622049a6df4006ef31e03.1:
	.ascii	"src/libcore/option.rs"

.Lanon.c21f24ad05c622049a6df4006ef31e03.2:
	.quad	.Lanon.c21f24ad05c622049a6df4006ef31e03.0
	.asciz	"+\000\000\000\000\000\000"
	.quad	.Lanon.c21f24ad05c622049a6df4006ef31e03.1
	.asciz	"\025\000\000\000\000\000\000\000[\001\000\000\025\000\000"

@AndreaCatania
Copy link

Hope that this will be improved soon considering the broad use of Option and the related functions unwrap, expect, ...

@klensy
Copy link
Contributor

klensy commented Aug 8, 2023

Slapping -Zmir-opt-level=3 solves issue, at least now https://rust.godbolt.org/z/6d8hW1PqW

Works for String and NonZeroUsize examples, but not for first message (this is -Zmir-enable-passes=+DataflowConstProp)

@saethlin saethlin added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such labels Nov 9, 2024
@saethlin
Copy link
Member

saethlin commented Nov 9, 2024

The minimized examples have been fixed since 1.77. I'm not completely certain that we have codegen tests for these.

But the original example still has the missing optimizations as described.

@saethlin saethlin added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Nov 9, 2024
@lolbinarycat
Copy link
Contributor

Here's a new MCVE:

pub fn foo() -> usize {
    let r = None..Some(30);
    return r.end.unwrap();
}

@scottmcm
Copy link
Member

scottmcm commented Apr 1, 2025

Hmm, the as_mut() in the OP makes it much harder, because things like #138759 won't be able to help as it's forced through memory.


@lolbinarycat Please include a godbolt or playground link with a repro. That works fine when I try it https://rust.godbolt.org/z/cnznoG4K4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants