Skip to content

Initialize and check the stack canary at every io_tx/io_rx call#333

Closed
bigspider wants to merge 1 commit intomasterfrom
stack_canary
Closed

Initialize and check the stack canary at every io_tx/io_rx call#333
bigspider wants to merge 1 commit intomasterfrom
stack_canary

Conversation

@bigspider
Copy link
Copy Markdown
Contributor

@bigspider bigspider commented Jan 15, 2026

This adds the initialization and check for the stack canary at every io_rx and io_tx call.

Rather than having an explicit initialization that has to be called elsewhere, this this uses a single boolean global to track whether the canary is already initialized. This allows to keep the canary module entire self-contained, with no effect in other parts of the codebase - and it will later be trivial to remove it, if the C SDK implements an analogous protection.

A slight disadvantage is that stack smashing before the first APDU is not detected. However, that seems rather unlikely.

I used the same 0xDEAD0031 canary value as used in the C SDK, so that it shouldn't break even if both SDKs use the same canary (although of course it should be removed nonetheless once the C SDK has it (LedgerHQ/ledger-secure-sdk#1273)).

@bigspider bigspider changed the title Initialize and check the stack canary at every io_tx/io_rx call Initialize and check the stack canary at every io_tx/io_rx call Jan 15, 2026
@bigspider bigspider marked this pull request as ready for review January 15, 2026 16:56
Copy link
Copy Markdown
Contributor

@yogh333 yogh333 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok that's fine to detect any stack overflow when receiving/sending APDUs but not in the meantime :(
Could you please rebase your PR 🙏

@bigspider
Copy link
Copy Markdown
Contributor Author

Rebased.

Ok that's fine to detect any stack overflow when receiving/sending APDUs but not in the meantime :(

That's also how the feature used to be in the C SDK, by the way. Otherwise we'd have to add checks everywhere in the SDK - not maintainable.

@bigspider bigspider force-pushed the stack_canary branch 2 times, most recently from e46bac3 to 46f3d04 Compare February 5, 2026 09:34
Copy link
Copy Markdown
Contributor

@iartemov-ledger iartemov-ledger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with app-boilerplate-rust and that was not easy.

Actually we may have an inconsistency in https://github.com/LedgerHQ/ledger-device-rust-sdk/blob/master/ledger_secure_sdk_sys/link.ld#L114.
As it is now app_stack_canary is placed (Canary initialized at 0x5000269c) just after .bss zone (.bss: 0x50000000 - 0x5000269b (size: 9883 bytes) on Flex for the app), whereas the stack allocations happen very far from the canary:

cause_stack_overflow depth=10 sp=0x50008878 buf=0x50008880-0x50008916
cause_stack_overflow depth=9 sp=0x50008798 buf=0x500087a0-0x50008836
cause_stack_overflow depth=8 sp=0x500086b8 buf=0x500086c0-0x50008756
cause_stack_overflow depth=7 sp=0x500085d8 buf=0x500085e0-0x50008676
cause_stack_overflow depth=6 sp=0x500084f8 buf=0x50008500-0x50008596
...

Knowing that the SRAM size on Flex is fixed as 36K so [0x50000000 - 0x50009000] for Speculos, but stack size is defined as:

STACK_SIZE = 1500;

we may want to move app_stack_canary :

  .bss :
  {
    _bss = .;
    *(.bss*)
    _ebss = .;
    _bss_len = ABSOLUTE(_ebss) - ABSOLUTE(_bss);

    . = ALIGN(4);
    
    /* Position stack at the end of SRAM */
    _stack = ABSOLUTE(END_STACK) - STACK_SIZE;
    _estack = ABSOLUTE(END_STACK);
    
    /* Place canary just before the stack region for overflow detection */
    . = ABSOLUTE(_stack) - 4;
    app_stack_canary = .;
    _stack_validation = . + 4;
  } > SRAM :sram

Also when I finally provoked app_stack_canary corruption I got infinitely:

CANARY CORRUPTED! Expected: 0xdead0031, Got: 0x00000000
Canary value read: 0x00000000
CANARY CORRUPTED! Expected: 0xdead0031, Got: 0x00000000
Canary value read: 0x00000000
CANARY CORRUPTED! Expected: 0xdead0031, Got: 0x00000000
Canary value read: 0x00000000
CANARY CORRUPTED! Expected: 0xdead0031, Got: 0x00000000
...

Comment thread ledger_secure_sdk_sys/src/seph.rs
pub(super) fn init_and_check() {
unsafe {
if !CANARY_INITIALIZED {
init_canary();
Copy link
Copy Markdown
Contributor

@iartemov-ledger iartemov-ledger Feb 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The laziness has pros (no explicit init is needed), but also cons (if the stack has been already corrupted the first lazy init will re-patch incorrect marker value).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point, but I would prefer to avoid for the app to have a specific 'canary initialization' call - especially since we might later drop this feature once the C SDK implements it on its own.

If you have any idea of a clean way to make sure the canary is initialized at loading time, that would be a better solution.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know Rust C as well. @yogh333 , may you propose anything?

@bigspider
Copy link
Copy Markdown
Contributor Author

we may want to move app_stack_canary :

The canary is before the stack section because the stack grows backwards from the end, so the canary has to be between the .bss variables and the stack.

About _stack and _estack, my understanding is that the stack is free to exceed the size at runtime (and grow to an address smaller than _stack. Nothing bad happens - the linker just guarantees that _estack - _stack is at least STACK_SIZE at linking time.

So the current position of the canary is such that it is only overwritten if the stack grows so much that it overlaps with .bss.

@iartemov-ledger
Copy link
Copy Markdown
Contributor

we may want to move app_stack_canary :

The canary is before the stack section because the stack grows backwards from the end, so the canary has to be between the .bss variables and the stack.

I understand that the stack grows downwards :)

About _stack and _estack, my understanding is that the stack is free to exceed the size at runtime (and grow to an address smaller than _stack. Nothing bad happens - the linker just guarantees that _estack - _stack is at least STACK_SIZE at linking time.

So the current position of the canary is such that it is only overwritten if the stack grows so much that it overlaps with .bss.

OK, you're right, as discussed in Slack. The STACK_SIZE name is confusing, and as your proposed it should be called MINIMUM_STACK_SIZE to raise a link error if bss+MINIMUM_STACK_SIZE do not fit in SRAM.
Other from that the stack can indeed grow up to the bss zone.

Copy link
Copy Markdown
Contributor

@iartemov-ledger iartemov-ledger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good for me as first approach.

@yogh333
Copy link
Copy Markdown
Contributor

yogh333 commented Feb 24, 2026

Duplicated as #358 (PR to master are now forbidden)

@yogh333 yogh333 closed this Feb 24, 2026
@yogh333 yogh333 deleted the stack_canary branch February 24, 2026 06:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants