A shadow stack is a second stack used to push the link register if it needs to be spilled to make a new procedure call. Leaf functions do not need to spill the link register.
The shadow stack, similar to the regular stack, grows downwards, i.e. from higher
addresses to lower addresses. Each entry on the shadow stack is XLEN
wide and
holds the link register value. The ssp
points to the top of the shadow stack,
i.e. address of the last element pushed on the shadow stack.
Note
|
Compilers when generating code for a CFI enabled program must protect the link
register, e.g. |
The backward-edge CFI extension introduces the following instructions for shadow
stack operations. All instructions are encoded using the SYSTEM major opcode and
using the mop.r
and mop.rr
instructions defined by the Zimops extension.
(mop.r) |
31 |
20 |
19 |
15 |
14 |
12 |
11 |
7 |
6 |
0 |
---|---|---|---|---|---|---|---|---|---|---|
mnemonic |
|
|
|
|
|
|||||
|
|
|
|
|
|
|||||
|
|
|
|
|
|
|||||
|
|
|
|
|
|
|||||
|
|
|
|
|
(mop.rr) |
31 |
25 |
24 |
20 |
19 |
15 |
14 |
12 |
11 |
7 |
6 |
0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
mnemonic |
|
|
|
|
|
|
||||||
|
|
|
|
|
|
|
||||||
|
|
|
|
|
|
|
||||||
|
|
|
|
|
|
When menvcfg.CFI
is 0, then Zisslpcfi is not enabled for privilege modes less than
M and backward-edge CFI is not active at privilege levels less than M.
When V=0
and menvcfg.CFI
is 1, then backward-edge CFI is active at S-mode if
mstatus.SBCFIEN
is 1 and is active at U-mode if mstatus.UBCFIEN
is 1.
When henvcfg.CFI
is 0, then Zisslpcfi is not enabled for use when V=1
.
When V=1
and menvcfg.CFI
and henvcfg.CFI
are both 1, then backward-edge CFI
is active at S-mode if vsstatus.SBCFIEN
is 1 and is active at U-mode if
vsstatus.UBCFIEN
is 1.
The term xBCFIACT
is used to determine if backward-edge CFI is active at a
privilege level x
and is defined as follows:
xBCFIACT
determinationif ( privilege == M-mode )
xBCFIACT = 1
else if ( menvcfg.CFI == 1 && V == 0 && privilege == S-mode )
xBCFIACT = mstatus.SBCFIEN
else if ( menvcfg.CFI == 1 && V == 0 && privilege == U-mode )
xBCFIACT = mstatus.UBCFIEN
else if ( menvcfg.CFI == 1 && henvcfg.CFI == 1 & V == 1 && privilege == S-mode )
xBCFIACT = vsstatus.SBCFIEN
else if ( menvcfg.CFI == 1 && henvcfg.CFI == 1 & V == 1 && privilege == U-mode )
xBCFIACT = vsstatus.UBCFIEN
else
xBCFIACT = 0
When backward-edge CFI is not active(xBCFIACT = 0
):
-
The instructions defined for backward-edge CFI revert to their Zimops defined behavior and write 0 to [rd].
Note
|
When programs using shadow stack instructions are installed on a machine that supports the CFI extensions but the operating system installed does not enable the CFI extensions, the program continues to function due to Zimops defined behavior of writing 0 to [rd] and not causing an illegal-instruction exception. When programs using shadow stack instructions are installed on a machine that does not support the CFI extensions but support the Zimops extension, the program continues to function due to Zimops defined behavior of writing 0 to [rd] and not causing an illegal-instruction exception. On machines that do not support Zimops extension, the instructions cause an illegal-instruction exception. Installing programs that use the shadow stack instructions on such machines is not supported. |
A push operation is defined as decrement of the ssp
by XLEN
followed by a
write of the link register at the new top of the shadow stack. A pop operation
is defined as a XLEN
wide read from the current top of the shadow stack
followed by an increment of the ssp
.
To push a link register on the shadow stack, the CFI extension provides
sspush
instruction. To pop a link register from the shadow stack, the CFI
extension provides sspop
instruction.
Note
|
Programs operating in shadow stack mode store the return address to the data stack as well as the shadow stack in the function prologue. Such programs when returning from the function load the link register from the data stack and load a shadow copy of the link register from the shadow stack. The two values are then compared (using the sschkra instruction that is discussed later). If the values do not match it is indicative of a shadow stack violation and causes an illegal-instruction exception. The function prologue and epilog of a function using shadow stacks is as follows: function_entry: sspush x1 # push link register x1 on shadow stack : : sspop x5 # pop from shadow stack into x5 sschkra x1, x5 # compare link register x1 to shadow # link register x5; faults if not same ret When x1 is used by the ABI as the link register, the x5 may be used to load the shadow link register value from the shadow stack. Alternatively, if the ABI uses x5 as the link register, then the x1 register may be used as the shadow link register. A leaf function i.e. a function that does not itself make function calls does not need to push the link register to the shadow stack or pop it from the shadow stack. The return value may be held in the link register itself for the duration of the function execution. In the RVI psABI, the x1 register is designated as the link register and x5 register is designated as a temporary register. SInce the shadow link register is loaded at the tail of the function, prior to return, the x5 register being used as the shadow link register does not impose a burden on the compiler as the x5 register, being a temporary register that is not preserved across a call, is usually free for use at the tail of the function. Programs operating in the control stack mode may store the return address only to the shadow stack in the function prologue. Such functions when returning from the function load the link register value from the shadow stack. Such programs may either use the x1 or x5 register, depending on their ABI, as the link register. The hardware return-address prediction stacks detect the use of x1/x5 as the rd and x1/x5 as the source for a JALR instruction to infer if the JALR is used for a call or a return semantic for the purposes of prediction. To support both options the CFI extension provides the x1 and x5 variants of the shadow stack load and store. |
sspop
performs a load identically to the existing LOAD
instruction with the
difference that the base is implicitly ssp
, the width is implicitly XLEN
.
sspush
performs a store identically to the existing STORE
instruction with
the difference that the base is implicitly ssp
, the width is implicitly XLEN
.
The sspush
and sspop
require the virtual address in ssp
to have a shadow stack
attribute (see Shadow Stack Memory Protection).
If the virtual address in ssp
is not XLEN
aligned then the instructions cause a
load or store/AMO address-misaligned exception.
Note
|
Misaligned accesses to shadow stack are not required and enforcing alignment is more secure to detect errors in the program. |
The operation of the sspush
instructions is as follows:
sspush
operationIf (xBCFIACT = 1)
[ssp] = [ssp] - (XLEN/8) # decrement ssp by XLEN/8
*[ssp] = [src] # Store src value to address in ssp
else
[dst] = 0
endif`
The operation of the sspop
instructions is as follows:
sspop
operationIf (xBCFIACT = 1)
dst = *[ssp] # Load dst from address in ssp
[ssp] = [ssp] + (XLEN/8) # increment ssp by XLEN/8
else
[dst] = 0;
endif
Note
|
Store to load forwarding is a common technique employed by high performance
processor implementations. CFI implementations may restrict forwarding from a
non-shadow-stack store to a |
Note
|
A common operation performed on stacks is to unwind them to support constructs like setjmp/longjmp, C++ exception handling, etc. A program that uses shadow stacks must unwind the shadow stack in addition to the stack used to store data. The unwind function must verify that it does not accidentally unwind past the bounds of the shadow stack. Shadow stacks are expected to be bounded on each end using guard pages i.e. pages that do not have a shadow stack attribute. To detect if the unwind occurs past the bounds of the shadow stack the unwind may be done in maximal increments of 4 KiB and testing for the ssp to be still pointing to a shadow stack page or has unwound into the guard page. The following examples illustrate use of backward-edge CFI instructions to unwind a shadow stack. setjmp() { : : // read and save top of stack pointer to jmp_buf asm(“ssprr %0” : “=r”(cur_ssp):); jmp_buf->saved_ssp = cur_ssp; : : } longjmp() { : // Read current shadow stack pointer and // compute number of call frames to unwind asm(“ssprr %0” : “=r”(cur_ssp):); // Skip the unwind if backward-edge CFI not active asm(“beqz %0, 1f” : “=r”(cur_ssp):); num_unwind = jmp_buf->saved_ssp - cur_ssp; // Unwind the frames in a loop while ( num_unwind > 0 ) { step = ( num_unwind >= 4096 ) ? 4096 : num_unwind; cur_ssp += step; num_unwind -= step; // write the ssp register with unwound value asm(“csrw %0, $ssp_csr_num” : “=r”(cur_ssp):); // Test if unwound past the shadow stack bounds asm(“sspush x5”); asm(“sspop x5”); } 1f: : } |
The ssprr
instruction is provided to move the contents of ssp
to the destination
register.
The operation of the ssprr
instructions is as follows:
ssprr
operationIf (xBCFIACT = 1)
[dst] = [ssp]
else
[dst] = 0;
endif
Note
|
The property of Zimops writing 0 to the rd when the extension using Zimops is not present or not active may be used by such functions to skip over unwind actions by dynamically detecting if the backward-edge CFI extension is active. An example sequence such as the following may be used: ssprr t0 # mv ssp to t0 beqz bcfi_not_active # zero is not a valid shadow stack # pointer by convention # Shadow stacks active : : bcfi_not_active: |
Programs operating with a shadow stack push the return address onto the data stack as well as the shadow stack in the function prologue. Such programs when returning from the function pop the link register from the data stack and pop a shadow copy of the link register from the shadow stack. The two values are then compared. If the values do not match it is indicative of a corruption of the return address variable and the program causes an illegal instruction exception.
When x1 is used by the ABI as the link register, the x5 may be used to hold the shadow link register value from the shadow stack. Alternatively, if the ABI uses x5 as the link register, then the x1 register may be used as the shadow link register.
A sschkra
instruction is provided to perform the comparison.
The operation of the sschkra
instruction is as follows:
sschkra
operationIf (xBCFIACT = 1)
if [x1] != [x5]
Raise illegal-instruction exception
endif
else
[dst] = 0;
endif
The CFI extension defines an ssamoswap
instruction to atomically swap the XLEN
bits of src register with XLEN
bits on the shadow stack at address in addr
and
store the value from address in src
into register dst
.
The ssamoswap
is always sequentially consistent and cannot be reordered with
earlier or later memory operations from the same hart.
The ssamoswap
requires the virtual address in ssp
to have a shadow stack
attribute (see Shadow Stack Memory Protection).
If the virtual address is not XLEN
aligned then the instructions cause a
store/AMO address-misaligned exception.
The operation of the ssamoswap
instructions is as follows:
ssamoswap
operationIf (xBCFIACT = 1)
Perform the following atomically with sequential consistency
[dst] = *[addr]
*[addr] = [src]
else
[dst] = 0;
endif
Note
|
Stack switching is a common operation in user programs as well as supervisor programs. When a stack switch is performed the stack pointer of the currently active stack is saved into a context data structure and the new stack is made active by loading a new stack pointer from a context data structure. When shadow stacks are enabled for a program, the program needs to additionally switch the shadow stack pointer. The pointer to the top of the deactivated shadow stack if held in a context data structure may be susceptible to memory corruption vulnerabilities. To protect the pointer value the program may then store it at the top of the shadow stack itself and thus create a checkpoint. An example sequence to store and restore the shadow stack pointer is as follows: # The a0 register holds the pointer to top of new shadow # to switch to. The current ssp is first pushed on the current # shadow stack and the ssp is restored from new shadow stack save_shadow_stack_pointer: ssprr x5 # read ssp and push value onto sspush x5 # shadow stack. The [ssp] now # holds ssp+8. Save away x5 # into a context structure to # restore later. restore_shadow_stack_pointer: ssamoswap t0, a0, x0 # t0=*[ssp] and *[ssp]=0 addi t0, t0, (XLEN/8) # t0+XLEN/8 must match to a0 bnez t0, a0, crash # if not crash program csrw ssp, t0 # setup new ssp Further the program may enforce an invariant that a shadow stack can be active only on one hart by using the ssamoswap when performing the restore from the checkpoint such that the checkpointed data is zeroed as part of the restore sequence and multiple hart attempt to restore the checkpointed data only one of them succeeds. |
To protect shadow stack memory the memory is associated with a new page type - Shadow Stack (SS) page - in the page tables.
When the Smepmp extension is supported the PMP configuration registers are enhanced to support a shadow stack memory region for use by M-mode.
The shadow stack memory is protected using page table attributes such that it
cannot be stored to by instructions other than sspush
and amossswap
. The
sspop
instruction can only load from shadow stack memory.
The shadow stack can be read using all instructions that load from memory.
The encoding R=0
, W=1
, and X=0
, is defined to mean a shadow stack page.
When menvcfg.CFI=0
, this encoding continues to be reserved. When V=1
and
henvcfg.CFI=0
, this encoding continues to be reserved at VS
and VU
.
The following faults may occur:
-
If the accessed page is a shadow stack page
-
Stores other than
sspush
andssamoswap
cause write/AMO access faults. -
Instructions fetch causes a page fault
-
-
if the accessed page is not a shadow stack page
-
ssamoswap
andsspush
cause a store/AMO access fault -
sspop
causes a load access fault
-
To support these rules, the virtual address translation process specified in section 4.3.2 of the Privileged Specification cite:[PRIV] is modified as follows:
-
If
pte.v = 0
, of if any bits of encodings that are reserved for future standard use are set withinpte
, stop and raise a page-fault exception corresponding to the original access type. The encodingpte.xwr = 010b
is not reserved ifmenvcfg.CFI
is 1 or ifV=1
andhenvcfg.CFI
is 1. -
Otherwise, the PTE is valid. If
pte.r = 1
orpte.w = 1
orpte.x = 1
, go to step 5. Otherwise, this PTE is a pointer to the next level of the page table. Leti = i - 1
. Ifi < 0
, store and raise a page-fault exception corresponding to the original access type. Otherwise, leta = pte.ppn x PAGESIZE
and go to step 2. -
A leaf PTE has been found. If the memory access is by a shadow stack instruction and
pte.xwr != 010b
then cause an access-violation exception corresponding to the access type. If the memory acccess is a store/AMO andpte.xwr == 010b
then cause an store/AMO access-violation. If the requested memory access is not allowed by thepte.r
,pte.w
,pte.x
, andpte.u
bits, given the current privilege mode and the value of theSUM
andMXR
fields of themstatus
register, stop and raise a page-fault exception corresponding to the original access type.
The U
and SUM
bit enforcement is performed normally for shadow stack
instruction initiated memory accesses.
Svpbmt extension and Svnapot extensions are supported for shadow stack pages.
Note
|
Operating systems should protect against writeable non-shadow-stack alias virtual-addresses mappings being created to the shadow stack physical memory. |
The G-stage address translation and protections are not affected by the shadow
stack extension. When G-stage page tables are active, the ssamoswap
, and
sspop
instructions require the G-stage page table mapping the accessed memory to
have read permission and the ssamoswap
and sspush
instructions require write
permission. The SS encoding in the G-stage PTE remains reserved.
Note
|
A future extension may define shadow stack encoding the G-stage page table to support use cases such as a hypervisor enforcing shadow stack protections for virtual-supervisor. |
Note
|
All loads are allowed to read the shadow stack. The shadow stack only holds a copy of the link register as saved on the regular stack. The ability to read the shadow stack is useful for debug, performance profiling, and other use cases. |
When privilege mode is U or A, the PMP region accessed by sspush
and
ssamoswap
must provide write permission and the PMP region accessed by sspop
must provide read permission.
The M-mode memory accesses by sspush
and ssamoswap
instructions test for
write permission in the matching PMP entry when permission checking is required.
The M-mode memory accesses by sspop
instruction tests for read permission in
the matching PMP entry when permission checking is required.
When the Smepmp
extension is implemented, a new WARL field sspmp
is defined
in bits 8:3 of the mseccfg
CSR to configrue a PMP entry as the shadow stack
memory region for M-mode accesses.
When mseccfg.MML
is 1, the sspmp
field is read-only else it may be written.
When sspmp
field is implemented and mseccfg.MML
is 1 the following rules are
additionally enforcecd for M-mode memory accesses:
-
sspush
,sspop
, andssamoswap
instructions must match PMP entrysspmp
. -
Write by instructions other than
sspush
andssamoswap
that match PMP entrysspmp
cause an access violation exception.
Note
|
The PMP region used for the M-mode shadow stack is expected to be made inaccessible for U- and S-mode read and write accesses. Allowing write access violates the integrity of the shadow stack and allowing read access may lead to disclosure of M-mode return addresses. |