diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 99d2596e0174d13ea9ac5f64ad3fbacccec87391..4fc9bf110f0796a1144989d9f5030e58d0f364a6 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5702,6 +5702,14 @@
			replacement properties are not found. See the Kconfig
			entry for RISCV_ISA_FALLBACK.

+	riscv_nousercfi=
+			all	Disable user CFI ABI to userspace even if the
+				CPU extensions are available.
+			bcfi	Disable user backward CFI ABI to userspace even if
+				the shadow stack extension is available.
+			fcfi	Disable user forward CFI ABI to userspace even if the
+				landing pad extension is available.
+
	ro		[KNL] Mount root device read-only on boot

	rodata=		[KNL]
diff --git a/Documentation/arch/riscv/index.rst b/Documentation/arch/riscv/index.rst
index 4dab0cb4b900ff15470beb42b293debc9074197b..ed36e3019931c16536c1e9ede25717271d4a77e4 100644
--- a/Documentation/arch/riscv/index.rst
+++ b/Documentation/arch/riscv/index.rst
@@ -13,6 +13,8 @@ RISC-V architecture
    patch-acceptance
    uabi
    vector
+    zicfilp
+    zicfiss

    features
diff --git a/Documentation/arch/riscv/zicfilp.rst b/Documentation/arch/riscv/zicfilp.rst
new file mode 100644
index 0000000000000000000000000000000000000000..78a3e01ff68c0aec7477e1e06120ac6fe3524aa7
--- /dev/null
+++ b/Documentation/arch/riscv/zicfilp.rst
@@ -0,0 +1,122 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+:Author: Deepak Gupta
+:Date: 12 January 2024
+
+====================================================
+Tracking indirect control transfers on RISC-V Linux
+====================================================
+
+This document briefly describes the interface provided to userspace by Linux
+to enable indirect branch tracking for user mode applications on RISC-V.
+
+1. Feature Overview
+--------------------
+
+Memory corruption issues usually result in crashes. However, in the
+hands of a creative adversary, these can result in a variety of
+security issues.
+
+Some of those security issues can be code re-use attacks, where an
+adversary can use corrupt function pointers, chaining them together to
+perform jump oriented programming (JOP) or call oriented programming
+(COP) and thus compromise control flow integrity (CFI) of the program.
+
+Function pointers live in read-write memory and thus are susceptible
+to corruption. This can allow an adversary to control the program
+counter (PC) value. On RISC-V, the zicfilp extension enforces a
+restriction on such indirect control transfers:
+
+- Indirect control transfers must land on a landing pad instruction ``lpad``.
+  There are two exceptions to this rule:
+
+  - rs1 = x1 or rs1 = x5, i.e. a return from a function; returns are
+    protected using the shadow stack (see zicfiss.rst)
+
+  - rs1 = x7. On RISC-V, the compiler usually does the following to reach a
+    function which is beyond the range of a J-type instruction::
+
+      auipc x7, <imm>
+      jalr (x7)
+
+    This form of indirect control transfer is immutable and doesn't
+    rely on memory. Thus rs1=x7 is exempted from tracking and
+    these are considered software guarded jumps.
+
+The ``lpad`` instruction is a pseudo-op of ``auipc rd, <imm_20bit>``
+with ``rd=x0``. This is a HINT op. The ``lpad`` instruction must be
+aligned on a 4-byte boundary. It compares the 20-bit immediate with
+x7. If ``imm_20bit`` == 0, the CPU doesn't perform any comparison with
+``x7``. If ``imm_20bit`` != 0, then ``imm_20bit`` must match ``x7``,
+else the CPU raises a ``software check exception`` (``cause=18``) with
+``*tval = 2``.
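+
+Putting these rules together, the check performed at a landing pad can be
+summarized by the following sketch (illustrative pseudocode only;
+``label(x7)`` stands for the 20-bit label carried in ``x7``)::
+
+    /* target of a tracked indirect call / jmp */
+    if (!is_lpad(insn) || (pc & 0x3) != 0 ||
+        (imm_20bit != 0 && imm_20bit != label(x7)))
+            raise_software_check_exception();   /* cause=18, *tval=2 */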
+
+The compiler can generate a hash over function signatures and set them
+up (truncated to 20 bits) in x7 at callsites. Function prologues can
+have ``lpad`` instructions encoded with the same function hash. This
+further reduces the number of valid program counter addresses a call
+site can reach.
+
+2. ELF and psABI
+-----------------
+
+The toolchain sets up :c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_FCFI` for
+property :c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_AND` in the notes
+section of the object file.
+
+3. Linux enabling
+------------------
+
+User space programs can have multiple shared objects loaded in their
+address spaces. It's a difficult task to make sure all the
+dependencies have been compiled with indirect branch support. Thus
+it's left to the dynamic loader to enable indirect branch tracking for
+the program.
+
+4. prctl() enabling
+--------------------
+
+:c:macro:`PR_SET_INDIR_BR_LP_STATUS` / :c:macro:`PR_GET_INDIR_BR_LP_STATUS` /
+:c:macro:`PR_LOCK_INDIR_BR_LP_STATUS` are three prctls added to manage indirect
+branch tracking. These prctls are architecture-agnostic and return -EINVAL if
+the underlying functionality is not supported. An example sequence using them
+appears at the end of this document.
+
+* prctl(PR_SET_INDIR_BR_LP_STATUS, unsigned long arg)
+
+If arg is :c:macro:`PR_INDIR_BR_LP_ENABLE` and the CPU supports
+``zicfilp`` then the kernel will enable indirect branch tracking for the
+task. The dynamic loader can issue this :c:macro:`prctl` once it has
+determined that all the objects loaded in the address space support
+indirect branch tracking. Additionally, if there is a `dlopen` to an
+object which wasn't compiled with ``zicfilp``, the dynamic loader can
+issue this prctl with arg set to 0 (i.e. :c:macro:`PR_INDIR_BR_LP_ENABLE`
+cleared).
+
+* prctl(PR_GET_INDIR_BR_LP_STATUS, unsigned long * arg)
+
+Returns the current status of indirect branch tracking. If enabled
+it'll return :c:macro:`PR_INDIR_BR_LP_ENABLE`.
+
+* prctl(PR_LOCK_INDIR_BR_LP_STATUS, unsigned long arg)
+
+Locks the current status of indirect branch tracking on the task.
+Userspace may want to run with a strict security posture and disallow
+loading of objects that lack ``zicfilp`` support; in that case it can
+use this prctl to lock the current settings so that indirect branch
+tracking cannot later be disabled.
+
+5. Violations related to indirect branch tracking
+--------------------------------------------------
+
+Pertaining to indirect branch tracking, the CPU raises a software
+check exception in the following conditions:
+
+- missing ``lpad`` after indirect call / jmp
+- ``lpad`` not on a 4-byte boundary
+- ``imm_20bit`` embedded in ``lpad`` instruction doesn't match with ``x7``
+
+In all 3 cases, ``*tval = 2`` is captured and a software check exception is
+raised (``cause=18``).
+
+The kernel will treat this as :c:macro:`SIGSEGV` with code =
+:c:macro:`SEGV_CPERR` and follow the normal course of signal delivery.
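+
+As an example, the enabling sequence in a dynamic loader could look like
+the sketch below. It assumes the prctl constants introduced by this series
+are visible via ``<sys/prctl.h>`` and treats the lock argument as a mask of
+the states to lock; error handling is omitted::
+
+    #include <sys/prctl.h>
+
+    /* all objects carry GNU_PROPERTY_RISCV_FEATURE_1_FCFI: enable tracking */
+    prctl(PR_SET_INDIR_BR_LP_STATUS, PR_INDIR_BR_LP_ENABLE, 0, 0, 0);
+
+    unsigned long status = 0;
+    prctl(PR_GET_INDIR_BR_LP_STATUS, &status, 0, 0, 0);
+    /* status now has PR_INDIR_BR_LP_ENABLE set */
+
+    /* strict posture: forbid any later change to this setting */
+    prctl(PR_LOCK_INDIR_BR_LP_STATUS, PR_INDIR_BR_LP_ENABLE, 0, 0, 0);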
diff --git a/Documentation/arch/riscv/zicfiss.rst b/Documentation/arch/riscv/zicfiss.rst
new file mode 100644
index 0000000000000000000000000000000000000000..4d5f7addc26d827dba96e530730f4b21d41c1b9b
--- /dev/null
+++ b/Documentation/arch/riscv/zicfiss.rst
@@ -0,0 +1,194 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+:Author: Deepak Gupta
+:Date: 12 January 2024
+
+=========================================================
+Shadow stack to protect function returns on RISC-V Linux
+=========================================================
+
+This document briefly describes the interface provided to userspace by Linux
+to enable shadow stacks for user mode applications on RISC-V.
+
+1. Feature Overview
+--------------------
+
+Memory corruption issues usually result in crashes. However, in the
+hands of a creative adversary, these issues can result in a variety of
+security problems.
+
+Some of those security issues can be code re-use attacks on programs
+where an adversary can use corrupt return addresses present on the
+stack, chaining them together to perform return oriented programming
+(ROP) and thus compromising the control flow integrity (CFI) of the
+program.
+
+Return addresses live on the stack in read-write memory. Therefore
+they are susceptible to corruption, which allows an adversary to
+control the program counter. On RISC-V, the ``zicfiss`` extension
+provides an alternate stack (the "shadow stack") on which return
+addresses can be safely placed in the prologue of the function and
+retrieved in the epilogue. The ``zicfiss`` extension makes the
+following changes:
+
+- PTE encodings for shadow stack virtual memory
+  A previously reserved encoding in first-stage translation, i.e.
+  PTE.R=0, PTE.W=1, PTE.X=0, becomes the PTE encoding for shadow stack pages.
+
+- The ``sspush x1/x5`` instruction pushes (stores) ``x1/x5`` to shadow stack.
+
+- The ``sspopchk x1/x5`` instruction pops (loads) from shadow stack and compares
+  with ``x1/x5`` and, if not equal, the CPU raises a ``software check exception``
+  with ``*tval = 3``.
+
+The compiler toolchain ensures that function prologues have ``sspush
+x1/x5`` to save the return address on shadow stack in addition to the
+regular stack. Similarly, function epilogues have ``ld x5,
+offset(x2)`` followed by ``sspopchk x5`` to ensure that a popped value
+from the regular stack matches with the popped value from the shadow
+stack. A sketch of such a sequence follows section 2.
+
+2. Shadow stack protections and linux memory manager
+-----------------------------------------------------
+
+As mentioned earlier, shadow stacks get new page table encodings that
+have some special properties assigned to them, along with instructions
+that operate on the shadow stacks:
+
+- Regular stores to shadow stack memory raise store access faults. This
+  protects shadow stack memory from stray writes.
+
+- Regular loads from shadow stack memory are allowed. This allows
+  stack trace utilities or backtrace functions to read the true call
+  stack and ensure that it has not been tampered with.
+
+- Only shadow stack instructions can generate shadow stack loads or
+  shadow stack stores.
+
+- Shadow stack loads and stores on read-only memory raise AMO/store
+  page faults. Thus both ``sspush x1/x5`` and ``sspopchk x1/x5`` will
+  raise an AMO/store page fault. This simplifies COW handling in the
+  kernel during fork(). The kernel can convert shadow stack pages into
+  read-only memory (as it does for regular read-write memory). As
+  soon as subsequent ``sspush`` or ``sspopchk`` instructions in
+  userspace are encountered, the kernel can perform COW.
+
+- Shadow stack loads and stores on read-write or read-write-execute
+  memory raise an access fault. This is a fatal condition because
+  shadow stack loads and stores should never be operating on
+  read-write or read-write-execute memory.
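+
+Put together, a function protected by ``zicfiss`` looks roughly like the
+sketch below, combining the prologue/epilogue sequences described in
+section 1 (illustrative only; real compilers may pick different registers
+and frame layouts)::
+
+    foo:
+        sspush x1           # prologue: return address also goes to shadow stack
+        addi   sp, sp, -16
+        sd     x1, 8(sp)    # normal spill to the regular stack
+        ...
+        ld     x5, 8(sp)    # epilogue: reload return address from regular stack
+        addi   sp, sp, 16
+        sspopchk x5         # faults (*tval = 3) unless it matches the shadow stack
+        jr     x5           # return (rs1=x5, so no landing pad is required)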
+
+3. ELF and psABI
+-----------------
+
+The toolchain sets up :c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_BCFI` for
+property :c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_AND` in the notes
+section of the object file.
+
+4. Linux enabling
+------------------
+
+User space programs can have multiple shared objects loaded in their
+address space. It's a difficult task to make sure all the
+dependencies have been compiled with shadow stack support. Thus
+it's left to the dynamic loader to enable shadow stacks for the
+program.
+
+5. prctl() enabling
+--------------------
+
+:c:macro:`PR_SET_SHADOW_STACK_STATUS` / :c:macro:`PR_GET_SHADOW_STACK_STATUS` /
+:c:macro:`PR_LOCK_SHADOW_STACK_STATUS` are three prctls added to manage shadow
+stack enabling for tasks. These prctls are architecture-agnostic and return
+-EINVAL if not implemented.
+
+* prctl(PR_SET_SHADOW_STACK_STATUS, unsigned long arg)
+
+If arg = :c:macro:`PR_SHADOW_STACK_ENABLE` and the CPU supports
+``zicfiss`` then the kernel will enable shadow stacks for the task.
+The dynamic loader can issue this :c:macro:`prctl` once it has
+determined that all the objects loaded in address space have support
+for shadow stacks. Additionally, if there is a :c:macro:`dlopen` to
+an object which wasn't compiled with ``zicfiss``, the dynamic loader
+can issue this prctl with arg set to 0 (i.e.
+:c:macro:`PR_SHADOW_STACK_ENABLE` being clear).
+
+* prctl(PR_GET_SHADOW_STACK_STATUS, unsigned long * arg)
+
+Returns the current status of shadow stack enabling. If enabled
+it'll return :c:macro:`PR_SHADOW_STACK_ENABLE`.
+
+* prctl(PR_LOCK_SHADOW_STACK_STATUS, unsigned long arg)
+
+Locks the current status of shadow stack enabling on the task.
+Userspace may want to run with a strict security posture and disallow
+loading of objects that lack ``zicfiss`` support; in that case it can
+use this prctl to lock the current settings so that shadow stacks
+cannot later be disabled on the task.
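+
+For example, a loader that has verified that every loaded object carries
+:c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_BCFI` could do the following (a
+sketch; it assumes the prctl constants introduced by this series, treats
+the lock argument as a mask of the states to lock, and omits error
+handling)::
+
+    #include <sys/prctl.h>
+
+    prctl(PR_SET_SHADOW_STACK_STATUS, PR_SHADOW_STACK_ENABLE, 0, 0, 0);
+
+    unsigned long status = 0;
+    prctl(PR_GET_SHADOW_STACK_STATUS, &status, 0, 0, 0);
+    /* status now has PR_SHADOW_STACK_ENABLE set */
+
+    /* optionally pin the configuration for the life of the task */
+    prctl(PR_LOCK_SHADOW_STACK_STATUS, PR_SHADOW_STACK_ENABLE, 0, 0, 0);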
+
+6. Violations related to returns with shadow stack enabled
+-----------------------------------------------------------
+
+Pertaining to shadow stacks, the CPU raises a ``software check
+exception`` upon executing ``sspopchk x1/x5`` if ``x1/x5`` doesn't
+match the top of shadow stack. If a mismatch happens, then the CPU
+sets ``*tval = 3`` and raises the exception.
+
+The Linux kernel will treat this as a :c:macro:`SIGSEGV` with code =
+:c:macro:`SEGV_CPERR` and follow the normal course of signal delivery.
+
+7. Shadow stack tokens
+-----------------------
+
+Regular stores on shadow stacks are not allowed and thus can't be
+tampered with via arbitrary stray writes. However, one method of
+pivoting / switching to a shadow stack is simply writing to the CSR
+``CSR_SSP``. This will change the active shadow stack for the
+program. Writes to ``CSR_SSP`` in the program should be mostly
+limited to context switches, stack unwinds, or longjmp or similar
+mechanisms (like context switching of Green Threads) in languages like
+Go and Rust. CSR_SSP writes can be problematic because an attacker can
+use memory corruption bugs and leverage context switching routines to
+pivot to any shadow stack. Shadow stack tokens can help mitigate this
+problem by making sure that:
+
+- When software is switching away from a shadow stack, the shadow
+  stack pointer should be saved on the shadow stack itself (this is
+  called the ``shadow stack token``).
+
+- When software is switching to a shadow stack, it should read the
+  ``shadow stack token`` from the shadow stack pointer and verify that
+  the token is itself a pointer to that shadow stack.
+
+- Once the token verification is done, software can perform the write
+  to ``CSR_SSP`` to switch shadow stacks.
+
+Here "software" could refer to the user mode task runtime itself,
+managing various contexts as part of a single thread. Or "software"
+could refer to the kernel, when the kernel has to deliver a signal to
+a user task and must save the shadow stack pointer. The kernel can
+perform a similar procedure itself by saving a token on the user mode
+task's shadow stack. This way, whenever :c:macro:`sigreturn` happens,
+the kernel can read and verify the token and then switch to the shadow
+stack. Using this mechanism, the kernel helps the user task so that
+any corruption issue in the user task is not exploited by adversaries
+arbitrarily using :c:macro:`sigreturn`. Adversaries will have to make
+sure that there is a valid ``shadow stack token`` in addition to
+invoking :c:macro:`sigreturn`. A pseudocode sketch of this token
+discipline appears at the end of this document.
+
+8. Signal shadow stack
+-----------------------
+The following structure has been added to sigcontext for RISC-V::
+
+    struct __sc_riscv_cfi_state {
+        unsigned long ss_ptr;
+    };
+
+As part of signal delivery, the shadow stack token is saved on the
+current shadow stack itself. The updated pointer is saved away in the
+:c:macro:`ss_ptr` field in :c:macro:`__sc_riscv_cfi_state` under
+:c:macro:`sigcontext`. The existing shadow stack allocation is used
+for signal delivery. During :c:macro:`sigreturn`, the kernel will obtain
+:c:macro:`ss_ptr` from :c:macro:`sigcontext`, verify the saved
+token on the shadow stack, and switch the shadow stack.
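+
+The token discipline from section 7 can be summarized with the following
+pseudocode (illustrative only, not a kernel interface; ``XLEN/8`` is the
+size of one shadow stack entry)::
+
+    switch_away_from(cur_ssp):
+        *(cur_ssp - XLEN/8) = cur_ssp   # save the token on the outgoing stack
+        remember (cur_ssp - XLEN/8) for the eventual switch back
+
+    switch_to(saved_ptr):
+        token = *saved_ptr              # read the saved token back
+        if token is not a pointer into this shadow stack:
+            abort                       # refuse the pivot
+        csrw CSR_SSP, token             # only now update CSR_SSP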
diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 83b98065be723956bd4d71075e84474a01309424..f62008cdd968c2f9a3ba2c91240945850a256dc4 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -444,6 +444,20 @@ properties:
            The standard Zicboz extension for cache-block zeroing as ratified
            in commit 3dd606f ("Create cmobase-v1.0.pdf") of riscv-CMOs.

+        - const: zicfilp
+          description: |
+            The standard Zicfilp extension for enforcing forward edge
+            control-flow integrity as ratified in commit 3f8e450 ("merge
+            pull request #227 from ved-rivos/0709") of riscv-cfi
+            github repo.
+
+        - const: zicfiss
+          description: |
+            The standard Zicfiss extension for enforcing backward edge
+            control-flow integrity as ratified in commit 3f8e450 ("merge
+            pull request #227 from ved-rivos/0709") of riscv-cfi
+            github repo.
+
        - const: zicntr
          description:
            The standard Zicntr extension for base counters and timers, as
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d74b3b7b20749f9115103bba0593463d876ad03e..0d93ededd104df03a9e5d07c6ecac8098cefd153 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -939,6 +939,28 @@ config RANDOMIZE_BASE

	  If unsure, say N.

+config RISCV_USER_CFI
+	def_bool y
+	bool "riscv userspace control flow integrity"
+	depends on 64BIT && MMU && \
+		$(cc-option,-mabi=lp64 -march=rv64ima_zicfiss_zicfilp -fcf-protection=full)
+	depends on RISCV_ALTERNATIVE
+	select RISCV_SBI
+	select ARCH_HAS_USER_SHADOW_STACK
+	select ARCH_USES_HIGH_VMA_FLAGS
+	select DYNAMIC_SIGFRAME
+	help
+	  Provides CPU-assisted control flow integrity to userspace tasks.
+	  Control flow integrity is provided by implementing shadow stack for
+	  backward edge and indirect branch tracking for forward edge.
+	  Shadow stack protection is a hardware feature that detects function
+	  return address corruption. This helps mitigate ROP attacks.
+	  Indirect branch tracking enforces that all indirect branches must land
+	  on a landing pad instruction, else the CPU will fault. This mitigates
+	  JOP / COP attacks. Applications must be enabled to use it, and old
+	  userspace does not get protection "for free".
+
 endmenu # "Kernel features"

 menu "Boot options"
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 578ca1f958f61f2cc122a6d583b8208fd1165b9c..2bbfa27e317a6022ce14c70cd0b7d319986b5e66 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -63,9 +63,12 @@ riscv-march-$(CONFIG_TOOLCHAIN_HAS_ZACAS) := $(riscv-march-y)_zacas
 # Check if the toolchain supports Zabha
 riscv-march-$(CONFIG_TOOLCHAIN_HAS_ZABHA) := $(riscv-march-y)_zabha

+KBUILD_BASE_ISA = -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)fd([^v_]*)v?/\1\2/')
+export KBUILD_BASE_ISA
+
 # Remove F,D,V from isa string for all. Keep extensions between "fd" and "v" by
 # matching non-v and non-multi-letter extensions out with the filter ([^v_]*)
-KBUILD_CFLAGS += -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)fd([^v_]*)v?/\1\2/')
+KBUILD_CFLAGS += $(KBUILD_BASE_ISA)

 KBUILD_AFLAGS += -march=$(riscv-march-y)

@@ -127,6 +130,8 @@ ifeq ($(CONFIG_MMU),y)
 prepare: vdso_prepare
 vdso_prepare: prepare0
	$(Q)$(MAKE) $(build)=arch/riscv/kernel/vdso include/generated/vdso-offsets.h
+	$(if $(CONFIG_RISCV_USER_CFI),$(Q)$(MAKE) \
+		$(build)=arch/riscv/kernel/vdso_cfi include/generated/vdso-cfi-offsets.h)
	$(if $(CONFIG_COMPAT),$(Q)$(MAKE) \
		$(build)=arch/riscv/kernel/compat_vdso include/generated/compat_vdso-offsets.h)

@@ -134,6 +139,7 @@ endif
 endif

 vdso-install-y += arch/riscv/kernel/vdso/vdso.so.dbg
+vdso-install-$(CONFIG_RISCV_USER_CFI) += arch/riscv/kernel/vdso_cfi/vdso-cfi.so.dbg
 vdso-install-$(CONFIG_COMPAT) += arch/riscv/kernel/compat_vdso/compat_vdso.so.dbg:../compat_vdso/compat_vdso.so

 ifneq ($(CONFIG_XIP_KERNEL),y)
diff --git a/arch/riscv/configs/hardening.config b/arch/riscv/configs/hardening.config
new file mode 100644
index 0000000000000000000000000000000000000000..089f4cee82f4d23572bf28acd63a9c3b33ff4247
--- /dev/null
+++ b/arch/riscv/configs/hardening.config
@@ -0,0 +1,4 @@
+# RISCV specific kernel hardening options
+
+# Enable control flow integrity support for usermode.
+CONFIG_RISCV_USER_CFI=y diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h index cd627ec289f163a630b73dd03dd52a6b28692997..5a27cefd7805d669a663377d8a8157c080ee4319 100644 --- a/arch/riscv/include/asm/asm-prototypes.h +++ b/arch/riscv/include/asm/asm-prototypes.h @@ -51,6 +51,7 @@ DECLARE_DO_ERROR_INFO(do_trap_ecall_u); DECLARE_DO_ERROR_INFO(do_trap_ecall_s); DECLARE_DO_ERROR_INFO(do_trap_ecall_m); DECLARE_DO_ERROR_INFO(do_trap_break); +DECLARE_DO_ERROR_INFO(do_trap_software_check); asmlinkage void handle_bad_stack(struct pt_regs *regs); asmlinkage void do_page_fault(struct pt_regs *regs); diff --git a/arch/riscv/include/asm/assembler.h b/arch/riscv/include/asm/assembler.h index 44b1457d3e95678a5600e2d03cb86381d1bea367..4bdd8b8e69cb6e23150ba026a8071fbf59d0bcde 100644 --- a/arch/riscv/include/asm/assembler.h +++ b/arch/riscv/include/asm/assembler.h @@ -80,3 +80,47 @@ .endm #endif /* __ASM_ASSEMBLER_H */ + +#if defined(VDSO_CFI) && (__riscv_xlen == 64) +.macro vdso_lpad, label = 0 +lpad \label +.endm +#else +.macro vdso_lpad, label = 0 +.endm +#endif + +/* + * This macro emits a program property note section identifying + * architecture features which require special handling, mainly for + * use in assembly files included in the VDSO. + */ +#define NT_GNU_PROPERTY_TYPE_0 5 +#define GNU_PROPERTY_RISCV_FEATURE_1_AND 0xc0000000 + +#define GNU_PROPERTY_RISCV_FEATURE_1_ZICFILP BIT(0) +#define GNU_PROPERTY_RISCV_FEATURE_1_ZICFISS BIT(1) + +#if defined(VDSO_CFI) && (__riscv_xlen == 64) +#define GNU_PROPERTY_RISCV_FEATURE_1_DEFAULT \ + (GNU_PROPERTY_RISCV_FEATURE_1_ZICFILP | GNU_PROPERTY_RISCV_FEATURE_1_ZICFISS) +#endif + +#ifdef GNU_PROPERTY_RISCV_FEATURE_1_DEFAULT +.macro emit_riscv_feature_1_and, feat = GNU_PROPERTY_RISCV_FEATURE_1_DEFAULT + .pushsection .note.gnu.property, "a" + .p2align 3 + .word 4 + .word 16 + .word NT_GNU_PROPERTY_TYPE_0 + .asciz "GNU" + .word GNU_PROPERTY_RISCV_FEATURE_1_AND + .word 4 + .word \feat + .word 0 + .popsection +.endm +#else +.macro emit_riscv_feature_1_and, feat = 0 +.endm +#endif diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h index 2cb596064b9f8c1b1f1ece736cf276154e6e49b9..80f9b006288f77507d19c839c2994336c6496f32 100644 --- a/arch/riscv/include/asm/cpufeature.h +++ b/arch/riscv/include/asm/cpufeature.h @@ -70,4 +70,16 @@ static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsi return __riscv_isa_extension_available(hart_isa[cpu].isa, ext); } +static inline bool cpu_supports_shadow_stack(void) +{ + return (IS_ENABLED(CONFIG_RISCV_USER_CFI) && + riscv_has_extension_unlikely(RISCV_ISA_EXT_ZICFISS)); +} + +static inline bool cpu_supports_indirect_br_lp_instr(void) +{ + return (IS_ENABLED(CONFIG_RISCV_USER_CFI) && + riscv_has_extension_unlikely(RISCV_ISA_EXT_ZICFILP)); +} + #endif diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index 37bdea65bbd8a1a313cc7ba00b80fc5071b9809a..6ba5e08f6559fc077c7e7fbcd2409406da08ccf5 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -18,6 +18,15 @@ #define SR_MPP _AC(0x00001800, UL) /* Previously Machine */ #define SR_SUM _AC(0x00040000, UL) /* Supervisor User Memory Access */ +/* zicfilp landing pad status bit */ +#define SR_SPELP _AC(0x00800000, UL) +#define SR_MPELP _AC(0x020000000000, UL) +#ifdef CONFIG_RISCV_M_MODE +#define SR_ELP SR_MPELP +#else +#define SR_ELP SR_SPELP +#endif + #define SR_FS _AC(0x00006000, UL) /* Floating-point Status */ #define SR_FS_OFF 
_AC(0x00000000, UL) #define SR_FS_INITIAL _AC(0x00002000, UL) @@ -206,6 +215,8 @@ #define ENVCFG_PMM_PMLEN_16 (_AC(0x3, ULL) << 32) #define ENVCFG_CBZE (_AC(1, UL) << 7) #define ENVCFG_CBCFE (_AC(1, UL) << 6) +#define ENVCFG_LPE (_AC(1, UL) << 2) +#define ENVCFG_SSE (_AC(1, UL) << 3) #define ENVCFG_CBIE_SHIFT 4 #define ENVCFG_CBIE (_AC(0x3, UL) << ENVCFG_CBIE_SHIFT) #define ENVCFG_CBIE_ILL _AC(0x0, UL) @@ -315,6 +326,9 @@ #define CSR_STIMECMP 0x14D #define CSR_STIMECMPH 0x15D +/* zicfiss user mode csr. CSR_SSP holds current shadow stack pointer */ +#define CSR_SSP 0x011 + /* Supervisor-Level Window to Indirectly Accessed Registers (AIA) */ #define CSR_SISELECT 0x150 #define CSR_SIREG 0x151 diff --git a/arch/riscv/include/asm/entry-common.h b/arch/riscv/include/asm/entry-common.h index b75b39a4fc100764f0ef1f940872f75dd8bcaa28..34ed149af5d1b4335ea212255fce29d4a6af353a 100644 --- a/arch/riscv/include/asm/entry-common.h +++ b/arch/riscv/include/asm/entry-common.h @@ -25,4 +25,21 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, void handle_page_fault(struct pt_regs *regs); void handle_break(struct pt_regs *regs); +#ifdef CONFIG_RISCV_MISALIGNED +int handle_misaligned_load(struct pt_regs *regs); +int handle_misaligned_store(struct pt_regs *regs); +#else +static inline int handle_misaligned_load(struct pt_regs *regs) +{ + return -1; +} + +static inline int handle_misaligned_store(struct pt_regs *regs) +{ + return -1; +} +#endif + +bool handle_user_cfi_violation(struct pt_regs *regs); + #endif /* _ASM_RISCV_ENTRY_COMMON_H */ diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index eb0bfde87b899971954cb32ff4d8fa00913159f4..854d8cf8d0d418ec4ff4ae9fa48935acf02e3177 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -113,6 +113,12 @@ #define RISCV_ISA_EXT_ZALRSC 98 #define RISCV_ISA_EXT_ZICBOP 99 #define RISCV_ISA_EXT_XANDESPMU 100 +#define RISCV_ISA_EXT_SVRSW60T59B 101 +#define RISCV_ISA_EXT_ZALASR 102 +#define RISCV_ISA_EXT_ZILSD 103 +#define RISCV_ISA_EXT_ZCLSD 104 +#define RISCV_ISA_EXT_ZICFILP 105 +#define RISCV_ISA_EXT_ZICFISS 106 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/include/asm/mman.h b/arch/riscv/include/asm/mman.h new file mode 100644 index 0000000000000000000000000000000000000000..0ad1d19832ebc1e639b715fb25eb2fa5d8cb95fa --- /dev/null +++ b/arch/riscv/include/asm/mman.h @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ASM_MMAN_H__ +#define __ASM_MMAN_H__ + +#include +#include +#include +#include + +static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot, + unsigned long pkey __always_unused) +{ + unsigned long ret = 0; + + /* + * If PROT_WRITE was specified, force it to VM_READ | VM_WRITE. + * Only VM_WRITE means shadow stack. + */ + if (prot & PROT_WRITE) + ret = (VM_READ | VM_WRITE); + return ret; +} + +#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey) + +#endif /* ! 
__ASM_MMAN_H__ */
diff --git a/arch/riscv/include/asm/mmu_context.h b/arch/riscv/include/asm/mmu_context.h
index 8c4bc49a3a0f5b8949a030e9ae7421052136ae34..dbf27a78df6c87a19fc79a72b505cb85e67226fd 100644
--- a/arch/riscv/include/asm/mmu_context.h
+++ b/arch/riscv/include/asm/mmu_context.h
@@ -48,6 +48,13 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm)
 }
 #endif

+#define deactivate_mm deactivate_mm
+static inline void deactivate_mm(struct task_struct *tsk,
+				 struct mm_struct *mm)
+{
+	shstk_release(tsk);
+}
+
 #include

 #endif /* _ASM_RISCV_MMU_CONTEXT_H */
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index cbd5afec45886b0bffeb5ab1a4a64eb14061abc7..8eecaa639d04af07d1fdf10bca54075f72a78f65 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -185,6 +185,7 @@ extern struct pt_alloc_ops pt_ops __meminitdata;
 #define PAGE_READ_EXEC		__pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
 #define PAGE_WRITE_EXEC		__pgprot(_PAGE_BASE | _PAGE_READ |	\
					 _PAGE_EXEC | _PAGE_WRITE)
+#define PAGE_SHADOWSTACK	__pgprot(_PAGE_BASE | _PAGE_WRITE)

 #define PAGE_COPY		PAGE_READ
 #define PAGE_COPY_EXEC		PAGE_READ_EXEC
@@ -404,16 +405,60 @@ static inline int pte_devmap(pte_t pte)

 static inline pte_t pte_wrprotect(pte_t pte)
 {
-	return __pte(pte_val(pte) & ~(_PAGE_WRITE));
+	return __pte((pte_val(pte) & ~(_PAGE_WRITE)) | (_PAGE_READ));
 }

+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+#define pgtable_supports_uffd_wp() \
+	riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)
+
+static inline bool pte_uffd_wp(pte_t pte)
+{
+	return !!(pte_val(pte) & _PAGE_UFFD_WP);
+}
+
+static inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+	return pte_wrprotect(__pte(pte_val(pte) | _PAGE_UFFD_WP));
+}
+
+static inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~(_PAGE_UFFD_WP));
+}
+
+static inline bool pte_swp_uffd_wp(pte_t pte)
+{
+	return !!(pte_val(pte) & _PAGE_SWP_UFFD_WP);
+}
+
+static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_SWP_UFFD_WP);
+}
+
+static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~(_PAGE_SWP_UFFD_WP));
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
 /* static inline pte_t pte_mkread(pte_t pte) */

+struct vm_area_struct;
+pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma);
+#define pte_mkwrite pte_mkwrite
+
 static inline pte_t pte_mkwrite_novma(pte_t pte)
 {
	return __pte(pte_val(pte) | _PAGE_WRITE);
 }

+static inline pte_t pte_mkwrite_shstk(pte_t pte)
+{
+	return __pte((pte_val(pte) & ~(_PAGE_LEAF)) | _PAGE_WRITE);
+}
+
 /* static inline pte_t pte_mkexec(pte_t pte) */

 static inline pte_t pte_mkdirty(pte_t pte)
@@ -596,7 +641,15 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 static inline void ptep_set_wrprotect(struct mm_struct *mm,
				      unsigned long address, pte_t *ptep)
 {
-	atomic_long_and(~(unsigned long)_PAGE_WRITE, (atomic_long_t *)ptep);
+	pte_t read_pte = READ_ONCE(*ptep);
+	/*
+	 * ptep_set_wrprotect can be called for shadow stack ranges too.
+	 * Shadow stack memory is XWR = 010, so clearing _PAGE_WRITE would
+	 * produce the encoding 000b, which is an invalid encoding with V = 1.
+	 * That would merely lead to a page fault, but we don't want this
+	 * invalid configuration to be set in the page tables.
+ */ + atomic_long_set((atomic_long_t *)ptep, + ((pte_val(read_pte) & ~(unsigned long)_PAGE_WRITE) | _PAGE_READ)); } #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH @@ -724,11 +777,19 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd) return pte_pmd(pte_mkyoung(pmd_pte(pmd))); } +pmd_t pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma); +#define pmd_mkwrite pmd_mkwrite + static inline pmd_t pmd_mkwrite_novma(pmd_t pmd) { return pte_pmd(pte_mkwrite_novma(pmd_pte(pmd))); } +static inline pmd_t pmd_mkwrite_shstk(pmd_t pte) +{ + return __pmd((pmd_val(pte) & ~(_PAGE_LEAF)) | _PAGE_WRITE); +} + static inline pmd_t pmd_wrprotect(pmd_t pmd) { return pte_pmd(pte_wrprotect(pmd_pte(pmd))); diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h index 4c1d6c39169c58b9eef08cc5eec7c5c8affbd81e..f413f6dd02e2d2984558755582712e68d0342139 100644 --- a/arch/riscv/include/asm/processor.h +++ b/arch/riscv/include/asm/processor.h @@ -16,6 +16,7 @@ #include #include #include +#include #define arch_get_mmap_end(addr, len, flags) \ ({ \ diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index 3a525aabe7df72a66b0e5eda70b8eafd7f9aa311..ada371ec3d654439c06592cecd3de3d437519c31 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -34,6 +34,7 @@ enum sbi_ext_id { SBI_EXT_DBCN = 0x4442434E, SBI_EXT_STA = 0x535441, SBI_EXT_SSE = 0x535345, + SBI_EXT_FWFT = 0x46574654, /* Experimentals extensions must lie within this range */ SBI_EXT_EXPERIMENTAL_START = 0x08000000, @@ -340,6 +341,152 @@ struct sbi_sta_struct { #define SBI_SHMEM_DISABLE -1 +enum sbi_ext_nacl_fid { + SBI_EXT_NACL_PROBE_FEATURE = 0x0, + SBI_EXT_NACL_SET_SHMEM = 0x1, + SBI_EXT_NACL_SYNC_CSR = 0x2, + SBI_EXT_NACL_SYNC_HFENCE = 0x3, + SBI_EXT_NACL_SYNC_SRET = 0x4, +}; + +enum sbi_ext_nacl_feature { + SBI_NACL_FEAT_SYNC_CSR = 0x0, + SBI_NACL_FEAT_SYNC_HFENCE = 0x1, + SBI_NACL_FEAT_SYNC_SRET = 0x2, + SBI_NACL_FEAT_AUTOSWAP_CSR = 0x3, +}; + +#define SBI_NACL_SHMEM_ADDR_SHIFT 12 +#define SBI_NACL_SHMEM_SCRATCH_OFFSET 0x0000 +#define SBI_NACL_SHMEM_SCRATCH_SIZE 0x1000 +#define SBI_NACL_SHMEM_SRET_OFFSET 0x0000 +#define SBI_NACL_SHMEM_SRET_SIZE 0x0200 +#define SBI_NACL_SHMEM_AUTOSWAP_OFFSET (SBI_NACL_SHMEM_SRET_OFFSET + \ + SBI_NACL_SHMEM_SRET_SIZE) +#define SBI_NACL_SHMEM_AUTOSWAP_SIZE 0x0080 +#define SBI_NACL_SHMEM_UNUSED_OFFSET (SBI_NACL_SHMEM_AUTOSWAP_OFFSET + \ + SBI_NACL_SHMEM_AUTOSWAP_SIZE) +#define SBI_NACL_SHMEM_UNUSED_SIZE 0x0580 +#define SBI_NACL_SHMEM_HFENCE_OFFSET (SBI_NACL_SHMEM_UNUSED_OFFSET + \ + SBI_NACL_SHMEM_UNUSED_SIZE) +#define SBI_NACL_SHMEM_HFENCE_SIZE 0x0780 +#define SBI_NACL_SHMEM_DBITMAP_OFFSET (SBI_NACL_SHMEM_HFENCE_OFFSET + \ + SBI_NACL_SHMEM_HFENCE_SIZE) +#define SBI_NACL_SHMEM_DBITMAP_SIZE 0x0080 +#define SBI_NACL_SHMEM_CSR_OFFSET (SBI_NACL_SHMEM_DBITMAP_OFFSET + \ + SBI_NACL_SHMEM_DBITMAP_SIZE) +#define SBI_NACL_SHMEM_CSR_SIZE ((__riscv_xlen / 8) * 1024) +#define SBI_NACL_SHMEM_SIZE (SBI_NACL_SHMEM_CSR_OFFSET + \ + SBI_NACL_SHMEM_CSR_SIZE) + +#define SBI_NACL_SHMEM_CSR_INDEX(__csr_num) \ + ((((__csr_num) & 0xc00) >> 2) | ((__csr_num) & 0xff)) + +#define SBI_NACL_SHMEM_HFENCE_ENTRY_SZ ((__riscv_xlen / 8) * 4) +#define SBI_NACL_SHMEM_HFENCE_ENTRY_MAX \ + (SBI_NACL_SHMEM_HFENCE_SIZE / \ + SBI_NACL_SHMEM_HFENCE_ENTRY_SZ) +#define SBI_NACL_SHMEM_HFENCE_ENTRY(__num) \ + (SBI_NACL_SHMEM_HFENCE_OFFSET + \ + (__num) * SBI_NACL_SHMEM_HFENCE_ENTRY_SZ) +#define SBI_NACL_SHMEM_HFENCE_ENTRY_CONFIG(__num) \ + SBI_NACL_SHMEM_HFENCE_ENTRY(__num) +#define 
SBI_NACL_SHMEM_HFENCE_ENTRY_PNUM(__num)\ + (SBI_NACL_SHMEM_HFENCE_ENTRY(__num) + (__riscv_xlen / 8)) +#define SBI_NACL_SHMEM_HFENCE_ENTRY_PCOUNT(__num)\ + (SBI_NACL_SHMEM_HFENCE_ENTRY(__num) + \ + ((__riscv_xlen / 8) * 3)) + +#define SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_BITS 1 +#define SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_SHIFT \ + (__riscv_xlen - SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_BITS) +#define SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_MASK \ + ((1UL << SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_BITS) - 1) +#define SBI_NACL_SHMEM_HFENCE_CONFIG_PEND \ + (SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_MASK << \ + SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_SHIFT) + +#define SBI_NACL_SHMEM_HFENCE_CONFIG_RSVD1_BITS 3 +#define SBI_NACL_SHMEM_HFENCE_CONFIG_RSVD1_SHIFT \ + (SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_SHIFT - \ + SBI_NACL_SHMEM_HFENCE_CONFIG_RSVD1_BITS) + +#define SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_BITS 4 +#define SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_SHIFT \ + (SBI_NACL_SHMEM_HFENCE_CONFIG_RSVD1_SHIFT - \ + SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_BITS) +#define SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_MASK \ + ((1UL << SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_BITS) - 1) + +#define SBI_NACL_SHMEM_HFENCE_TYPE_GVMA 0x0 +#define SBI_NACL_SHMEM_HFENCE_TYPE_GVMA_ALL 0x1 +#define SBI_NACL_SHMEM_HFENCE_TYPE_GVMA_VMID 0x2 +#define SBI_NACL_SHMEM_HFENCE_TYPE_GVMA_VMID_ALL 0x3 +#define SBI_NACL_SHMEM_HFENCE_TYPE_VVMA 0x4 +#define SBI_NACL_SHMEM_HFENCE_TYPE_VVMA_ALL 0x5 +#define SBI_NACL_SHMEM_HFENCE_TYPE_VVMA_ASID 0x6 +#define SBI_NACL_SHMEM_HFENCE_TYPE_VVMA_ASID_ALL 0x7 + +#define SBI_NACL_SHMEM_HFENCE_CONFIG_RSVD2_BITS 1 +#define SBI_NACL_SHMEM_HFENCE_CONFIG_RSVD2_SHIFT \ + (SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_SHIFT - \ + SBI_NACL_SHMEM_HFENCE_CONFIG_RSVD2_BITS) + +#define SBI_NACL_SHMEM_HFENCE_CONFIG_ORDER_BITS 7 +#define SBI_NACL_SHMEM_HFENCE_CONFIG_ORDER_SHIFT \ + (SBI_NACL_SHMEM_HFENCE_CONFIG_RSVD2_SHIFT - \ + SBI_NACL_SHMEM_HFENCE_CONFIG_ORDER_BITS) +#define SBI_NACL_SHMEM_HFENCE_CONFIG_ORDER_MASK \ + ((1UL << SBI_NACL_SHMEM_HFENCE_CONFIG_ORDER_BITS) - 1) +#define SBI_NACL_SHMEM_HFENCE_ORDER_BASE 12 + +#if __riscv_xlen == 32 +#define SBI_NACL_SHMEM_HFENCE_CONFIG_ASID_BITS 9 +#define SBI_NACL_SHMEM_HFENCE_CONFIG_VMID_BITS 7 +#else +#define SBI_NACL_SHMEM_HFENCE_CONFIG_ASID_BITS 16 +#define SBI_NACL_SHMEM_HFENCE_CONFIG_VMID_BITS 14 +#endif +#define SBI_NACL_SHMEM_HFENCE_CONFIG_VMID_SHIFT \ + SBI_NACL_SHMEM_HFENCE_CONFIG_ASID_BITS +#define SBI_NACL_SHMEM_HFENCE_CONFIG_ASID_MASK \ + ((1UL << SBI_NACL_SHMEM_HFENCE_CONFIG_ASID_BITS) - 1) +#define SBI_NACL_SHMEM_HFENCE_CONFIG_VMID_MASK \ + ((1UL << SBI_NACL_SHMEM_HFENCE_CONFIG_VMID_BITS) - 1) + +#define SBI_NACL_SHMEM_AUTOSWAP_FLAG_HSTATUS BIT(0) +#define SBI_NACL_SHMEM_AUTOSWAP_HSTATUS ((__riscv_xlen / 8) * 1) + +#define SBI_NACL_SHMEM_SRET_X(__i) ((__riscv_xlen / 8) * (__i)) +#define SBI_NACL_SHMEM_SRET_X_LAST 31 + +/* SBI function IDs for FW feature extension */ +#define SBI_EXT_FWFT_SET 0x0 +#define SBI_EXT_FWFT_GET 0x1 + +enum sbi_fwft_feature_t { + SBI_FWFT_MISALIGNED_EXC_DELEG = 0x0, + SBI_FWFT_LANDING_PAD = 0x1, + SBI_FWFT_SHADOW_STACK = 0x2, + SBI_FWFT_DOUBLE_TRAP = 0x3, + SBI_FWFT_PTE_AD_HW_UPDATING = 0x4, + SBI_FWFT_POINTER_MASKING_PMLEN = 0x5, + SBI_FWFT_LOCAL_RESERVED_START = 0x6, + SBI_FWFT_LOCAL_RESERVED_END = 0x3fffffff, + SBI_FWFT_LOCAL_PLATFORM_START = 0x40000000, + SBI_FWFT_LOCAL_PLATFORM_END = 0x7fffffff, + + SBI_FWFT_GLOBAL_RESERVED_START = 0x80000000, + SBI_FWFT_GLOBAL_RESERVED_END = 0xbfffffff, + SBI_FWFT_GLOBAL_PLATFORM_START = 0xc0000000, + SBI_FWFT_GLOBAL_PLATFORM_END = 0xffffffff, +}; + 
+#define SBI_FWFT_PLATFORM_FEATURE_BIT BIT(30) +#define SBI_FWFT_GLOBAL_FEATURE_BIT BIT(31) + +#define SBI_FWFT_SET_FLAG_LOCK BIT(0) + /* SBI spec version fields */ #define SBI_SPEC_VERSION_DEFAULT 0x1 #define SBI_SPEC_VERSION_MAJOR_SHIFT 24 @@ -357,6 +504,11 @@ struct sbi_sta_struct { #define SBI_ERR_ALREADY_STARTED -7 #define SBI_ERR_ALREADY_STOPPED -8 #define SBI_ERR_NO_SHMEM -9 +#define SBI_ERR_INVALID_STATE -10 +#define SBI_ERR_BAD_RANGE -11 +#define SBI_ERR_TIMEOUT -12 +#define SBI_ERR_IO -13 +#define SBI_ERR_DENIED_LOCKED -14 extern unsigned long sbi_spec_version; struct sbiret { diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h index 59357bb8ffb3421d5212e054348e8aa13c8ce2ca..f5d5e54172ea406ebefe90007b637d490b4a5c9e 100644 --- a/arch/riscv/include/asm/thread_info.h +++ b/arch/riscv/include/asm/thread_info.h @@ -69,6 +69,9 @@ struct thread_info { */ unsigned long a0, a1, a2; #endif +#ifdef CONFIG_RISCV_USER_CFI + struct cfi_state user_cfi_state; +#endif }; /* diff --git a/arch/riscv/include/asm/usercfi.h b/arch/riscv/include/asm/usercfi.h new file mode 100644 index 0000000000000000000000000000000000000000..7495baae1e3c2a21b5a481e6324b77749c139380 --- /dev/null +++ b/arch/riscv/include/asm/usercfi.h @@ -0,0 +1,97 @@ +/* SPDX-License-Identifier: GPL-2.0 + * Copyright (C) 2024 Rivos, Inc. + * Deepak Gupta + */ +#ifndef _ASM_RISCV_USERCFI_H +#define _ASM_RISCV_USERCFI_H + +#define CMDLINE_DISABLE_RISCV_USERCFI_FCFI 1 +#define CMDLINE_DISABLE_RISCV_USERCFI_BCFI 2 +#define CMDLINE_DISABLE_RISCV_USERCFI 3 + +#ifndef __ASSEMBLER__ +#include +#include +#include + +struct task_struct; +struct kernel_clone_args; + +extern unsigned long riscv_nousercfi; + +#ifdef CONFIG_RISCV_USER_CFI +struct cfi_state { + unsigned long ubcfi_en : 1; /* Enable for backward cfi. */ + unsigned long ubcfi_locked : 1; + unsigned long ufcfi_en : 1; /* Enable for forward cfi. 
Note that ELP goes in sstatus */ + unsigned long ufcfi_locked : 1; + unsigned long user_shdw_stk; /* Current user shadow stack pointer */ + unsigned long shdw_stk_base; /* Base address of shadow stack */ + unsigned long shdw_stk_size; /* size of shadow stack */ +}; + +unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, + const struct kernel_clone_args *args); +void shstk_release(struct task_struct *tsk); +void set_shstk_base(struct task_struct *task, unsigned long shstk_addr, unsigned long size); +unsigned long get_shstk_base(struct task_struct *task, unsigned long *size); +void set_active_shstk(struct task_struct *task, unsigned long shstk_addr); +bool is_shstk_enabled(struct task_struct *task); +bool is_shstk_locked(struct task_struct *task); +bool is_shstk_allocated(struct task_struct *task); +void set_shstk_lock(struct task_struct *task); +void set_shstk_status(struct task_struct *task, bool enable); +unsigned long get_active_shstk(struct task_struct *task); +int restore_user_shstk(struct task_struct *tsk, unsigned long shstk_ptr); +int save_user_shstk(struct task_struct *tsk, unsigned long *saved_shstk_ptr); +bool is_indir_lp_enabled(struct task_struct *task); +bool is_indir_lp_locked(struct task_struct *task); +void set_indir_lp_status(struct task_struct *task, bool enable); +void set_indir_lp_lock(struct task_struct *task); + +#define PR_SHADOW_STACK_SUPPORTED_STATUS_MASK (PR_SHADOW_STACK_ENABLE) + +#else + +#define shstk_alloc_thread_stack(tsk, args) 0 + +#define shstk_release(tsk) + +#define get_shstk_base(task, size) 0UL + +#define set_shstk_base(task, shstk_addr, size) do {} while (0) + +#define set_active_shstk(task, shstk_addr) do {} while (0) + +#define is_shstk_enabled(task) false + +#define is_shstk_locked(task) false + +#define is_shstk_allocated(task) false + +#define set_shstk_lock(task) do {} while (0) + +#define set_shstk_status(task, enable) do {} while (0) + +#define is_indir_lp_enabled(task) false + +#define is_indir_lp_locked(task) false + +#define set_indir_lp_status(task, enable) do {} while (0) + +#define set_indir_lp_lock(task) do {} while (0) + +#define restore_user_shstk(tsk, shstk_ptr) -EINVAL + +#define save_user_shstk(tsk, saved_shstk_ptr) -EINVAL + +#define get_active_shstk(task) 0UL + +#endif /* CONFIG_RISCV_USER_CFI */ + +bool is_user_shstk_enabled(void); +bool is_user_lpad_enabled(void); + +#endif /* __ASSEMBLER__ */ + +#endif /* _ASM_RISCV_USERCFI_H */ diff --git a/arch/riscv/include/asm/vdso.h b/arch/riscv/include/asm/vdso.h index f891478829a52c41e06240f67611694cc28197d9..aa6355e18b60f8314ce2e840dec5009926082553 100644 --- a/arch/riscv/include/asm/vdso.h +++ b/arch/riscv/include/asm/vdso.h @@ -18,9 +18,19 @@ #ifndef __ASSEMBLY__ #include +#ifdef CONFIG_RISCV_USER_CFI +#include +#endif +#ifdef CONFIG_RISCV_USER_CFI #define VDSO_SYMBOL(base, name) \ - (void __user *)((unsigned long)(base) + __vdso_##name##_offset) + (riscv_has_extension_unlikely(RISCV_ISA_EXT_ZIMOP) ? 
\ + (void __user *)((unsigned long)(base) + __vdso_##name##_cfi_offset) : \ + (void __user *)((unsigned long)(base) + __vdso_##name##_offset)) +#else +#define VDSO_SYMBOL(base, name) \ + ((void __user *)((unsigned long)(base) + __vdso_##name##_offset)) +#endif #ifdef CONFIG_COMPAT #include @@ -33,6 +43,7 @@ extern char compat_vdso_start[], compat_vdso_end[]; #endif /* CONFIG_COMPAT */ extern char vdso_start[], vdso_end[]; +extern char vdso_cfi_start[], vdso_cfi_end[]; #endif /* !__ASSEMBLY__ */ diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h index be7d309cca8a78d3963ae42d4b55fda89b8ab9dc..2d2ec6ca3abb90a5e83607b0615bcc8ad009bf66 100644 --- a/arch/riscv/include/asm/vector.h +++ b/arch/riscv/include/asm/vector.h @@ -281,6 +281,9 @@ static inline bool riscv_v_vstate_ctrl_user_allowed(void) { return false; } #define riscv_v_thread_free(tsk) do {} while (0) #define riscv_v_setup_ctx_cache() do {} while (0) #define riscv_v_thread_alloc(tsk) do {} while (0) +#define get_cpu_vector_context() do {} while (0) +#define put_cpu_vector_context() do {} while (0) +#define riscv_v_vstate_set_restore(task, regs) do {} while (0) #endif /* CONFIG_RISCV_ISA_V */ diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h index 86988a0aa0fc82627494846bee4f752e3f35eec3..0c6993527249d4bd03004e789364a1e6d415fe43 100644 --- a/arch/riscv/include/uapi/asm/hwprobe.h +++ b/arch/riscv/include/uapi/asm/hwprobe.h @@ -81,6 +81,12 @@ struct riscv_hwprobe { #define RISCV_HWPROBE_EXT_ZAAMO (1ULL << 56) #define RISCV_HWPROBE_EXT_ZALRSC (1ULL << 57) #define RISCV_HWPROBE_EXT_ZABHA (1ULL << 58) +#define RISCV_HWPROBE_EXT_ZALASR (1ULL << 59) +#define RISCV_HWPROBE_EXT_ZICBOP (1ULL << 60) +#define RISCV_HWPROBE_EXT_ZILSD (1ULL << 61) +#define RISCV_HWPROBE_EXT_ZCLSD (1ULL << 62) +#define RISCV_HWPROBE_EXT_ZICFILP (1ULL << 63) + #define RISCV_HWPROBE_KEY_CPUPERF_0 5 #define RISCV_HWPROBE_MISALIGNED_UNKNOWN (0 << 0) #define RISCV_HWPROBE_MISALIGNED_EMULATED (1 << 0) @@ -89,6 +95,27 @@ struct riscv_hwprobe { #define RISCV_HWPROBE_MISALIGNED_UNSUPPORTED (4 << 0) #define RISCV_HWPROBE_MISALIGNED_MASK (7 << 0) #define RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE 6 +#define RISCV_HWPROBE_KEY_HIGHEST_VIRT_ADDRESS 7 +#define RISCV_HWPROBE_KEY_TIME_CSR_FREQ 8 +#define RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF 9 +#define RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN 0 +#define RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED 1 +#define RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW 2 +#define RISCV_HWPROBE_MISALIGNED_SCALAR_FAST 3 +#define RISCV_HWPROBE_MISALIGNED_SCALAR_UNSUPPORTED 4 +#define RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF 10 +#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN 0 +#define RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW 2 +#define RISCV_HWPROBE_MISALIGNED_VECTOR_FAST 3 +#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED 4 +#define RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 11 +#define RISCV_HWPROBE_KEY_ZICBOM_BLOCK_SIZE 12 +#define RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0 13 +#define RISCV_HWPROBE_KEY_VENDOR_EXT_MIPS_0 14 +#define RISCV_HWPROBE_KEY_ZICBOP_BLOCK_SIZE 15 +#define RISCV_HWPROBE_KEY_IMA_EXT_1 16 +#define RISCV_HWPROBE_EXT_ZICFISS (1ULL << 0) + /* Increase RISCV_HWPROBE_MAX_KEY when adding items. 
 */

 /* Flags */
diff --git a/arch/riscv/include/uapi/asm/ptrace.h b/arch/riscv/include/uapi/asm/ptrace.h
index a38268b19c3d3d5b3e837634e05a7774fc4287d4..69045f31941d7f0dea8762d9638c99b818e9a99c 100644
--- a/arch/riscv/include/uapi/asm/ptrace.h
+++ b/arch/riscv/include/uapi/asm/ptrace.h
@@ -127,6 +127,40 @@ struct __riscv_v_regset_state {
  */
 #define RISCV_MAX_VLENB (8192)

+struct __sc_riscv_cfi_state {
+	unsigned long ss_ptr;	/* shadow stack pointer */
+};
+
+#define PTRACE_CFI_LP_EN_BIT		0
+#define PTRACE_CFI_LP_LOCK_BIT		1
+#define PTRACE_CFI_ELP_BIT		2
+#define PTRACE_CFI_SS_EN_BIT		3
+#define PTRACE_CFI_SS_LOCK_BIT		4
+#define PTRACE_CFI_SS_PTR_BIT		5
+
+#define PTRACE_CFI_LP_EN_STATE		BIT(PTRACE_CFI_LP_EN_BIT)
+#define PTRACE_CFI_LP_LOCK_STATE	BIT(PTRACE_CFI_LP_LOCK_BIT)
+#define PTRACE_CFI_ELP_STATE		BIT(PTRACE_CFI_ELP_BIT)
+#define PTRACE_CFI_SS_EN_STATE		BIT(PTRACE_CFI_SS_EN_BIT)
+#define PTRACE_CFI_SS_LOCK_STATE	BIT(PTRACE_CFI_SS_LOCK_BIT)
+#define PTRACE_CFI_SS_PTR_STATE		BIT(PTRACE_CFI_SS_PTR_BIT)
+
+#define PTRACE_CFI_STATE_INVALID_MASK	~(PTRACE_CFI_LP_EN_STATE |	\
+					  PTRACE_CFI_LP_LOCK_STATE |	\
+					  PTRACE_CFI_ELP_STATE |	\
+					  PTRACE_CFI_SS_EN_STATE |	\
+					  PTRACE_CFI_SS_LOCK_STATE |	\
+					  PTRACE_CFI_SS_PTR_STATE)
+
+struct __cfi_status {
+	__u64 cfi_state;
+};
+
+struct user_cfi_state {
+	struct __cfi_status cfi_status;
+	__u64 shstk_ptr;
+};
+
 #endif /* __ASSEMBLY__ */

 #endif /* _UAPI_ASM_RISCV_PTRACE_H */
diff --git a/arch/riscv/include/uapi/asm/sigcontext.h b/arch/riscv/include/uapi/asm/sigcontext.h
index cd4f175dc8376327dd3efa6e5d6d3ccf00c9e945..f37e4beffe036af7faf5fead7f993bf4aa4ca2fd 100644
--- a/arch/riscv/include/uapi/asm/sigcontext.h
+++ b/arch/riscv/include/uapi/asm/sigcontext.h
@@ -10,6 +10,7 @@

 /* The Magic number for signal context frame header. */
 #define RISCV_V_MAGIC		0x53465457
+#define RISCV_ZICFISS_MAGIC	0x9487
 #define END_MAGIC		0x0

 /* The size of END signal context header.
*/ diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile index 6f45a3890eb58819f74978e48f87e2f83d8d1a34..0b1e3c78818e016a88bdb3ab5f0686cb1f30ac07 100644 --- a/arch/riscv/kernel/Makefile +++ b/arch/riscv/kernel/Makefile @@ -62,6 +62,7 @@ obj-y += patch.o obj-y += probes/ obj-y += tests/ obj-$(CONFIG_MMU) += vdso.o vdso/ +obj-$(CONFIG_RISCV_USER_CFI) += vdso_cfi/ obj-$(CONFIG_RISCV_M_MODE) += traps_misaligned.o obj-$(CONFIG_FPU) += fpu.o @@ -109,3 +110,6 @@ obj-$(CONFIG_COMPAT) += compat_vdso/ obj-$(CONFIG_64BIT) += pi/ obj-$(CONFIG_ACPI) += acpi.o obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o + +obj-$(CONFIG_GENERIC_CPU_VULNERABILITIES) += bugs.o +obj-$(CONFIG_RISCV_USER_CFI) += usercfi.o diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c index bfa99b5ac897b1afec57689c689f5457271cb620..50a4da2ddcd3850581b36d27190fc925850889c7 100644 --- a/arch/riscv/kernel/asm-offsets.c +++ b/arch/riscv/kernel/asm-offsets.c @@ -47,6 +47,10 @@ void asm_offsets(void) #endif OFFSET(TASK_TI_CPU_NUM, task_struct, thread_info.cpu); +#ifdef CONFIG_RISCV_USER_CFI + OFFSET(TASK_TI_CFI_STATE, task_struct, thread_info.user_cfi_state); + OFFSET(TASK_TI_USER_SSP, task_struct, thread_info.user_cfi_state.user_shdw_stk); +#endif OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]); OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]); OFFSET(TASK_THREAD_F2, task_struct, thread.fstate.f[2]); @@ -500,4 +504,10 @@ void asm_offsets(void) #define ASM_MAX_CPUS NR_CPUS DEFINE(ASM_NR_CPUS, ASM_MAX_CPUS); #endif +#ifdef CONFIG_RISCV_SBI + DEFINE(SBI_EXT_FWFT, SBI_EXT_FWFT); + DEFINE(SBI_EXT_FWFT_SET, SBI_EXT_FWFT_SET); + DEFINE(SBI_FWFT_SHADOW_STACK, SBI_FWFT_SHADOW_STACK); + DEFINE(SBI_FWFT_SET_FLAG_LOCK, SBI_FWFT_SET_FLAG_LOCK); +#endif } diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 6c4294ea5a7b61d5f3d781a7572cccafc93c22e4..9d590cff762b19be041c27727d496f2d47d28905 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -23,6 +23,7 @@ #include #include #include +#include #include "copy-unaligned.h" @@ -227,6 +228,26 @@ static int riscv_ext_svadu_validate(const struct riscv_isa_ext_data *data, return 0; } +static int riscv_cfilp_validate(const struct riscv_isa_ext_data *data, + const unsigned long *isa_bitmap) +{ + if (!IS_ENABLED(CONFIG_RISCV_USER_CFI) || + (riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_FCFI)) + return -EINVAL; + + return 0; +} + +static int riscv_cfiss_validate(const struct riscv_isa_ext_data *data, + const unsigned long *isa_bitmap) +{ + if (!IS_ENABLED(CONFIG_RISCV_USER_CFI) || + (riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_BCFI)) + return -EINVAL; + + return 0; +} + static const unsigned int riscv_a_exts[] = { RISCV_ISA_EXT_ZAAMO, RISCV_ISA_EXT_ZALRSC, @@ -420,6 +441,10 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_SUPERSET_VALIDATE(zicboz, RISCV_ISA_EXT_ZICBOZ, riscv_xlinuxenvcfg_exts, riscv_ext_zicboz_validate), __RISCV_ISA_EXT_DATA(ziccrse, RISCV_ISA_EXT_ZICCRSE), + __RISCV_ISA_EXT_SUPERSET_VALIDATE(zicfilp, RISCV_ISA_EXT_ZICFILP, riscv_xlinuxenvcfg_exts, + riscv_cfilp_validate), + __RISCV_ISA_EXT_SUPERSET_VALIDATE(zicfiss, RISCV_ISA_EXT_ZICFISS, riscv_xlinuxenvcfg_exts, + riscv_cfiss_validate), __RISCV_ISA_EXT_DATA(zicntr, RISCV_ISA_EXT_ZICNTR), __RISCV_ISA_EXT_DATA(zicond, RISCV_ISA_EXT_ZICOND), __RISCV_ISA_EXT_DATA(zicsr, RISCV_ISA_EXT_ZICSR), diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 
7167ee40af4cc2a6f7ac9254482b9f4bff4b46b8..ac6d59e650c959751e9da6257426a772e8c02135 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -90,6 +90,35 @@ _new_vmalloc_restore_context_a0:
	REG_L a0, TASK_TI_A0(tp)
 .endm

+/*
+ * If previous mode was U, capture shadow stack pointer and save it away.
+ * Zero CSR_SSP at the same time for sanitization.
+ */
+.macro save_userssp tmp, status
+	ALTERNATIVE("nops(4)",
+		    __stringify(			\
+		    andi \tmp, \status, SR_SPP;		\
+		    bnez \tmp, skip_ssp_save;		\
+		    csrrw \tmp, CSR_SSP, x0;		\
+		    REG_S \tmp, TASK_TI_USER_SSP(tp);	\
+		    skip_ssp_save:),
+		    0,
+		    RISCV_ISA_EXT_ZICFISS,
+		    CONFIG_RISCV_USER_CFI)
+.endm
+
+.macro restore_userssp tmp, status
+	ALTERNATIVE("nops(4)",
+		    __stringify(			\
+		    andi \tmp, \status, SR_SPP;		\
+		    bnez \tmp, skip_ssp_restore;	\
+		    REG_L \tmp, TASK_TI_USER_SSP(tp);	\
+		    csrw CSR_SSP, \tmp;			\
+		    skip_ssp_restore:),
+		    0,
+		    RISCV_ISA_EXT_ZICFISS,
+		    CONFIG_RISCV_USER_CFI)
+.endm

 SYM_CODE_START(handle_exception)
	/*
@@ -143,9 +172,14 @@ SYM_CODE_START(handle_exception)
	 * or vector in kernel space.
	 */
	li t0, SR_SUM | SR_FS_VS
+#ifdef CONFIG_64BIT
+	li t1, SR_ELP
+	or t0, t0, t1
+#endif

	REG_L s0, TASK_TI_USER_SP(tp)
	csrrc s1, CSR_STATUS, t0
+	save_userssp s2, s1
	csrr s2, CSR_EPC
	csrr s3, CSR_TVAL
	csrr s4, CSR_CAUSE
@@ -230,6 +264,7 @@ SYM_CODE_START_NOALIGN(ret_from_exception)
	call riscv_v_context_nesting_end
 #endif
	REG_L a0, PT_STATUS(sp)
+	restore_userssp s3, a0

	/*
	 * The current load reservation is effectively part of the processor's
	 * state, in the sense that load reservations cannot be shared between
@@ -392,6 +427,9 @@ SYM_DATA_START_LOCAL(excp_vect_table)
	RISCV_PTR do_page_fault   /* load page fault */
	RISCV_PTR do_trap_unknown
	RISCV_PTR do_page_fault   /* store page fault */
+	RISCV_PTR do_trap_unknown /* cause=16 */
+	RISCV_PTR do_trap_unknown /* cause=17 */
+	RISCV_PTR do_trap_software_check /* cause=18 is sw check exception */
 SYM_DATA_END_LABEL(excp_vect_table, SYM_L_LOCAL, excp_vect_table_end)

 #ifndef CONFIG_MMU
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 756bf0b68c3c1f49ffe72841c98db8df25240913..d74031e0ddc4d3b321c19f77ba2968a70162d91a 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include

 #include "efi-header.S"

 __HEAD
@@ -176,6 +177,19 @@ secondary_start_sbi:
	call relocate_enable_mmu
 #endif
	call .Lsetup_trap_vector
+#if defined(CONFIG_RISCV_SBI) && defined(CONFIG_RISCV_USER_CFI)
+	li a7, SBI_EXT_FWFT
+	li a6, SBI_EXT_FWFT_SET
+	li a0, SBI_FWFT_SHADOW_STACK
+	li a1, 1 /* enable supervisor access to shadow stack */
+	li a2, SBI_FWFT_SET_FLAG_LOCK
+	ecall
+	beqz a0, 1f
+	la a1, riscv_nousercfi
+	li a0, CMDLINE_DISABLE_RISCV_USERCFI_BCFI
+	REG_S a0, (a1)
+1:
+#endif
	call smp_callin
 #endif /* CONFIG_SMP */

@@ -335,6 +349,19 @@ SYM_CODE_START(_start_kernel)
	la tp, init_task
	la sp, init_thread_union + THREAD_SIZE
	addi sp, sp, -PT_SIZE_ON_STACK
+#if defined(CONFIG_RISCV_SBI) && defined(CONFIG_RISCV_USER_CFI)
+	li a7, SBI_EXT_FWFT
+	li a6, SBI_EXT_FWFT_SET
+	li a0, SBI_FWFT_SHADOW_STACK
+	li a1, 1 /* enable supervisor access to shadow stack */
+	li a2, SBI_FWFT_SET_FLAG_LOCK
+	ecall
+	beqz a0, 1f
+	la a1, riscv_nousercfi
+	li a0, CMDLINE_DISABLE_RISCV_USERCFI_BCFI
+	REG_S a0, (a1)
+1:
+#endif

 #ifdef CONFIG_KASAN
	call kasan_early_init
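For reference, the open-coded ecall sequence above performs the equivalent of
the following C-level call via the kernel's sbi_ecall() helper (a sketch; the
boot path open-codes it because it runs in early assembly)::

	struct sbiret ret;

	/* ask firmware to enable shadow stack support and lock the setting */
	ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_SET,
			SBI_FWFT_SHADOW_STACK, 1, SBI_FWFT_SET_FLAG_LOCK,
			0, 0, 0);
	if (ret.error)
		/* no firmware support: disable backward-edge CFI for userspace */
		riscv_nousercfi = CMDLINE_DISABLE_RISCV_USERCFI_BCFI;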
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 4b76018b02a285721a4a913f39d0325638aa35ef..3890472bbedb95352ace02a2b50d4da817502643 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -26,6 +26,9 @@
 #include
 #include
 #include
+#include
+#include
+#include

 #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
 #include
@@ -69,8 +72,8 @@ void __show_regs(struct pt_regs *regs)
		regs->s8, regs->s9, regs->s10);
	pr_cont(" s11: " REG_FMT " t3 : " REG_FMT " t4 : " REG_FMT "\n",
		regs->s11, regs->t3, regs->t4);
-	pr_cont(" t5 : " REG_FMT " t6 : " REG_FMT "\n",
-		regs->t5, regs->t6);
+	pr_cont(" t5 : " REG_FMT " t6 : " REG_FMT " ssp : " REG_FMT "\n",
+		regs->t5, regs->t6, get_active_shstk(current));

	pr_cont("status: " REG_FMT " badaddr: " REG_FMT " cause: " REG_FMT "\n",
		regs->status, regs->badaddr, regs->cause);
@@ -125,6 +128,19 @@ void start_thread(struct pt_regs *regs, unsigned long pc,
	regs->epc = pc;
	regs->sp = sp;

+	/*
+	 * Clear shadow stack state on exec.
+	 * libc will set it up later via prctl.
+	 */
+	set_shstk_status(current, false);
+	set_shstk_base(current, 0, 0);
+	set_active_shstk(current, 0);
+	/*
+	 * Disable indirect branch tracking on exec.
+	 * libc will enable it later via prctl.
+	 */
+	set_indir_lp_status(current, false);
+
 #ifdef CONFIG_64BIT
	regs->status &= ~SR_UXL;

@@ -184,6 +200,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
	unsigned long clone_flags = args->flags;
	unsigned long usp = args->stack;
	unsigned long tls = args->tls;
+	unsigned long ssp = 0;
	struct pt_regs *childregs = task_pt_regs(p);

	/* Ensure all threads in this mm have the same pointer masking mode. */
@@ -202,11 +219,19 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
		p->thread.s[0] = (unsigned long)args->fn;
		p->thread.s[1] = (unsigned long)args->fn_arg;
	} else {
+		/*
+		 * Allocate a new shadow stack if needed. In the CLONE_VM case
+		 * we have to, since the child shares the address space but
+		 * needs its own active shadow stack.
+		 */
+		ssp = shstk_alloc_thread_stack(p, args);
+		if (IS_ERR_VALUE(ssp))
+			return PTR_ERR((void *)ssp);
+
		*childregs = *(current_pt_regs());
		/* Turn off status.VS */
		riscv_v_vstate_off(childregs);
		if (usp) /* User fork */
			childregs->sp = usp;
+		/* if needed, set new ssp */
+		if (ssp)
+			set_active_shstk(p, ssp);
		if (clone_flags & CLONE_SETTLS)
			childregs->tp = tls;
		childregs->a0 = 0; /* Return value of fork() */
diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
index 6e3caafef03620225d2fa1b4d85e77507615c7ee..baeb2093529591a1e812ac8fa5fd8bf0a59d8865 100644
--- a/arch/riscv/kernel/ptrace.c
+++ b/arch/riscv/kernel/ptrace.c
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include

 enum riscv_regset {
	REGSET_X,
@@ -31,6 +32,9 @@ enum riscv_regset {
 #ifdef CONFIG_RISCV_ISA_SUPM
	REGSET_TAGGED_ADDR_CTRL,
 #endif
+#ifdef CONFIG_RISCV_USER_CFI
+	REGSET_CFI,
+#endif
 };

 static int riscv_gpr_get(struct task_struct *target,
@@ -184,7 +188,88 @@ static int tagged_addr_ctrl_set(struct task_struct *target,
 }
 #endif

-static const struct user_regset riscv_user_regset[] = {
+#ifdef CONFIG_RISCV_USER_CFI
+static int riscv_cfi_get(struct task_struct *target,
+			 const struct user_regset *regset,
+			 struct membuf to)
+{
+	struct user_cfi_state user_cfi;
+	struct pt_regs *regs;
+
+	memset(&user_cfi, 0, sizeof(user_cfi));
+	regs = task_pt_regs(target);
+
+	if (is_indir_lp_enabled(target)) {
+		user_cfi.cfi_status.cfi_state |= PTRACE_CFI_LP_EN_STATE;
+		user_cfi.cfi_status.cfi_state |= is_indir_lp_locked(target) ?
+						 PTRACE_CFI_LP_LOCK_STATE : 0;
+		user_cfi.cfi_status.cfi_state |= (regs->status & SR_ELP) ?
+ PTRACE_CFI_ELP_STATE : 0;
+ }
+
+ if (is_shstk_enabled(target)) {
+ user_cfi.cfi_status.cfi_state |= (PTRACE_CFI_SS_EN_STATE |
+ PTRACE_CFI_SS_PTR_STATE);
+ user_cfi.cfi_status.cfi_state |= is_shstk_locked(target) ?
+ PTRACE_CFI_SS_LOCK_STATE : 0;
+ user_cfi.shstk_ptr = get_active_shstk(target);
+ }
+
+ return membuf_write(&to, &user_cfi, sizeof(user_cfi));
+}
+
+/*
+ * Does it make sense to allow enabling / disabling CFI via ptrace?
+ * We don't allow enable / disable / lock control via ptrace for now.
+ * Setting the shadow stack pointer is allowed; GDB might use it to unwind
+ * or for some other fixup. Similarly, GDB might want to suppress a pending
+ * ELP and reset the ELP state.
+ */
+static int riscv_cfi_set(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
+{
+ int ret;
+ struct user_cfi_state user_cfi;
+ struct pt_regs *regs;
+
+ regs = task_pt_regs(target);
+
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &user_cfi, 0, -1);
+ if (ret)
+ return ret;
+
+ /*
+ * Enabling or locking the shadow stack or landing pads is not allowed,
+ * and neither is disabling them via ptrace. The reserved field must be
+ * set to zero so that those bits remain usable in the future.
+ */
+ if ((user_cfi.cfi_status.cfi_state &
+ (PTRACE_CFI_LP_EN_STATE | PTRACE_CFI_LP_LOCK_STATE |
+ PTRACE_CFI_SS_EN_STATE | PTRACE_CFI_SS_LOCK_STATE)) ||
+ (user_cfi.cfi_status.cfi_state & PTRACE_CFI_STATE_INVALID_MASK))
+ return -EINVAL;
+
+ /* If lpad is enabled on target and ptrace requests to set / clear elp, do that */
+ if (is_indir_lp_enabled(target)) {
+ if (user_cfi.cfi_status.cfi_state &
+ PTRACE_CFI_ELP_STATE) /* set elp state */
+ regs->status |= SR_ELP;
+ else
+ regs->status &= ~SR_ELP; /* clear elp state */
+ }
+
+ /* If shadow stack is enabled on target, set the new shadow stack pointer */
+ if (is_shstk_enabled(target) &&
+ (user_cfi.cfi_status.cfi_state & PTRACE_CFI_SS_PTR_STATE))
+ set_active_shstk(target, user_cfi.shstk_ptr);
+
+ return 0;
+}
+#endif
+
+static struct user_regset riscv_user_regset[] __ro_after_init = { [REGSET_X] = { .core_note_type = NT_PRSTATUS, .n = ELF_NGREG, @@ -224,6 +309,16 @@ static const struct user_regset riscv_user_regset[] = { .set = tagged_addr_ctrl_set, }, #endif
+#ifdef CONFIG_RISCV_USER_CFI
+ [REGSET_CFI] = {
+ .core_note_type = NT_RISCV_USER_CFI,
+ .align = sizeof(__u64),
+ .n = sizeof(struct user_cfi_state) / sizeof(__u64),
+ .size = sizeof(__u64),
+ .regset_get = riscv_cfi_get,
+ .set = riscv_cfi_set,
+ },
+#endif }; static const struct user_regset_view riscv_user_native_view = { diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c index 0e3bdd520617ea52e06a2e9a68e4b24593d83d1b..2c21fa849851a861833e4441171c54626c2449cd 100644 --- a/arch/riscv/kernel/signal.c +++ b/arch/riscv/kernel/signal.c @@ -22,11 +22,13 @@ #include #include #include +#include unsigned long signal_minsigstksz __ro_after_init; extern u32 __user_rt_sigreturn[2]; static size_t riscv_v_sc_size __ro_after_init; +static size_t riscv_zicfiss_sc_size __ro_after_init; #define DEBUG_SIG 0 @@ -68,18 +70,19 @@ static long save_fp_state(struct pt_regs *regs, #define restore_fp_state(task, regs) (0) #endif -#ifdef CONFIG_RISCV_ISA_V - -static long save_v_state(struct pt_regs *regs, void __user **sc_vec) +static long save_v_state(struct pt_regs *regs, void __user *sc_vec) { - struct __riscv_ctx_hdr __user *hdr; struct __sc_riscv_v_state __user *state; void __user *datap; long err; - hdr = *sc_vec;
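For illustration, a debugger-side sketch of reading the new regset (NT_RISCV_USER_CFI = 0x903 per this series; the struct below is a simplified local mirror of the uapi layout and is an assumption, not the authoritative definition):

    #include <elf.h>
    #include <errno.h>
    #include <stdint.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/uio.h>

    #ifndef NT_RISCV_USER_CFI
    #define NT_RISCV_USER_CFI 0x903
    #endif

    /* Simplified mirror of the kernel's user_cfi_state; assumption:
     * one status word followed by the shadow stack pointer. */
    struct user_cfi_state {
            uint64_t cfi_state;
            uint64_t shstk_ptr;
    };

    /* Read the CFI regset of a stopped tracee. */
    static int read_cfi_regset(pid_t pid, struct user_cfi_state *st)
    {
            struct iovec iov = { .iov_base = st, .iov_len = sizeof(*st) };

            if (ptrace(PTRACE_GETREGSET, pid, (void *)NT_RISCV_USER_CFI, &iov) == -1)
                    return -errno;
            return 0;
    }

The selftest added later in this series exercises exactly this PTRACE_GETREGSET/PTRACE_SETREGSET path.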
- /* Place state to the user's signal context space after the hdr */ - state = (struct __sc_riscv_v_state __user *)(hdr + 1); + if (!IS_ENABLED(CONFIG_RISCV_ISA_V) || + !(has_vector() && + riscv_v_vstate_query(regs))) + return 0; + + /* Place state to the user's signal context space */ + state = (struct __sc_riscv_v_state __user *)sc_vec; /* Point datap right after the end of __sc_riscv_v_state */ datap = state + 1; @@ -97,15 +100,11 @@ static long save_v_state(struct pt_regs *regs, void __user **sc_vec) err |= __put_user(datap, &state->v_state.datap); /* Copy the whole vector content to user space datap. */ err |= __copy_to_user(datap, current->thread.vstate.datap, riscv_v_vsize); - /* Copy magic to the user space after saving all vector conetext */ - err |= __put_user(RISCV_V_MAGIC, &hdr->magic); - err |= __put_user(riscv_v_sc_size, &hdr->size); if (unlikely(err)) - return err; + return -EFAULT; - /* Only progress the sv_vec if everything has done successfully */ - *sc_vec += riscv_v_sc_size; - return 0; + /* Only return the size if everything has done successfully */ + return riscv_v_sc_size; } /* @@ -142,10 +141,80 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec) */ return copy_from_user(current->thread.vstate.datap, datap, riscv_v_vsize); } -#else -#define save_v_state(task, regs) (0) -#define __restore_v_state(task, regs) (0) -#endif + +static long save_cfiss_state(struct pt_regs *regs, void __user *sc_cfi) +{ + struct __sc_riscv_cfi_state __user *state = sc_cfi; + unsigned long ss_ptr = 0; + long err = 0; + + if (!is_shstk_enabled(current)) + return 0; + + /* + * Save a pointer to the shadow stack itself on shadow stack as a form of token. + * A token on the shadow stack gives the following properties: + * - Safe save and restore for shadow stack switching. Any save of a shadow stack + * must have saved a token on the shadow stack. Similarly any restore of shadow + * stack must check the token before restore. Since writing to the shadow stack with + * address of the shadow stack itself is not easily allowed, a restore without a save + * is quite difficult for an attacker to perform. + * - A natural break. A token in shadow stack provides a natural break in shadow stack + * So a single linear range can be bucketed into different shadow stack segments. Any + * sspopchk will detect the condition and fault to kernel as a sw check exception. + */ + err |= save_user_shstk(current, &ss_ptr); + err |= __put_user(ss_ptr, &state->ss_ptr); + if (unlikely(err)) + return -EFAULT; + + return riscv_zicfiss_sc_size; +} + +static long __restore_cfiss_state(struct pt_regs *regs, void __user *sc_cfi) +{ + struct __sc_riscv_cfi_state __user *state = sc_cfi; + unsigned long ss_ptr = 0; + long err; + + /* + * Restore shadow stack as a form of token stored on the shadow stack itself as a safe + * way to restore. + * A token on the shadow stack gives the following properties: + * - Safe save and restore for shadow stack switching. Any save of shadow stack + * must have saved a token on shadow stack. Similarly any restore of shadow + * stack must check the token before restore. Since writing to a shadow stack with + * the address of shadow stack itself is not easily allowed, a restore without a save + * is quite difficult for an attacker to perform. + * - A natural break. A token in the shadow stack provides a natural break in shadow stack + * So a single linear range can be bucketed into different shadow stack segments. 
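The invariant behind the restore token is compact enough to state in code; an illustrative sketch mirroring what create_rstor_token() writes and what restore_user_shstk() checks:

    #include <stdint.h>

    #define SHSTK_ENTRY_SIZE sizeof(uintptr_t)

    /* A restore token at 'loc' stores the SSP value that was active when
     * the token was pushed, i.e. the address one entry above 'loc'. */
    static inline int token_is_valid(uintptr_t loc, uintptr_t token)
    {
            return token - loc == SHSTK_ENTRY_SIZE;
    }

Because ordinary code can't easily write an arbitrary shadow stack address onto a shadow stack, forging such a token is hard, which is what makes the save/restore pair safe.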
+ * sspopchk will detect the condition and fault to kernel as a sw check exception. + */ + err = __copy_from_user(&ss_ptr, &state->ss_ptr, sizeof(unsigned long)); + + if (unlikely(err)) + return err; + + return restore_user_shstk(current, ss_ptr); +} + +struct arch_ext_priv { + __u32 magic; + long (*save)(struct pt_regs *regs, void __user *sc_vec); +}; + +static struct arch_ext_priv arch_ext_list[] = { + { + .magic = RISCV_V_MAGIC, + .save = &save_v_state, + }, + { + .magic = RISCV_ZICFISS_MAGIC, + .save = &save_cfiss_state, + }, +}; + +static const size_t nr_arch_exts = ARRAY_SIZE(arch_ext_list); static long restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc) @@ -195,6 +264,12 @@ static long restore_sigcontext(struct pt_regs *regs, err = __restore_v_state(regs, sc_ext_ptr); break; + case RISCV_ZICFISS_MAGIC: + if (!is_shstk_enabled(current) || size != riscv_zicfiss_sc_size) + return -EINVAL; + + err = __restore_cfiss_state(regs, sc_ext_ptr); + break; default: return -EINVAL; } @@ -216,6 +291,16 @@ static size_t get_rt_frame_size(bool cal_all) total_context_size += riscv_v_sc_size; } + if (is_shstk_enabled(current)) + total_context_size += riscv_zicfiss_sc_size; + + /* + * Preserved a __riscv_ctx_hdr for END signal context header if an + * extension uses __riscv_extra_ext_header + */ + if (total_context_size) + total_context_size += sizeof(struct __riscv_ctx_hdr); + frame_size += total_context_size; frame_size = round_up(frame_size, 16); @@ -270,7 +355,8 @@ static long setup_sigcontext(struct rt_sigframe __user *frame, { struct sigcontext __user *sc = &frame->uc.uc_mcontext; struct __riscv_ctx_hdr __user *sc_ext_ptr = &sc->sc_extdesc.hdr; - long err; + struct arch_ext_priv *arch_ext; + long err, i, ext_size; /* sc_regs is structured the same as the start of pt_regs */ err = __copy_to_user(&sc->sc_regs, regs, sizeof(sc->sc_regs)); @@ -278,8 +364,20 @@ static long setup_sigcontext(struct rt_sigframe __user *frame, if (has_fpu()) err |= save_fp_state(regs, &sc->sc_fpregs); /* Save the vector state. */ - if (has_vector() && riscv_v_vstate_query(regs)) - err |= save_v_state(regs, (void __user **)&sc_ext_ptr); + for (i = 0; i < nr_arch_exts; i++) { + arch_ext = &arch_ext_list[i]; + if (!arch_ext->save) + continue; + + ext_size = arch_ext->save(regs, sc_ext_ptr + 1); + if (ext_size <= 0) { + err |= ext_size; + } else { + err |= __put_user(arch_ext->magic, &sc_ext_ptr->magic); + err |= __put_user(ext_size, &sc_ext_ptr->size); + sc_ext_ptr = (void *)sc_ext_ptr + ext_size; + } + } /* Write zero to fp-reserved space and check it on restore_sigcontext */ err |= __put_user(0, &sc->sc_extdesc.reserved); /* And put END __riscv_ctx_hdr at the end. */ @@ -339,6 +437,11 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set, #ifdef CONFIG_MMU regs->ra = (unsigned long)VDSO_SYMBOL( current->mm->context.vdso, rt_sigreturn); + + /* if bcfi is enabled x1 (ra) and x5 (t0) must match. not sure if we need this? */ + if (is_shstk_enabled(current)) + regs->t0 = regs->ra; + #else /* * For the nommu case we don't have a VDSO. Instead we push two @@ -467,6 +570,9 @@ void __init init_rt_signal_env(void) { riscv_v_sc_size = sizeof(struct __riscv_ctx_hdr) + sizeof(struct __sc_riscv_v_state) + riscv_v_vsize; + + riscv_zicfiss_sc_size = sizeof(struct __riscv_ctx_hdr) + + sizeof(struct __sc_riscv_cfi_state); /* * Determine the stack space required for guaranteed signal delivery. 
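With the arch_ext_list refactor above, wiring a future extension into the signal frame takes one save handler plus one table row. A hypothetical sketch (RISCV_FOO_MAGIC, save_foo_state, foo_state_present and riscv_foo_sc_size are invented names):

    /* Hypothetical: adding another extension's signal context. */
    static long save_foo_state(struct pt_regs *regs, void __user *sc_vec)
    {
            if (!foo_state_present(current))        /* invented predicate */
                    return 0;                       /* nothing to save */

            /* ... __put_user() the state starting at sc_vec; the
             * __riscv_ctx_hdr is written by the caller ... */

            return riscv_foo_sc_size;       /* bytes consumed, incl. header */
    }

    static struct arch_ext_priv arch_ext_list[] = {
            { .magic = RISCV_V_MAGIC,       .save = &save_v_state },
            { .magic = RISCV_ZICFISS_MAGIC, .save = &save_cfiss_state },
            { .magic = RISCV_FOO_MAGIC,     .save = &save_foo_state },  /* new */
    };

A matching case in restore_sigcontext() would validate the magic and size before restoring, as the RISCV_ZICFISS_MAGIC case does.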
* The signal_minsigstksz will be populated into the AT_MINSIGSTKSZ entry diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c index 7a7248a906a41750f74f5cd0d7e262dd6a4ba1ba..7cfe957381ccb36284703ea22d30350a03623206 100644 --- a/arch/riscv/kernel/sys_hwprobe.c +++ b/arch/riscv/kernel/sys_hwprobe.c @@ -121,6 +121,7 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair, EXT_KEY(ZAAMO); EXT_KEY(ZALRSC); EXT_KEY(ZABHA); + EXT_KEY(ZICFILP); /* * All the following extensions must depend on the kernel @@ -166,6 +167,43 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair, pair->value &= ~missing; } +static void hwprobe_isa_ext1(struct riscv_hwprobe *pair, + const struct cpumask *cpus) +{ + int cpu; + u64 missing = 0; + + pair->value = 0; + + /* + * Loop through and record extensions that 1) anyone has, and 2) anyone + * doesn't have. + */ + for_each_cpu(cpu, cpus) { + struct riscv_isainfo *isainfo = &hart_isa[cpu]; + +#define EXT_KEY(ext) \ + do { \ + if (__riscv_isa_extension_available(isainfo->isa, RISCV_ISA_EXT_##ext)) \ + pair->value |= RISCV_HWPROBE_EXT_##ext; \ + else \ + missing |= RISCV_HWPROBE_EXT_##ext; \ + } while (false) + + /* + * Only use EXT_KEY() for extensions which can be + * exposed to userspace, regardless of the kernel's + * configuration, as no other checks, besides presence + * in the hart_isa bitmap, are made. + */ + EXT_KEY(ZICFISS); +#undef EXT_KEY + } + + /* Now turn off reporting features if any CPU is missing it. */ + pair->value &= ~missing; +} + static bool hwprobe_ext0_has(const struct cpumask *cpus, unsigned long ext) { struct riscv_hwprobe pair; diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c index f1c1416a9f1e51d320e57a92e75fb2219695e00a..061dab55d491b548a945845ca6fa2cf47e9f23b6 100644 --- a/arch/riscv/kernel/sys_riscv.c +++ b/arch/riscv/kernel/sys_riscv.c @@ -17,6 +17,15 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len, if (unlikely(offset & (~PAGE_MASK >> page_shift_offset))) return -EINVAL; + /* + * If PROT_WRITE is specified then extend that to PROT_READ + * protection_map[VM_WRITE] is now going to select shadow stack encodings. + * So specifying PROT_WRITE actually should select protection_map [VM_WRITE | VM_READ] + * If user wants to create shadow stack then they should use `map_shadow_stack` syscall. + */ + if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ))) + prot |= PROT_READ; + return ksys_mmap_pgoff(addr, len, prot, flags, fd, offset >> (PAGE_SHIFT - page_shift_offset)); } diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 53c7de4878c227fb0a77177b32470ba3363303fe..3404320a9a711088e4255568febbac3f941464c2 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -332,6 +332,60 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs) } +#define CFI_TVAL_FCFI_CODE 2 +#define CFI_TVAL_BCFI_CODE 3 +/* handle cfi violations */ +bool handle_user_cfi_violation(struct pt_regs *regs) +{ + unsigned long tval = csr_read(CSR_TVAL); + bool is_fcfi = (tval == CFI_TVAL_FCFI_CODE && cpu_supports_indirect_br_lp_instr()); + bool is_bcfi = (tval == CFI_TVAL_BCFI_CODE && cpu_supports_shadow_stack()); + + /* + * Handle uprobe event first. The probe point can be a valid target + * of indirect jumps or calls, in this case, forward cfi violation + * will be triggered instead of breakpoint exception. Clear ELP flag + * on sstatus image as well to avoid recurring fault. 
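User space observes all three violation classes as SIGSEGV with si_code SEGV_CPERR, as delivered by do_trap_software_check() below; a minimal user-side handler sketch (the SEGV_CPERR fallback value is an assumption based on the current asm-generic siginfo uapi):

    #include <signal.h>
    #include <unistd.h>

    #ifndef SEGV_CPERR
    #define SEGV_CPERR 10   /* control protection fault */
    #endif

    static void segv_handler(int sig, siginfo_t *si, void *ucontext)
    {
            if (si->si_code == SEGV_CPERR) {
                    /* Missing lpad, label mismatch, or sspopchk mismatch. */
                    write(2, "CFI violation\n", 14);
                    _exit(1);
            }
            _exit(2);       /* some other SIGSEGV */
    }

    int main(void)
    {
            struct sigaction sa = { 0 };

            sa.sa_sigaction = segv_handler;
            sa.sa_flags = SA_SIGINFO;
            sigemptyset(&sa.sa_mask);
            sigaction(SIGSEGV, &sa, NULL);
            /* ... run code that may violate CFI ... */
            return 0;
    }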
+ */
+ if (is_fcfi && probe_breakpoint_handler(regs)) {
+ regs->status &= ~SR_ELP;
+ return true;
+ }
+
+ if (is_fcfi || is_bcfi) {
+ do_trap_error(regs, SIGSEGV, SEGV_CPERR, regs->epc,
+ "Oops - control flow violation");
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * The software check exception is defined by the RISC-V CFI spec. It is
+ * raised when:
+ * a) an indirect branch doesn't land on a 4 byte aligned PC or on an `lpad`
+ * instruction, or the `label` value encoded in the `lpad` instruction
+ * doesn't match the value set up in `x7`. The code reported in `xtval` is 2.
+ * b) the `sspopchk` instruction finds a mismatch between the top of the
+ * shadow stack (ssp) and x1/x5. The code reported in `xtval` is 3.
+ */
+asmlinkage __visible __trap_section void do_trap_software_check(struct pt_regs *regs)
+{
+ if (user_mode(regs)) {
+ irqentry_enter_from_user_mode(regs);
+
+ /* not a cfi violation, so merge into the unknown trap handler flow */
+ if (!handle_user_cfi_violation(regs))
+ do_trap_unknown(regs);
+
+ irqentry_exit_to_user_mode(regs);
+ } else {
+ /* a sw check exception coming from the kernel is a kernel bug */
+ die(regs, "Kernel BUG");
+ }
+}
+ #ifdef CONFIG_MMU asmlinkage __visible noinstr void do_page_fault(struct pt_regs *regs) { diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c new file mode 100644 index 0000000000000000000000000000000000000000..1adba746f164e52d657d28c7a24a023f7e5c74d3 --- /dev/null +++ b/arch/riscv/kernel/usercfi.c @@ -0,0 +1,542 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024 Rivos, Inc. + * Deepak Gupta + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +
+unsigned long riscv_nousercfi __read_mostly;
+
+#define SHSTK_ENTRY_SIZE sizeof(void *)
+
+bool is_shstk_enabled(struct task_struct *task)
+{
+ return task->thread_info.user_cfi_state.ubcfi_en;
+}
+
+bool is_shstk_allocated(struct task_struct *task)
+{
+ return task->thread_info.user_cfi_state.shdw_stk_base;
+}
+
+bool is_shstk_locked(struct task_struct *task)
+{
+ return task->thread_info.user_cfi_state.ubcfi_locked;
+}
+
+void set_shstk_base(struct task_struct *task, unsigned long shstk_addr, unsigned long size)
+{
+ task->thread_info.user_cfi_state.shdw_stk_base = shstk_addr;
+ task->thread_info.user_cfi_state.shdw_stk_size = size;
+}
+
+unsigned long get_shstk_base(struct task_struct *task, unsigned long *size)
+{
+ if (size)
+ *size = task->thread_info.user_cfi_state.shdw_stk_size;
+ return task->thread_info.user_cfi_state.shdw_stk_base;
+}
+
+void set_active_shstk(struct task_struct *task, unsigned long shstk_addr)
+{
+ task->thread_info.user_cfi_state.user_shdw_stk = shstk_addr;
+}
+
+unsigned long get_active_shstk(struct task_struct *task)
+{
+ return task->thread_info.user_cfi_state.user_shdw_stk;
+}
+
+void set_shstk_status(struct task_struct *task, bool enable)
+{
+ if (!is_user_shstk_enabled())
+ return;
+
+ task->thread_info.user_cfi_state.ubcfi_en = enable ?
1 : 0; + + if (enable) + task->thread.envcfg |= ENVCFG_SSE; + else + task->thread.envcfg &= ~ENVCFG_SSE; + + csr_write(CSR_ENVCFG, task->thread.envcfg); +} + +void set_shstk_lock(struct task_struct *task) +{ + task->thread_info.user_cfi_state.ubcfi_locked = 1; +} + +bool is_indir_lp_enabled(struct task_struct *task) +{ + return task->thread_info.user_cfi_state.ufcfi_en; +} + +bool is_indir_lp_locked(struct task_struct *task) +{ + return task->thread_info.user_cfi_state.ufcfi_locked; +} + +void set_indir_lp_status(struct task_struct *task, bool enable) +{ + if (!is_user_lpad_enabled()) + return; + + task->thread_info.user_cfi_state.ufcfi_en = enable ? 1 : 0; + + if (enable) + task->thread.envcfg |= ENVCFG_LPE; + else + task->thread.envcfg &= ~ENVCFG_LPE; + + csr_write(CSR_ENVCFG, task->thread.envcfg); +} + +void set_indir_lp_lock(struct task_struct *task) +{ + task->thread_info.user_cfi_state.ufcfi_locked = 1; +} +/* + * If size is 0, then to be compatible with regular stack we want it to be as big as + * regular stack. Else PAGE_ALIGN it and return back + */ +static unsigned long calc_shstk_size(unsigned long size) +{ + if (size) + return PAGE_ALIGN(size); + + return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G)); +} + +/* + * Writes on shadow stack can either be `sspush` or `ssamoswap`. `sspush` can happen + * implicitly on current shadow stack pointed to by CSR_SSP. `ssamoswap` takes pointer to + * shadow stack. To keep it simple, we plan to use `ssamoswap` to perform writes on shadow + * stack. + */ +static noinline unsigned long amo_user_shstk(unsigned long __user *addr, unsigned long val) +{ + /* + * Never expect -1 on shadow stack. Expect return addresses and zero + */ + unsigned long swap = -1; + + __enable_user_access(); + asm goto(".option push\n" + ".option arch, +zicfiss\n" + "1: ssamoswap.d %[swap], %[val], %[addr]\n" + _ASM_EXTABLE(1b, %l[fault]) + ".option pop\n" + : [swap] "=r" (swap), [addr] "+A" (*(__force unsigned long *)addr) + : [val] "r" (val) + : "memory" + : fault + ); + __disable_user_access(); + return swap; +fault: + __disable_user_access(); + return -1; +} + +/* + * Create a restore token on the shadow stack. A token is always XLEN wide + * and aligned to XLEN. + */ +static int create_rstor_token(unsigned long ssp, unsigned long *token_addr) +{ + unsigned long addr; + + /* Token must be aligned */ + if (!IS_ALIGNED(ssp, SHSTK_ENTRY_SIZE)) + return -EINVAL; + + /* On RISC-V we're constructing token to be function of address itself */ + addr = ssp - SHSTK_ENTRY_SIZE; + + if (amo_user_shstk((unsigned long __user *)addr, (unsigned long)ssp) == -1) + return -EFAULT; + + if (token_addr) + *token_addr = addr; + + return 0; +} + +/* + * Save user shadow stack pointer on the shadow stack itself and return a pointer to saved location. + * Returns -EFAULT if unsuccessful. + */ +int save_user_shstk(struct task_struct *tsk, unsigned long *saved_shstk_ptr) +{ + unsigned long ss_ptr = 0; + unsigned long token_loc = 0; + int ret = 0; + + if (!saved_shstk_ptr) + return -EINVAL; + + ss_ptr = get_active_shstk(tsk); + ret = create_rstor_token(ss_ptr, &token_loc); + + if (!ret) { + *saved_shstk_ptr = token_loc; + set_active_shstk(tsk, token_loc); + } + + return ret; +} + +/* + * Restores the user shadow stack pointer from the token on the shadow stack for task 'tsk'. + * Returns -EFAULT if unsuccessful. 
+ */
+int restore_user_shstk(struct task_struct *tsk, unsigned long shstk_ptr)
+{
+ unsigned long token = 0;
+
+ token = amo_user_shstk((unsigned long __user *)shstk_ptr, 0);
+
+ if (token == -1)
+ return -EFAULT;
+
+ /* invalid token, return EINVAL */
+ if ((token - shstk_ptr) != SHSTK_ENTRY_SIZE) {
+ pr_info_ratelimited("%s[%d]: bad restore token in %s: pc=%p sp=%p, token=%p, shstk_ptr=%p\n",
+ tsk->comm, task_pid_nr(tsk), __func__,
+ (void *)(task_pt_regs(tsk)->epc),
+ (void *)(task_pt_regs(tsk)->sp),
+ (void *)token, (void *)shstk_ptr);
+ return -EINVAL;
+ }
+
+ /* all checks passed, set active shstk and return success */
+ set_active_shstk(tsk, token);
+ return 0;
+}
+
+static unsigned long allocate_shadow_stack(unsigned long addr, unsigned long size,
+ unsigned long token_offset, bool set_tok)
+{
+ int flags = MAP_ANONYMOUS | MAP_PRIVATE;
+ struct mm_struct *mm = current->mm;
+ unsigned long populate;
+
+ if (addr)
+ flags |= MAP_FIXED_NOREPLACE;
+
+ mmap_write_lock(mm);
+ addr = do_mmap(NULL, addr, size, PROT_READ, flags,
+ VM_SHADOW_STACK | VM_WRITE, 0, &populate, NULL);
+ mmap_write_unlock(mm);
+
+ if (!set_tok || IS_ERR_VALUE(addr))
+ goto out;
+
+ if (create_rstor_token(addr + token_offset, NULL)) {
+ vm_munmap(addr, size);
+ return -EINVAL;
+ }
+
+out:
+ return addr;
+}
+
+SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags)
+{
+ bool set_tok = flags & SHADOW_STACK_SET_TOKEN;
+ unsigned long aligned_size = 0;
+
+ if (!is_user_shstk_enabled())
+ return -EOPNOTSUPP;
+
+ /* Anything other than set token should result in invalid param */
+ if (flags & ~SHADOW_STACK_SET_TOKEN)
+ return -EINVAL;
+
+ /*
+ * Unlike other architectures, on RISC-V the SSP is held in CSR_SSP, a CSR
+ * available in all modes. CSR accesses use a 12-bit index encoded in the
+ * instruction itself, so register selection is static and a CSR write
+ * can't be unintentional from the programmer's perspective. As long as the
+ * code paths that write CSR_SSP are properly guarded, shadow stack
+ * pivoting is not possible. Since CSR_SSP is writable from user mode, user
+ * space could set up a shadow stack token itself after allocation.
+ * However, to stay portable with other architectures (`map_shadow_stack`
+ * is an arch-agnostic syscall), RISC-V honors the token flag and, if it is
+ * set in flags, sets up a token at the base.
+ */
+
+ /* If there isn't space for a token */
+ if (set_tok && size < SHSTK_ENTRY_SIZE)
+ return -ENOSPC;
+
+ if (addr && (addr & (PAGE_SIZE - 1)))
+ return -EINVAL;
+
+ aligned_size = PAGE_ALIGN(size);
+ if (aligned_size < size)
+ return -EOVERFLOW;
+
+ return allocate_shadow_stack(addr, aligned_size, size, set_tok);
+}
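A user-space sketch of the syscall's intended use (raw syscall(2), since libc has no wrapper yet; __NR_map_shadow_stack = 453 matches the selftest header added later in this series):

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    #ifndef __NR_map_shadow_stack
    #define __NR_map_shadow_stack 453
    #endif

    #define SHADOW_STACK_SET_TOKEN (1ULL << 0)

    int main(void)
    {
            /* Kernel picks the address; a restore token is placed at the
             * base (the high end, since the shadow stack grows down). */
            void *ss = (void *)syscall(__NR_map_shadow_stack, 0UL, 4096UL,
                                       SHADOW_STACK_SET_TOKEN);

            if ((long)ss == -1) {
                    perror("map_shadow_stack");
                    return 1;
            }
            printf("shadow stack at %p\n", ss);
            /* A later shadow stack switch would consume the token. */
            return 0;
    }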
+
+/*
+ * This gets called during clone/clone3/fork and allocates a shadow stack
+ * when CLONE_VM is specified, i.e. when the user supplies a different
+ * stack, so a separate shadow stack is needed too. How a separate shadow
+ * stack is specified by the user is still being debated; once that's
+ * settled, remove this part of the comment.
+ * This function simply returns 0 if shadow stacks are not supported or if
+ * a separate shadow stack allocation is not needed (as in the !CLONE_VM
+ * case).
+ */
+unsigned long shstk_alloc_thread_stack(struct task_struct *tsk,
+ const struct kernel_clone_args *args)
+{
+ unsigned long addr, size;
+
+ /* If shadow stack is not supported, return 0 */
+ if (!is_user_shstk_enabled())
+ return 0;
+
+ /*
+ * If shadow stack is not enabled on the new thread, skip any
+ * switch to a new shadow stack.
+ */
+ if (!is_shstk_enabled(tsk))
+ return 0;
+
+ /*
+ * For CLONE_VFORK the child shares the parent's shadow stack.
+ * Set base = 0 and size = 0; this special state is tracked so that
+ * the freeing logic run for the child knows to leave it alone.
+ */
+ if (args->flags & CLONE_VFORK) {
+ set_shstk_base(tsk, 0, 0);
+ return 0;
+ }
+
+ /*
+ * For !CLONE_VM the child will use a copy of the parent's shadow
+ * stack.
+ */
+ if (!(args->flags & CLONE_VM))
+ return 0;
+
+ /*
+ * Reaching here means CLONE_VM was specified, and thus a separate shadow
+ * stack is needed for the new cloned thread. Note: the allocation below
+ * happens in the current mm.
+ */
+ size = calc_shstk_size(args->stack_size);
+ addr = allocate_shadow_stack(0, size, 0, false);
+ if (IS_ERR_VALUE(addr))
+ return addr;
+
+ set_shstk_base(tsk, addr, size);
+
+ return addr + size;
+}
+
+void shstk_release(struct task_struct *tsk)
+{
+ unsigned long base = 0, size = 0;
+ /* If shadow stack is not supported or not enabled, nothing to release */
+ if (!is_user_shstk_enabled() || !is_shstk_enabled(tsk))
+ return;
+
+ /*
+ * When fork() with CLONE_VM fails, the child (tsk) already has a
+ * shadow stack allocated, and exit_thread() calls this function to
+ * free it. In this case the parent (current) and the child share
+ * the same mm struct. Proceed only when they are the same.
+ */
+ if (!tsk->mm || tsk->mm != current->mm)
+ return;
+
+ /*
+ * We know shadow stack is enabled but if base is NULL, then
+ * this task is not managing its own shadow stack (CLONE_VFORK). So
+ * skip freeing it.
+ */
+ base = get_shstk_base(tsk, &size);
+ if (!base)
+ return;
+
+ vm_munmap(base, size);
+ set_shstk_base(tsk, 0, 0);
+}
+
+int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status)
+{
+ unsigned long bcfi_status = 0;
+
+ if (!is_user_shstk_enabled())
+ return -EINVAL;
+
+ /* report whether shadow stack is enabled on the task */
+ bcfi_status |= (is_shstk_enabled(t) ? PR_SHADOW_STACK_ENABLE : 0);
+
+ return copy_to_user(status, &bcfi_status, sizeof(bcfi_status)) ? -EFAULT : 0;
+}
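Taken together with the setter below, the per-thread flow looks like this from user space; a minimal sketch (prctl values 71-73 per this series' uapi additions, subject to the renumbering caveat noted in prctl.h):

    #include <stdio.h>
    #include <sys/prctl.h>

    #ifndef PR_SET_SHADOW_STACK_STATUS
    #define PR_GET_SHADOW_STACK_STATUS  71
    #define PR_SET_SHADOW_STACK_STATUS  72
    #define PR_LOCK_SHADOW_STACK_STATUS 73
    #define PR_SHADOW_STACK_ENABLE      (1UL << 0)
    #endif

    int main(void)
    {
            unsigned long status = 0;

            /* Enable: allocates and activates a shadow stack for this thread. */
            if (prctl(PR_SET_SHADOW_STACK_STATUS, PR_SHADOW_STACK_ENABLE, 0, 0, 0))
                    return 1;   /* EINVAL: no Zicfiss, or disabled on cmdline */

            prctl(PR_GET_SHADOW_STACK_STATUS, &status, 0, 0, 0);
            printf("shadow stack %s\n",
                   (status & PR_SHADOW_STACK_ENABLE) ? "enabled" : "disabled");

            /* Optionally lock so the status can't be changed anymore. */
            prctl(PR_LOCK_SHADOW_STACK_STATUS, 0, 0, 0, 0);
            return 0;
    }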
+
+int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status)
+{
+ unsigned long size = 0, addr = 0;
+ bool enable_shstk = false;
+
+ if (!is_user_shstk_enabled())
+ return -EINVAL;
+
+ /* Reject unknown flags */
+ if (status & ~PR_SHADOW_STACK_SUPPORTED_STATUS_MASK)
+ return -EINVAL;
+
+ /* bcfi status is locked and can't be modified further by the user */
+ if (is_shstk_locked(t))
+ return -EINVAL;
+
+ enable_shstk = status & PR_SHADOW_STACK_ENABLE;
+ /* Request is to enable shadow stack and shadow stack is not enabled already */
+ if (enable_shstk && !is_shstk_enabled(t)) {
+ /*
+ * A shadow stack is already allocated but another enable request
+ * came in; there is no need to support this use case, so return
+ * -EINVAL.
+ */
+ if (is_shstk_allocated(t))
+ return -EINVAL;
+
+ size = calc_shstk_size(0);
+ addr = allocate_shadow_stack(0, size, 0, false);
+ if (IS_ERR_VALUE(addr))
+ return -ENOMEM;
+ set_shstk_base(t, addr, size);
+ set_active_shstk(t, addr + size);
+ }
+
+ /*
+ * On a request to disable the shadow stack, go ahead and release it.
+ * If a CLONE_VFORKed child does this, the shadow stack itself is not
+ * released (the parent may still need it), but it is disabled for the
+ * child. If the vforked child then tries to enable it again, it gets
+ * an entirely new shadow stack, because both of the following hold:
+ * - shadow stack was not enabled for the vforked child
+ * - the shadow stack base was already pointing to 0
+ * This shouldn't be a problem: the parent keeps its shadow stack
+ * available when the vforked child releases resources via exit or
+ * exec, while at the same time the child can break away and establish
+ * a new shadow stack if it desires.
+ */
+ if (!enable_shstk)
+ shstk_release(t);
+
+ set_shstk_status(t, enable_shstk);
+ return 0;
+}
+
+int arch_lock_shadow_stack_status(struct task_struct *task,
+ unsigned long arg)
+{
+ /* If shadow stack is not supported or not enabled on the task, nothing to lock here */
+ if (!is_user_shstk_enabled() ||
+ !is_shstk_enabled(task) || arg != 0)
+ return -EINVAL;
+
+ set_shstk_lock(task);
+
+ return 0;
+}
+
+int arch_get_indir_br_lp_status(struct task_struct *t, unsigned long __user *status)
+{
+ unsigned long fcfi_status = 0;
+
+ if (!is_user_lpad_enabled())
+ return -EINVAL;
+
+ /* report whether indirect branch tracking is enabled on the task */
+ fcfi_status |= (is_indir_lp_enabled(t) ? PR_INDIR_BR_LP_ENABLE : 0);
+
+ return copy_to_user(status, &fcfi_status, sizeof(fcfi_status)) ? -EFAULT : 0;
+}
+
+int arch_set_indir_br_lp_status(struct task_struct *t, unsigned long status)
+{
+ bool enable_indir_lp = false;
+
+ if (!is_user_lpad_enabled())
+ return -EINVAL;
+
+ /* indirect branch tracking is locked and can't be modified further by the user */
+ if (is_indir_lp_locked(t))
+ return -EINVAL;
+
+ /* Reject unknown flags */
+ if (status & ~PR_INDIR_BR_LP_ENABLE)
+ return -EINVAL;
+
+ enable_indir_lp = (status & PR_INDIR_BR_LP_ENABLE);
+ set_indir_lp_status(t, enable_indir_lp);
+
+ return 0;
+}
+
+int arch_lock_indir_br_lp_status(struct task_struct *task,
+ unsigned long arg)
+{
+ /*
+ * If indirect branch tracking is not supported or not enabled on the
+ * task, nothing to lock here
+ */
+ if (!is_user_lpad_enabled() ||
+ !is_indir_lp_enabled(task) || arg != 0)
+ return -EINVAL;
+
+ set_indir_lp_lock(task);
+
+ return 0;
+}
+
+bool is_user_shstk_enabled(void)
+{
+ return (cpu_supports_shadow_stack() &&
+ !(riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_BCFI));
+}
+
+bool is_user_lpad_enabled(void)
+{
+ return (cpu_supports_indirect_br_lp_instr() &&
+ !(riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_FCFI));
+}
+
+static int __init setup_global_riscv_enable(char *str)
+{
+ if (strcmp(str, "all") == 0)
+ riscv_nousercfi = CMDLINE_DISABLE_RISCV_USERCFI;
+
+ if (strcmp(str, "fcfi") == 0)
+ riscv_nousercfi |= CMDLINE_DISABLE_RISCV_USERCFI_FCFI;
+
+ if (strcmp(str, "bcfi") == 0)
+ riscv_nousercfi |= CMDLINE_DISABLE_RISCV_USERCFI_BCFI;
+
+ if (riscv_nousercfi)
+ pr_info("RISC-V user CFI disabled via cmdline - shadow stack status : %s, landing pad status : %s\n",
+ (riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_BCFI) ?
"disabled" : + "enabled", (riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_FCFI) ? + "disabled" : "enabled"); + + return 1; +} + +__setup("riscv_nousercfi=", setup_global_riscv_enable); diff --git a/arch/riscv/kernel/vdso.c b/arch/riscv/kernel/vdso.c index 2cf76218a5bd02c8f148318f78abd648dab276bf..e211ddcc02b9cbfc9c42c3b6e8d9aeac8bac3d04 100644 --- a/arch/riscv/kernel/vdso.c +++ b/arch/riscv/kernel/vdso.c @@ -203,6 +203,13 @@ static struct __vdso_info compat_vdso_info __ro_after_init = { static int __init vdso_init(void) { + /* Hart implements zimop, expose cfi compiled vdso */ + if (IS_ENABLED(CONFIG_RISCV_USER_CFI) && + riscv_has_extension_unlikely(RISCV_ISA_EXT_ZIMOP)) { + vdso_info.vdso_code_start = vdso_cfi_start; + vdso_info.vdso_code_end = vdso_cfi_end; + } + __vdso_init(&vdso_info); #ifdef CONFIG_COMPAT __vdso_init(&compat_vdso_info); diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile index d58f32a2035b1fee68119f0c92478a49211524e9..375fa8ac962a8da40d87a0231c48ce81957b7d88 100644 --- a/arch/riscv/kernel/vdso/Makefile +++ b/arch/riscv/kernel/vdso/Makefile @@ -13,12 +13,30 @@ vdso-syms += flush_icache vdso-syms += hwprobe vdso-syms += sys_hwprobe +ifdef VDSO_CFI_BUILD +CFI_MARCH = _zicfilp_zicfiss +CFI_FULL = -fcf-protection=full +CFI_SUFFIX = -cfi +OFFSET_SUFFIX = _cfi +ccflags-y += -DVDSO_CFI=1 +asflags-y += -DVDSO_CFI=1 +endif + +ifdef VDSO_CFI_BUILD +CFI_MARCH = _zicfilp_zicfiss +CFI_FULL = -fcf-protection=full +endif + # Files to link into the vdso obj-vdso = $(patsubst %, %.o, $(vdso-syms)) note.o ccflags-y := -fno-stack-protector ccflags-y += -DDISABLE_BRANCH_PROFILING ccflags-y += -fno-builtin +ccflags-y += $(KBUILD_BASE_ISA)$(CFI_MARCH) +ccflags-y += $(CFI_FULL) +asflags-y += $(KBUILD_BASE_ISA)$(CFI_MARCH) +asflags-y += $(CFI_FULL) ifneq ($(c-gettimeofday-y),) CFLAGS_vgettimeofday.o += -fPIC -include $(c-gettimeofday-y) @@ -27,13 +45,20 @@ endif CFLAGS_hwprobe.o += -fPIC # Build rules -targets := $(obj-vdso) vdso.so vdso.so.dbg vdso.lds +vdso_offsets := vdso$(if $(VDSO_CFI_BUILD),$(CFI_SUFFIX),)-offsets.h +vdso_o := vdso$(if $(VDSO_CFI_BUILD),$(CFI_SUFFIX),).o +vdso_so := vdso$(if $(VDSO_CFI_BUILD),$(CFI_SUFFIX),).so +vdso_so_dbg := vdso$(if $(VDSO_CFI_BUILD),$(CFI_SUFFIX),).so.dbg +vdso_lds := vdso.lds + +targets := $(obj-vdso) $(vdso_so) $(vdso_so_dbg) $(vdso_lds) + obj-vdso := $(addprefix $(obj)/, $(obj-vdso)) -obj-y += vdso.o -CPPFLAGS_vdso.lds += -P -C -U$(ARCH) +obj-y += vdso$(if $(VDSO_CFI_BUILD),$(CFI_SUFFIX),).o +CPPFLAGS_$(vdso_lds) += -P -C -U$(ARCH) ifneq ($(filter vgettimeofday, $(vdso-syms)),) -CPPFLAGS_vdso.lds += -DHAS_VGETTIMEOFDAY +CPPFLAGS_$(vdso_lds) += -DHAS_VGETTIMEOFDAY endif # Disable -pg to prevent insert call site @@ -46,12 +71,12 @@ KASAN_SANITIZE := n UBSAN_SANITIZE := n # Force dependency -$(obj)/vdso.o: $(obj)/vdso.so +$(obj)/$(vdso_o): $(obj)/$(vdso_so) # link rule for the .so file, .lds has to be first -$(obj)/vdso.so.dbg: $(obj)/vdso.lds $(obj-vdso) FORCE +$(obj)/$(vdso_so_dbg): $(obj)/$(vdso_lds) $(obj-vdso) FORCE $(call if_changed,vdsold) -LDFLAGS_vdso.so.dbg = -shared -S -soname=linux-vdso.so.1 \ +LDFLAGS_$(vdso_so_dbg) = -shared -S -soname=linux-vdso.so.1 \ --build-id=sha1 --hash-style=both --eh-frame-hdr # strip rule for the .so file @@ -62,15 +87,15 @@ $(obj)/%.so: $(obj)/%.so.dbg FORCE # Generate VDSO offsets using helper script gen-vdsosym := $(srctree)/$(src)/gen_vdso_offsets.sh quiet_cmd_vdsosym = VDSOSYM $@ - cmd_vdsosym = $(NM) $< | $(gen-vdsosym) | LC_ALL=C sort > $@ + cmd_vdsosym = $(NM) $< | 
$(gen-vdsosym) $(OFFSET_SUFFIX) | LC_ALL=C sort > $@ -include/generated/vdso-offsets.h: $(obj)/vdso.so.dbg FORCE +include/generated/$(vdso_offsets): $(obj)/$(vdso_so_dbg) FORCE $(call if_changed,vdsosym) # actual build commands # The DSO images are built using a special linker script # Make sure only to export the intended __vdso_xxx symbol offsets. quiet_cmd_vdsold = VDSOLD $@ - cmd_vdsold = $(LD) $(ld_flags) -T $(filter-out FORCE,$^) -o $@.tmp && \ + cmd_vdsold = $(LD) $(CFI_FULL) $(ld_flags) -T $(filter-out FORCE,$^) -o $@.tmp && \ $(OBJCOPY) $(patsubst %, -G __vdso_%, $(vdso-syms)) $@.tmp $@ && \ rm $@.tmp diff --git a/arch/riscv/kernel/vdso/flush_icache.S b/arch/riscv/kernel/vdso/flush_icache.S index 8f884227e8bca7fd3634217e71d4ee4ed122559a..e4c56970905e93c491142344ac0bd301e0eb1e5d 100644 --- a/arch/riscv/kernel/vdso/flush_icache.S +++ b/arch/riscv/kernel/vdso/flush_icache.S @@ -5,11 +5,13 @@ #include #include +#include .text /* int __vdso_flush_icache(void *start, void *end, unsigned long flags); */ SYM_FUNC_START(__vdso_flush_icache) .cfi_startproc + vdso_lpad #ifdef CONFIG_SMP li a7, __NR_riscv_flush_icache ecall @@ -20,3 +22,5 @@ SYM_FUNC_START(__vdso_flush_icache) ret .cfi_endproc SYM_FUNC_END(__vdso_flush_icache) + +emit_riscv_feature_1_and diff --git a/arch/riscv/kernel/vdso/gen_vdso_offsets.sh b/arch/riscv/kernel/vdso/gen_vdso_offsets.sh index c2e5613f3495109bc0faffca0a5782ade871a1f3..bd5d5afaaa147c085dc972f0a4a511265cd071ea 100755 --- a/arch/riscv/kernel/vdso/gen_vdso_offsets.sh +++ b/arch/riscv/kernel/vdso/gen_vdso_offsets.sh @@ -2,4 +2,6 @@ # SPDX-License-Identifier: GPL-2.0 LC_ALL=C -sed -n -e 's/^[0]\+\(0[0-9a-fA-F]*\) . \(__vdso_[a-zA-Z0-9_]*\)$/\#define \2_offset\t0x\1/p' +SUFFIX=${1:-""} +sed -n -e \ +'s/^[0]\+\(0[0-9a-fA-F]*\) . \(__vdso_[a-zA-Z0-9_]*\)$/\#define \2'$SUFFIX'_offset\t0x\1/p' diff --git a/arch/riscv/kernel/vdso/getcpu.S b/arch/riscv/kernel/vdso/getcpu.S index 9c1bd531907f2fefda1d0778191073ca6b70df1a..5c1ecc4e14654016481bec08802e06ee9bb511f1 100644 --- a/arch/riscv/kernel/vdso/getcpu.S +++ b/arch/riscv/kernel/vdso/getcpu.S @@ -5,14 +5,18 @@ #include #include +#include .text /* int __vdso_getcpu(unsigned *cpu, unsigned *node, void *unused); */ SYM_FUNC_START(__vdso_getcpu) .cfi_startproc + vdso_lpad /* For now, just do the syscall. 
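These vdso_lpad sites are what make the CFI vdso's entry points valid indirect-call targets. For reference, a hand-written zicfilp-instrumented entry point looks roughly like this (illustrative assembly only; `lpad 0` means no x7 label check is performed):

            .text
            .balign 4
            .global indirect_target
    indirect_target:
            lpad    0               /* landing pad; imm 0 skips the x7 label check */
            li      a0, 42
            ret

On harts without Zicfilp, lpad decodes as an auipc-to-x0 HINT, so the same binary still runs there.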
*/ li a7, __NR_getcpu ecall ret .cfi_endproc SYM_FUNC_END(__vdso_getcpu) + +emit_riscv_feature_1_and diff --git a/arch/riscv/kernel/vdso/note.S b/arch/riscv/kernel/vdso/note.S index 2a956c942211140ea137a7203142d54d07144078..3d92cc956b95b92b592c8c4f5ee07445031a7992 100644 --- a/arch/riscv/kernel/vdso/note.S +++ b/arch/riscv/kernel/vdso/note.S @@ -6,7 +6,10 @@ #include #include +#include ELFNOTE_START(Linux, 0, "a") .long LINUX_VERSION_CODE ELFNOTE_END + +emit_riscv_feature_1_and diff --git a/arch/riscv/kernel/vdso/rt_sigreturn.S b/arch/riscv/kernel/vdso/rt_sigreturn.S index 3dc022aa8931ad3b3798f4cc492ce795dd7b7bf5..e82987dc37394b6463409d4c4ff0bb23c951843b 100644 --- a/arch/riscv/kernel/vdso/rt_sigreturn.S +++ b/arch/riscv/kernel/vdso/rt_sigreturn.S @@ -5,12 +5,16 @@ #include #include +#include .text SYM_FUNC_START(__vdso_rt_sigreturn) .cfi_startproc .cfi_signal_frame + vdso_lpad li a7, __NR_rt_sigreturn ecall .cfi_endproc SYM_FUNC_END(__vdso_rt_sigreturn) + +emit_riscv_feature_1_and diff --git a/arch/riscv/kernel/vdso/sys_hwprobe.S b/arch/riscv/kernel/vdso/sys_hwprobe.S index 77e57f8305216c466f51979c91899754b4a7b382..f1694451a60c0f86ff66cf54f65c1d9355806002 100644 --- a/arch/riscv/kernel/vdso/sys_hwprobe.S +++ b/arch/riscv/kernel/vdso/sys_hwprobe.S @@ -3,13 +3,17 @@ #include #include +#include .text SYM_FUNC_START(riscv_hwprobe) .cfi_startproc + vdso_lpad li a7, __NR_riscv_hwprobe ecall ret .cfi_endproc SYM_FUNC_END(riscv_hwprobe) + +emit_riscv_feature_1_and diff --git a/arch/riscv/kernel/vdso_cfi/Makefile b/arch/riscv/kernel/vdso_cfi/Makefile new file mode 100644 index 0000000000000000000000000000000000000000..585add540e08201e46f641ed1f9b2b62dc006356 --- /dev/null +++ b/arch/riscv/kernel/vdso_cfi/Makefile @@ -0,0 +1,30 @@ +# SPDX-License-Identifier: GPL-2.0-only +# RISC-V VDSO CFI Makefile +# This Makefile builds the VDSO with CFI support when CONFIG_RISCV_USER_CFI is enabled + +# setting VDSO_CFI_BUILD triggers build for vdso differently +VDSO_CFI_BUILD := 1 + +# Set the source directory to the main vdso directory +src := $(srctree)/arch/riscv/kernel/vdso + +# Copy all .S and .c files from vdso directory to vdso_cfi object build directory +vdso_c_sources := $(wildcard $(src)/*.c) +vdso_S_sources := $(wildcard $(src)/*.S) +vdso_c_objects := $(addprefix $(obj)/, $(notdir $(vdso_c_sources))) +vdso_S_objects := $(addprefix $(obj)/, $(notdir $(vdso_S_sources))) + +$(vdso_S_objects): $(obj)/%.S: $(src)/%.S + $(Q)cp $< $@ + +$(vdso_c_objects): $(obj)/%.c: $(src)/%.c + $(Q)cp $< $@ + +# Include the main VDSO Makefile which contains all the build rules and sources +# The VDSO_CFI_BUILD variable will be passed to it to enable CFI compilation +include $(src)/Makefile + +# Provide an explicit rule for vdso-cfi.o since vdso-cfi.S lives in vdso_cfi/ +# directory but $(src) has been redirected to the vdso directory. 
+$(obj)/vdso-cfi.o: $(srctree)/arch/riscv/kernel/vdso_cfi/vdso-cfi.S $(obj)/vdso-cfi.so FORCE + $(call if_changed_rule,as_o_S) diff --git a/arch/riscv/kernel/vdso_cfi/vdso-cfi.S b/arch/riscv/kernel/vdso_cfi/vdso-cfi.S new file mode 100644 index 0000000000000000000000000000000000000000..d426f6accb35210b1aa86070b80f3bbe1a0fbf86 --- /dev/null +++ b/arch/riscv/kernel/vdso_cfi/vdso-cfi.S @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright 2025 Rivos, Inc + */ + +#define vdso_start vdso_cfi_start +#define vdso_end vdso_cfi_end + +#define __VDSO_PATH "arch/riscv/kernel/vdso_cfi/vdso-cfi.so" + +#include "../vdso/vdso.S" diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index ece882f785721bdc463b747412e0d2a53bf317c2..9f9a06d219554bceb0ee3d8e526eadddff0e31ef 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -322,7 +322,7 @@ pgd_t early_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); static const pgprot_t protection_map[16] = { [VM_NONE] = PAGE_NONE, [VM_READ] = PAGE_READ, - [VM_WRITE] = PAGE_COPY, + [VM_WRITE] = PAGE_SHADOWSTACK, [VM_WRITE | VM_READ] = PAGE_COPY, [VM_EXEC] = PAGE_EXEC, [VM_EXEC | VM_READ] = PAGE_READ_EXEC, diff --git a/arch/riscv/mm/pgtable.c b/arch/riscv/mm/pgtable.c index 7f0a449e970c9eb51938ec3c6d71b5f0d86e15b1..3a823853bab650659aa0f8a3410c517ba4b66434 100644 --- a/arch/riscv/mm/pgtable.c +++ b/arch/riscv/mm/pgtable.c @@ -155,3 +155,19 @@ pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, return pmd; } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ + +pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma) +{ + if (vma->vm_flags & VM_SHADOW_STACK) + return pte_mkwrite_shstk(pte); + + return pte_mkwrite_novma(pte); +} + +pmd_t pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) +{ + if (vma->vm_flags & VM_SHADOW_STACK) + return pmd_mkwrite_shstk(pmd); + + return pmd_mkwrite_novma(pmd); +} diff --git a/arch/x86/include/uapi/asm/mman.h b/arch/x86/include/uapi/asm/mman.h index 46cdc941f9586ac227c02a16705cf4397b629423..ac1e6277212b90437bf7611c4024a48a8ab9f31a 100644 --- a/arch/x86/include/uapi/asm/mman.h +++ b/arch/x86/include/uapi/asm/mman.h @@ -5,9 +5,6 @@ #define MAP_32BIT 0x40 /* only give out 32bit addresses */ #define MAP_ABOVE4G 0x80 /* only map above 4GB */ -/* Flags for map_shadow_stack(2) */ -#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */ - #include #endif /* _ASM_X86_MMAN_H */ diff --git a/include/linux/cpu.h b/include/linux/cpu.h index e397c102cebe4df38117aefaf13191e7a8972b53..99ff8d21b67ab5309702dc1ed7b95f66f6ddc0e7 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -203,4 +203,8 @@ static inline bool cpu_mitigations_auto_nosmt(void) } #endif +int arch_get_indir_br_lp_status(struct task_struct *t, unsigned long __user *status); +int arch_set_indir_br_lp_status(struct task_struct *t, unsigned long status); +int arch_lock_indir_br_lp_status(struct task_struct *t, unsigned long status); + #endif /* _LINUX_CPU_H_ */ diff --git a/include/linux/mm.h b/include/linux/mm.h index 33cde38f608cd25595da45e6bd21dcfa5857b89b..3410f2fa128a40792476711a0103dc5dfca40fa8 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -350,8 +350,7 @@ extern unsigned int kobjsize(const void *objp); # define VM_PKEY_BIT4 0 #endif #endif /* CONFIG_ARCH_HAS_PKEYS */ - -#ifdef CONFIG_X86_USER_SHADOW_STACK +#if defined(CONFIG_X86_USER_SHADOW_STACK) || defined(CONFIG_RISCV_USER_CFI) /* * VM_SHADOW_STACK should not be set with VM_SHARED because of lack of * support core mm. 
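Because protection_map[VM_WRITE] now selects the shadow stack PTE encoding (R=0, W=1, X=0), a plain write-only private mapping would otherwise become shadow stack memory; riscv_sys_mmap() (earlier in this series) therefore upgrades PROT_WRITE-only requests, which is invisible to regular code. A sketch of the user-visible behavior:

    #include <sys/mman.h>

    int main(void)
    {
            /* With this series, PROT_WRITE alone is upgraded by the kernel
             * to PROT_READ | PROT_WRITE, so this keeps behaving like an
             * ordinary anonymous mapping; only map_shadow_stack(2) creates
             * shadow stack mappings. */
            char *p = mmap(NULL, 4096, PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

            if (p == MAP_FAILED)
                    return 1;
            p[0] = 1;       /* writable, as before */
            munmap(p, 4096);
            return 0;
    }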
@@ -4531,4 +4530,8 @@ static inline bool brk_thp_aligned_enabled(void) } #endif +int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status); +int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status); +int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status); + #endif /* _LINUX_MM_H */ diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h index 57e8195d0b538b213318ce552f32dd3a660171d5..5e3d61ddbd8c12cda03a48cb79bbe5b70ad3f77b 100644 --- a/include/uapi/asm-generic/mman.h +++ b/include/uapi/asm-generic/mman.h @@ -19,4 +19,8 @@ #define MCL_FUTURE 2 /* lock all future mappings */ #define MCL_ONFAULT 4 /* lock all pages that are faulted in */ +#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */ +#define SHADOW_STACK_SET_MARKER (1ULL << 1) /* Set up a top of stack marker in the shadow stack */ + + #endif /* __ASM_GENERIC_MMAN_H */ diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h index 036ab50decda283d6e5f1e89857e89fd886a55d9..fac41409b6630def564c91694e5e312f39e93483 100644 --- a/include/uapi/linux/elf.h +++ b/include/uapi/linux/elf.h @@ -449,6 +449,9 @@ typedef struct elf64_shdr { #define NT_RISCV_CSR 0x900 /* RISC-V Control and Status Registers */ #define NT_RISCV_VECTOR 0x901 /* RISC-V vector registers */ #define NT_RISCV_TAGGED_ADDR_CTRL 0x902 /* RISC-V tagged address control (prctl()) */ +#define NN_RISCV_USER_CFI "LINUX" +#define NT_RISCV_USER_CFI 0x903 /* RISC-V shadow stack state */ +#define NN_LOONGARCH_CPUCFG "LINUX" #define NT_LOONGARCH_CPUCFG 0xa00 /* LoongArch CPU config registers */ #define NT_LOONGARCH_CSR 0xa01 /* LoongArch control and status registers */ #define NT_LOONGARCH_LSX 0xa02 /* LoongArch Loongson SIMD Extension registers */ diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 8eba7bb12e4c0814e653aeba2051071f86340a75..04c6f44247759a2118b5b622c57b20183cfc607d 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -310,4 +310,57 @@ struct prctl_mm_map { # define PR_RISCV_V_VSTATE_CTRL_NEXT_MASK 0xc # define PR_RISCV_V_VSTATE_CTRL_MASK 0x1f +/* + * Get the current shadow stack configuration for the current thread, + * this will be the value configured via PR_SET_SHADOW_STACK_STATUS. + * + * Note: prctl numbers 71-76 are used to match glibc cfi-dev branch ABI. + * Mainline kernel uses 74-76 for shadow stack and 79-81 for landing pad, + * but glibc cfi-dev uses 71-73 for shadow stack and 74-76 for landing pad. + */ +#define PR_GET_SHADOW_STACK_STATUS 71 + +/* + * Set the current shadow stack configuration. Enabling the shadow + * stack will cause a shadow stack to be allocated for the thread. + */ +#define PR_SET_SHADOW_STACK_STATUS 72 +# define PR_SHADOW_STACK_ENABLE (1UL << 0) +# define PR_SHADOW_STACK_WRITE (1UL << 1) +# define PR_SHADOW_STACK_PUSH (1UL << 2) + +/* + * Prevent further changes to the specified shadow stack + * configuration. All bits may be locked via this call, including + * undefined bits. + */ +#define PR_LOCK_SHADOW_STACK_STATUS 73 + +/* + * Get the current indirect branch tracking configuration for the current + * thread, this will be the value configured via PR_SET_INDIR_BR_LP_STATUS. + */ +#define PR_GET_INDIR_BR_LP_STATUS 74 + +/* + * Set the indirect branch tracking configuration. PR_INDIR_BR_LP_ENABLE will + * enable cpu feature for user thread, to track all indirect branches and ensure + * they land on arch defined landing pad instruction. 
+ * x86 - If enabled, an indirect branch must land on an ENDBRANCH instruction.
+ * arm64 - If enabled, an indirect branch must land on a BTI instruction.
+ * riscv - If enabled, an indirect branch must land on an lpad instruction.
+ * Clearing PR_INDIR_BR_LP_ENABLE disables the feature for the user thread,
+ * and indirect branches will no longer be required by the cpu to land on an
+ * arch-defined landing pad instruction.
+ */
+#define PR_SET_INDIR_BR_LP_STATUS 75
+# define PR_INDIR_BR_LP_ENABLE (1UL << 0)
+
+/*
+ * Prevent further changes to the specified indirect branch tracking
+ * configuration. All bits may be locked via this call, including
+ * undefined bits.
+ */
+#define PR_LOCK_INDIR_BR_LP_STATUS 76
+ #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/sys.c b/kernel/sys.c index 5fb0fef7606570e1cbda2dc4eba1ee8af14e59ed..4de6bd91e9b0c6347e05c8d0a01b51ca62164266 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -2304,8 +2304,38 @@ int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which, return -EINVAL; }
+int __weak arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status)
+{
+ return -EINVAL;
+}
+
+int __weak arch_set_shadow_stack_status(struct task_struct *t, unsigned long status)
+{
+ return -EINVAL;
+}
+
+int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status)
+{
+ return -EINVAL;
+}
+
 #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
+int __weak arch_get_indir_br_lp_status(struct task_struct *t, unsigned long __user *status)
+{
+ return -EINVAL;
+}
+
+int __weak arch_set_indir_br_lp_status(struct task_struct *t, unsigned long status)
+{
+ return -EINVAL;
+}
+
+int __weak arch_lock_indir_br_lp_status(struct task_struct *t, unsigned long status)
+{
+ return -EINVAL;
+}
+
 #ifdef CONFIG_ANON_VMA_NAME #define ANON_VMA_NAME_MAX_LEN 80 @@ -2751,6 +2781,36 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3, case PR_RISCV_V_GET_CONTROL: error = RISCV_V_GET_CONTROL(); break;
+ case PR_GET_INDIR_BR_LP_STATUS:
+ if (arg3 || arg4 || arg5)
+ return -EINVAL;
+ error = arch_get_indir_br_lp_status(me, (unsigned long __user *)arg2);
+ break;
+ case PR_SET_INDIR_BR_LP_STATUS:
+ if (arg3 || arg4 || arg5)
+ return -EINVAL;
+ error = arch_set_indir_br_lp_status(me, arg2);
+ break;
+ case PR_LOCK_INDIR_BR_LP_STATUS:
+ if (arg3 || arg4 || arg5)
+ return -EINVAL;
+ error = arch_lock_indir_br_lp_status(me, arg2);
+ break;
+ case PR_GET_SHADOW_STACK_STATUS:
+ if (arg3 || arg4 || arg5)
+ return -EINVAL;
+ error = arch_get_shadow_stack_status(me, (unsigned long __user *) arg2);
+ break;
+ case PR_SET_SHADOW_STACK_STATUS:
+ if (arg3 || arg4 || arg5)
+ return -EINVAL;
+ error = arch_set_shadow_stack_status(me, arg2);
+ break;
+ case PR_LOCK_SHADOW_STACK_STATUS:
+ if (arg3 || arg4 || arg5)
+ return -EINVAL;
+ error = arch_lock_shadow_stack_status(me, arg2);
+ break;
 default: error = -EINVAL; break; diff --git a/tools/testing/selftests/riscv/Makefile b/tools/testing/selftests/riscv/Makefile index 449b880541e3e55170fcbaac43729fc412c85531..8810089c1e5e0bd1c125ee9b01f674a7a7dbfcb4 100644 --- a/tools/testing/selftests/riscv/Makefile +++ b/tools/testing/selftests/riscv/Makefile @@ -5,7 +5,7 @@ ARCH ?= $(shell uname -m 2>/dev/null || echo not) ifneq (,$(filter $(ARCH),riscv)) -RISCV_SUBTARGETS ?= abi hwprobe mm vector sse +RISCV_SUBTARGETS ?= abi hwprobe mm vector sse cfi else RISCV_SUBTARGETS := endif diff --git a/tools/testing/selftests/riscv/cfi/.gitignore b/tools/testing/selftests/riscv/cfi/.gitignore new file mode
100644 index 0000000000000000000000000000000000000000..c1faf7ca4346582a877655198066d98a7534c305 --- /dev/null +++ b/tools/testing/selftests/riscv/cfi/.gitignore @@ -0,0 +1,2 @@ +cfitests +shadowstack diff --git a/tools/testing/selftests/riscv/cfi/Makefile b/tools/testing/selftests/riscv/cfi/Makefile new file mode 100644 index 0000000000000000000000000000000000000000..a41495ece305fca883144f7ec93cda0236edc1b3 --- /dev/null +++ b/tools/testing/selftests/riscv/cfi/Makefile @@ -0,0 +1,23 @@ +CFLAGS += $(KHDR_INCLUDES) +CFLAGS += -I$(top_srcdir)/tools/include + +CFLAGS += -march=rv64gc_zicfilp_zicfiss -fcf-protection=full + +# Check for zicfi* extensions needs cross compiler +# which is not set until lib.mk is included +ifeq ($(LLVM)$(CC),cc) +CC := $(CROSS_COMPILE)gcc +endif + + +ifeq ($(shell $(CC) $(CFLAGS) -nostdlib -xc /dev/null -o /dev/null > /dev/null 2>&1; echo $$?),0) +TEST_GEN_PROGS := cfitests + +cfitests: cfitests.c shadowstack.c + $(CC) -o $@ $(CFLAGS) $(LDFLAGS) $^ +else + +$(shell echo "Toolchain doesn't support CFI, skipping CFI kselftest." >&2) +endif + +include ../../lib.mk diff --git a/tools/testing/selftests/riscv/cfi/cfi_rv_test.h b/tools/testing/selftests/riscv/cfi/cfi_rv_test.h new file mode 100644 index 0000000000000000000000000000000000000000..1c8043f2b778b7127adb42522a190f7e6c36cbe9 --- /dev/null +++ b/tools/testing/selftests/riscv/cfi/cfi_rv_test.h @@ -0,0 +1,82 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#ifndef SELFTEST_RISCV_CFI_H +#define SELFTEST_RISCV_CFI_H +#include +#include +#include "shadowstack.h" + +#define CHILD_EXIT_CODE_SSWRITE 10 +#define CHILD_EXIT_CODE_SIG_TEST 11 + +#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) \ +({ \ + register long _num __asm__ ("a7") = (num); \ + register long _arg1 __asm__ ("a0") = (long)(arg1); \ + register long _arg2 __asm__ ("a1") = (long)(arg2); \ + register long _arg3 __asm__ ("a2") = (long)(arg3); \ + register long _arg4 __asm__ ("a3") = (long)(arg4); \ + register long _arg5 __asm__ ("a4") = (long)(arg5); \ + \ + __asm__ volatile( \ + "ecall\n" \ + : "+r" \ + (_arg1) \ + : "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \ + "r"(_num) \ + : "memory", "cc" \ + ); \ + _arg1; \ +}) + +#define my_syscall3(num, arg1, arg2, arg3) \ +({ \ + register long _num __asm__ ("a7") = (num); \ + register long _arg1 __asm__ ("a0") = (long)(arg1); \ + register long _arg2 __asm__ ("a1") = (long)(arg2); \ + register long _arg3 __asm__ ("a2") = (long)(arg3); \ + \ + __asm__ volatile( \ + "ecall\n" \ + : "+r" (_arg1) \ + : "r"(_arg2), "r"(_arg3), \ + "r"(_num) \ + : "memory", "cc" \ + ); \ + _arg1; \ +}) + +#ifndef __NR_prctl +#define __NR_prctl 167 +#endif + +#ifndef __NR_map_shadow_stack +#define __NR_map_shadow_stack 453 +#endif + +#define CSR_SSP 0x011 + +#ifdef __ASSEMBLY__ +#define __ASM_STR(x) x +#else +#define __ASM_STR(x) #x +#endif + +#define csr_read(csr) \ +({ \ + register unsigned long __v; \ + __asm__ __volatile__ ("csrr %0, " __ASM_STR(csr) \ + : "=r" (__v) : \ + : "memory"); \ + __v; \ +}) + +#define csr_write(csr, val) \ +({ \ + unsigned long __v = (unsigned long)(val); \ + __asm__ __volatile__ ("csrw " __ASM_STR(csr) ", %0" \ + : : "rK" (__v) \ + : "memory"); \ +}) + +#endif diff --git a/tools/testing/selftests/riscv/cfi/cfitests.c b/tools/testing/selftests/riscv/cfi/cfitests.c new file mode 100644 index 0000000000000000000000000000000000000000..d4e4924ea3256074778b017f29182eca34fbb18e --- /dev/null +++ b/tools/testing/selftests/riscv/cfi/cfitests.c @@ -0,0 +1,173 @@ +// SPDX-License-Identifier: GPL-2.0-only + 
+#include "../../kselftest.h" +#include +#include +#include +#include +#include +#include +#include +#include +#include +// #include +#include "cfi_rv_test.h" + +/* do not optimize cfi related test functions */ +#pragma GCC push_options +#pragma GCC optimize("O0") + +void sigsegv_handler(int signum, siginfo_t *si, void *uc) +{ + struct ucontext *ctx = (struct ucontext *)uc; + + if (si->si_code == SEGV_CPERR) { + ksft_print_msg("Control flow violation happened somewhere\n"); + ksft_print_msg("PC where violation happened %lx\n", ctx->uc_mcontext.gregs[0]); + exit(-1); + } + + /* all other cases are expected to be of shadow stack write case */ + exit(CHILD_EXIT_CODE_SSWRITE); +} + +bool register_signal_handler(void) +{ + struct sigaction sa = {}; + + sa.sa_sigaction = sigsegv_handler; + sa.sa_flags = SA_SIGINFO; + if (sigaction(SIGSEGV, &sa, NULL)) { + ksft_print_msg("Registering signal handler for landing pad violation failed\n"); + return false; + } + + return true; +} + +long ptrace(int request, pid_t pid, void *addr, void *data); + +bool cfi_ptrace_test(void) +{ + pid_t pid; + int status, ret = 0; + unsigned long ptrace_test_num = 0, total_ptrace_tests = 2; + + struct user_cfi_state cfi_reg; + struct iovec iov; + + pid = fork(); + + if (pid == -1) { + ksft_exit_fail_msg("%s: fork failed\n", __func__); + exit(1); + } + + if (pid == 0) { + /* allow to be traced */ + ptrace(PTRACE_TRACEME, 0, NULL, NULL); + raise(SIGSTOP); + asm volatile ("la a5, 1f\n" + "jalr a5\n" + "nop\n" + "nop\n" + "1: nop\n" + : : : "a5"); + exit(11); + /* child shouldn't go beyond here */ + } + + /* parent's code goes here */ + iov.iov_base = &cfi_reg; + iov.iov_len = sizeof(cfi_reg); + + while (ptrace_test_num < total_ptrace_tests) { + memset(&cfi_reg, 0, sizeof(cfi_reg)); + waitpid(pid, &status, 0); + if (WIFSTOPPED(status)) { + errno = 0; + ret = ptrace(PTRACE_GETREGSET, pid, (void *)NT_RISCV_USER_CFI, &iov); + if (ret == -1 && errno) + ksft_exit_fail_msg("%s: PTRACE_GETREGSET failed\n", __func__); + } else { + ksft_exit_fail_msg("%s: child didn't stop, failed\n", __func__); + } + + switch (ptrace_test_num) { +#define CFI_ENABLE_MASK (PTRACE_CFI_LP_EN_STATE | \ + PTRACE_CFI_SS_EN_STATE | \ + PTRACE_CFI_SS_PTR_STATE) + case 0: + if ((cfi_reg.cfi_status.cfi_state & CFI_ENABLE_MASK) != CFI_ENABLE_MASK) + ksft_exit_fail_msg("%s: ptrace_getregset failed, %llu\n", __func__, + cfi_reg.cfi_status.cfi_state); + if (!cfi_reg.shstk_ptr) + ksft_exit_fail_msg("%s: NULL shadow stack pointer, test failed\n", + __func__); + break; + case 1: + if (!(cfi_reg.cfi_status.cfi_state & PTRACE_CFI_ELP_STATE)) + ksft_exit_fail_msg("%s: elp must have been set\n", __func__); + /* clear elp state. not interested in anything else */ + cfi_reg.cfi_status.cfi_state = 0; + + ret = ptrace(PTRACE_SETREGSET, pid, (void *)NT_RISCV_USER_CFI, &iov); + if (ret == -1 && errno) + ksft_exit_fail_msg("%s: PTRACE_GETREGSET failed\n", __func__); + break; + default: + ksft_exit_fail_msg("%s: unreachable switch case\n", __func__); + break; + } + ptrace(PTRACE_CONT, pid, NULL, NULL); + ptrace_test_num++; + } + + waitpid(pid, &status, 0); + if (WEXITSTATUS(status) != 11) + ksft_print_msg("%s, bad return code from child\n", __func__); + + ksft_print_msg("%s, ptrace test succeeded\n", __func__); + return true; +} + +int main(int argc, char *argv[]) +{ + int ret = 0; + unsigned long lpad_status = 0, ss_status = 0; + + ksft_print_header(); + + ksft_print_msg("Starting risc-v tests\n"); + + /* + * Landing pad test. 
Not a lot of kernel changes to support landing + * pads for user mode except lighting up a bit in senvcfg via a prctl. + * Enable landing pad support throughout the execution of the test binary. + */ + ret = my_syscall5(__NR_prctl, PR_GET_INDIR_BR_LP_STATUS, &lpad_status, 0, 0, 0); + if (ret) + ksft_exit_fail_msg("Get landing pad status failed with %d\n", ret); + + if (!(lpad_status & PR_INDIR_BR_LP_ENABLE)) + ksft_exit_fail_msg("Landing pad is not enabled, should be enabled via glibc\n"); + + ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0); + if (ret) + ksft_exit_fail_msg("Get shadow stack failed with %d\n", ret); + + if (!(ss_status & PR_SHADOW_STACK_ENABLE)) + ksft_exit_fail_msg("Shadow stack is not enabled, should be enabled via glibc\n"); + + if (!register_signal_handler()) + ksft_exit_fail_msg("Registering signal handler for SIGSEGV failed\n"); + + ksft_print_msg("Landing pad and shadow stack are enabled for binary\n"); + cfi_ptrace_test(); + + execute_shadow_stack_tests(); + + return 0; +} + +#pragma GCC pop_options diff --git a/tools/testing/selftests/riscv/cfi/shadowstack.c b/tools/testing/selftests/riscv/cfi/shadowstack.c new file mode 100644 index 0000000000000000000000000000000000000000..f8eed8260a12524cf13a904c9e4bb71a9c8f3051 --- /dev/null +++ b/tools/testing/selftests/riscv/cfi/shadowstack.c @@ -0,0 +1,385 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "../../kselftest.h" +#include +#include +#include +#include +#include +#include "shadowstack.h" +#include "cfi_rv_test.h" + +static struct shadow_stack_tests shstk_tests[] = { + { "shstk fork test\n", shadow_stack_fork_test }, + { "map shadow stack syscall\n", shadow_stack_map_test }, + { "shadow stack gup tests\n", shadow_stack_gup_tests }, + { "shadow stack signal tests\n", shadow_stack_signal_test}, + { "memory protections of shadow stack memory\n", shadow_stack_protection_test } +}; + +#define RISCV_SHADOW_STACK_TESTS ARRAY_SIZE(shstk_tests) + +/* do not optimize shadow stack related test functions */ +#pragma GCC push_options +#pragma GCC optimize("O0") + +void zar(void) +{ + unsigned long ssp = 0; + + ssp = csr_read(CSR_SSP); + ksft_print_msg("Spewing out shadow stack ptr: %lx\n" + " This is to ensure shadow stack is indeed enabled and working\n", + ssp); +} + +void bar(void) +{ + zar(); +} + +void foo(void) +{ + bar(); +} + +void zar_child(void) +{ + unsigned long ssp = 0; + + ssp = csr_read(CSR_SSP); + ksft_print_msg("Spewing out shadow stack ptr: %lx\n" + " This is to ensure shadow stack is indeed enabled and working\n", + ssp); +} + +void bar_child(void) +{ + zar_child(); +} + +void foo_child(void) +{ + bar_child(); +} + +typedef void (call_func_ptr)(void); +/* + * call couple of functions to test push/pop. + */ +int shadow_stack_call_tests(call_func_ptr fn_ptr, bool parent) +{ + ksft_print_msg("dummy calls for sspush and sspopchk in context of %s\n", + parent ? 
"parent" : "child"); + + (fn_ptr)(); + + return 0; +} + +/* forks a thread, and ensure shadow stacks fork out */ +bool shadow_stack_fork_test(unsigned long test_num, void *ctx) +{ + int pid = 0, child_status = 0, parent_pid = 0, ret = 0; + unsigned long ss_status = 0; + + ksft_print_msg("Exercising shadow stack fork test\n"); + + ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0); + if (ret) { + ksft_exit_skip("Shadow stack get status prctl failed with errorcode %d\n", ret); + return false; + } + + if (!(ss_status & PR_SHADOW_STACK_ENABLE)) + ksft_exit_skip("Shadow stack is not enabled, should be enabled via glibc\n"); + + parent_pid = getpid(); + pid = fork(); + + if (pid) { + ksft_print_msg("Parent pid %d and child pid %d\n", parent_pid, pid); + shadow_stack_call_tests(&foo, true); + } else { + shadow_stack_call_tests(&foo_child, false); + } + + if (pid) { + ksft_print_msg("Waiting on child to finish\n"); + wait(&child_status); + } else { + /* exit child gracefully */ + exit(0); + } + + if (pid && WIFSIGNALED(child_status)) { + ksft_print_msg("Child faulted, fork test failed\n"); + return false; + } + + return true; +} + +/* exercise 'map_shadow_stack', pivot to it and call some functions to ensure it works */ +#define SHADOW_STACK_ALLOC_SIZE 4096 +bool shadow_stack_map_test(unsigned long test_num, void *ctx) +{ + unsigned long shdw_addr; + int ret = 0; + + ksft_print_msg("Exercising shadow stack map test\n"); + + shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0); + + if (((long)shdw_addr) <= 0) { + ksft_print_msg("map_shadow_stack failed with error code %d\n", + (int)shdw_addr); + return false; + } + + ret = munmap((void *)shdw_addr, SHADOW_STACK_ALLOC_SIZE); + + if (ret) { + ksft_print_msg("munmap failed with error code %d\n", ret); + return false; + } + + return true; +} + +/* + * shadow stack protection tests. 
+bool shadow_stack_protection_test(unsigned long test_num, void *ctx)
+{
+	unsigned long shdw_addr;
+	unsigned long *write_addr = NULL;
+	int ret = 0, pid = 0, child_status = 0;
+
+	ksft_print_msg("Exercising shadow stack protection test (WPT)\n");
+
+	shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0);
+
+	if (((long)shdw_addr) <= 0) {
+		ksft_print_msg("map_shadow_stack failed with error code %d\n",
+			       (int)shdw_addr);
+		return false;
+	}
+
+	write_addr = (unsigned long *)shdw_addr;
+	pid = fork();
+
+	/* no child was created, return false */
+	if (pid == -1)
+		return false;
+
+	/*
+	 * try to perform a store from the child on shadow stack memory;
+	 * it should result in SIGSEGV
+	 */
+	if (!pid) {
+		/* below write must lead to SIGSEGV */
+		*write_addr = 0xdeadbeef;
+	} else {
+		wait(&child_status);
+	}
+
+	/* test fails if 0xdeadbeef is present at the shadow stack address */
+	if (*write_addr == 0xdeadbeef) {
+		ksft_print_msg("Shadow stack WPT failed\n");
+		return false;
+	}
+
+	/* if the child reached here, then fail */
+	if (!pid) {
+		ksft_print_msg("Shadow stack WPT failed: child reached unreachable state\n");
+		return false;
+	}
+
+	/* if child exited via signal handler but not for write on ss */
+	if (WIFEXITED(child_status) &&
+	    WEXITSTATUS(child_status) != CHILD_EXIT_CODE_SSWRITE) {
+		ksft_print_msg("Shadow stack WPT failed: child wasn't signaled for write\n");
+		return false;
+	}
+
+	ret = munmap(write_addr, SHADOW_STACK_ALLOC_SIZE);
+	if (ret) {
+		ksft_print_msg("Shadow stack WPT failed: munmap failed, error code %d\n",
+			       ret);
+		return false;
+	}
+
+	return true;
+}
+
+#define SS_MAGIC_WRITE_VAL 0xbeefdead
+
+int gup_tests(int mem_fd, unsigned long *shdw_addr)
+{
+	unsigned long val = 0;
+
+	lseek(mem_fd, (unsigned long)shdw_addr, SEEK_SET);
+	if (read(mem_fd, &val, sizeof(val)) < 0) {
+		ksft_print_msg("Reading shadow stack mem via gup failed\n");
+		return 1;
+	}
+
+	val = SS_MAGIC_WRITE_VAL;
+	lseek(mem_fd, (unsigned long)shdw_addr, SEEK_SET);
+	if (write(mem_fd, &val, sizeof(val)) < 0) {
+		ksft_print_msg("Writing shadow stack mem via gup failed\n");
+		return 1;
+	}
+
+	if (*shdw_addr != SS_MAGIC_WRITE_VAL) {
+		ksft_print_msg("GUP write to shadow stack memory failed\n");
+		return 1;
+	}
+
+	return 0;
+}
+
+bool shadow_stack_gup_tests(unsigned long test_num, void *ctx)
+{
+	unsigned long shdw_addr = 0;
+	unsigned long *write_addr = NULL;
+	int fd = 0;
+	bool ret = false;
+
+	ksft_print_msg("Exercising shadow stack gup tests\n");
+	shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0);
+
+	if (((long)shdw_addr) <= 0) {
+		ksft_print_msg("map_shadow_stack failed with error code %d\n", (int)shdw_addr);
+		return false;
+	}
+
+	write_addr = (unsigned long *)shdw_addr;
+
+	fd = open("/proc/self/mem", O_RDWR);
+	if (fd == -1)
+		return false;
+
+	if (gup_tests(fd, write_addr)) {
+		ksft_print_msg("gup tests failed\n");
+		goto out;
+	}
+
+	ret = true;
+out:
+	if (shdw_addr && munmap(write_addr, SHADOW_STACK_ALLOC_SIZE)) {
+		ksft_print_msg("munmap of shadow stack failed\n");
+		ret = false;
+	}
+
+	return ret;
+}
+
+volatile bool break_loop;
+
+void sigusr1_handler(int signo)
+{
+	break_loop = true;
+}
+
+bool sigusr1_signal_test(void)
+{
+	struct sigaction sa = {};
+
+	sa.sa_handler = sigusr1_handler;
+	sa.sa_flags = 0;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(SIGUSR1, &sa, NULL)) {
+		ksft_print_msg("Registering signal handler for SIGUSR1 failed\n");
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * shadow stack signal test. shadow stack must be enabled.
+ * register a SIGUSR1 handler, then fork a child process which
+ * waits for the signal. Send SIGUSR1 from the parent to the
+ * child and verify that the signal was received by the child.
+ * If not, the test fails.
+ */
+bool shadow_stack_signal_test(unsigned long test_num, void *ctx)
+{
+	int pid = 0, child_status = 0, ret = 0;
+	unsigned long ss_status = 0;
+
+	ksft_print_msg("Exercising shadow stack signal test\n");
+
+	ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0);
+	if (ret) {
+		ksft_print_msg("Shadow stack get status prctl failed with errorcode %d\n", ret);
+		return false;
+	}
+
+	if (!(ss_status & PR_SHADOW_STACK_ENABLE))
+		ksft_print_msg("Shadow stack is not enabled, should be enabled via glibc\n");
+
+	/* register the SIGUSR1 handler; bail out if registration fails */
+	if (!sigusr1_signal_test()) {
+		ksft_print_msg("Registering sigusr1 handler failed\n");
+		exit(-1);
+	}
+
+	pid = fork();
+
+	if (pid == -1) {
+		ksft_print_msg("Signal test: fork failed\n");
+		goto out;
+	}
+
+	if (pid == 0) {
+		while (!break_loop)
+			sleep(1);
+
+		exit(11);
+		/* child shouldn't go beyond here */
+	}
+
+	/* send SIGUSR1 to child */
+	kill(pid, SIGUSR1);
+	wait(&child_status);
+
+out:
+
+	return (WIFEXITED(child_status) &&
+		WEXITSTATUS(child_status) == 11);
+}
+
+int execute_shadow_stack_tests(void)
+{
+	int ret = 0;
+	unsigned long test_count = 0;
+	unsigned long shstk_status = 0;
+	bool test_pass = false;
+
+	ksft_print_msg("Executing RISC-V shadow stack self tests\n");
+	ksft_set_plan(RISCV_SHADOW_STACK_TESTS);
+
+	ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &shstk_status, 0, 0, 0);
+
+	if (ret != 0)
+		ksft_exit_fail_msg("Get shadow stack status failed with %d\n", ret);
+
+	/*
+	 * If we are here, getting the shadow stack status succeeded and
+	 * thus shadow stack support is baked into the kernel.
+	 */
+	while (test_count < RISCV_SHADOW_STACK_TESTS) {
+		test_pass = (*shstk_tests[test_count].t_func)(test_count, NULL);
+		ksft_test_result(test_pass, shstk_tests[test_count].name);
+		test_count++;
+	}
+
+	ksft_finished();
+
+	return 0;
+}
+
+#pragma GCC pop_options
diff --git a/tools/testing/selftests/riscv/cfi/shadowstack.h b/tools/testing/selftests/riscv/cfi/shadowstack.h
new file mode 100644
index 0000000000000000000000000000000000000000..943a3685905f52447e24c97140ed6ad60213f05f
--- /dev/null
+++ b/tools/testing/selftests/riscv/cfi/shadowstack.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef SELFTEST_SHADOWSTACK_TEST_H
+#define SELFTEST_SHADOWSTACK_TEST_H
+#include <stdbool.h>
+#include <stddef.h>
+
+/*
+ * A CFI test returns true for success or false for failure.
+ * Takes a test number to index into the test array, and a void pointer.
+ */
+typedef bool (*shstk_test_func)(unsigned long test_num, void *);
+
+struct shadow_stack_tests {
+	char *name;
+	shstk_test_func t_func;
+};
+
+bool shadow_stack_fork_test(unsigned long test_num, void *ctx);
+bool shadow_stack_map_test(unsigned long test_num, void *ctx);
+bool shadow_stack_protection_test(unsigned long test_num, void *ctx);
+bool shadow_stack_gup_tests(unsigned long test_num, void *ctx);
+bool shadow_stack_signal_test(unsigned long test_num, void *ctx);
+
+int execute_shadow_stack_tests(void);
+
+#endif