
Commit 041aa7a

mrutland-arm authored and Thomas Gleixner committed
entry: Split preemption from irqentry_exit_to_kernel_mode()
Some architecture-specific work needs to be performed between the state management for exception entry/exit and the "real" work to handle the exception. For example, arm64 needs to manipulate a number of exception masking bits, with different exceptions requiring different masking. Generally this can all be hidden in the architecture code, but for arm64 the current structure of irqentry_exit_to_kernel_mode() makes this particularly difficult to handle in a way that is correct, maintainable, and efficient.

The gory details are described in the thread surrounding:

  https://lore.kernel.org/lkml/acPAzdtjK5w-rNqC@J2N7QTR9R3/

The summary is:

* Currently, irqentry_exit_to_kernel_mode() handles both involuntary preemption AND the state management necessary for exception return.

* When scheduling (including involuntary preemption), arm64 needs to have all arm64-specific exceptions unmasked, though regular interrupts must be masked.

* Prior to the state management for exception return, arm64 needs to mask a number of arm64-specific exceptions, and perform some work with these exceptions masked (with RCU watching, etc).

While in theory it is possible to handle this with a new arch_*() hook called somewhere under irqentry_exit_to_kernel_mode(), this is fragile and complicated, and doesn't match the flow used for exception return to user mode, which has a separate 'prepare' step (where preemption can occur) prior to the state management.

To solve this, refactor irqentry_exit_to_kernel_mode() to match the style of {irqentry,syscall}_exit_to_user_mode(), moving the preemption logic into a new irqentry_exit_to_kernel_mode_preempt() function, and moving the state management into a new irqentry_exit_to_kernel_mode_after_preempt() function. The existing irqentry_exit_to_kernel_mode() is left as a caller of both of these, avoiding the need to modify existing callers.

There should be no functional change as a result of this change.
[ tglx: Updated kernel doc ]

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260407131650.3813777-6-mark.rutland@arm.com
1 parent c5538d0 commit 041aa7a

1 file changed

Lines changed: 59 additions & 14 deletions


include/linux/irq-entry-common.h

@@ -438,24 +438,46 @@ static __always_inline irqentry_state_t irqentry_enter_from_kernel_mode(struct p
 }
 
 /**
- * irqentry_exit_to_kernel_mode - Run preempt checks and establish state after
- *				  invoking the interrupt handler
+ * irqentry_exit_to_kernel_mode_preempt - Run preempt checks on return to kernel mode
  * @regs:	Pointer to current's pt_regs
  * @state:	Return value from matching call to irqentry_enter_from_kernel_mode()
  *
- * This is the counterpart of irqentry_enter_from_kernel_mode() and runs the
- * necessary preemption check if possible and required. It returns to the caller
- * with interrupts disabled and the correct state vs. tracing, lockdep and RCU
- * required to return to the interrupted context.
+ * This is to be invoked before irqentry_exit_to_kernel_mode_after_preempt() to
+ * allow kernel preemption on return from interrupt.
+ *
+ * Must be invoked with interrupts disabled and CPU state which allows kernel
+ * preemption.
  *
- * It is the last action before returning to the low level ASM code which just
- * needs to return.
+ * After returning from this function, the caller can modify CPU state before
+ * invoking irqentry_exit_to_kernel_mode_after_preempt(), which is required to
+ * re-establish the tracing, lockdep and RCU state for returning to the
+ * interrupted context.
 */
-static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs,
-							 irqentry_state_t state)
+static inline void irqentry_exit_to_kernel_mode_preempt(struct pt_regs *regs,
+							irqentry_state_t state)
 {
-	lockdep_assert_irqs_disabled();
+	if (regs_irqs_disabled(regs) || state.exit_rcu)
+		return;
+
+	if (IS_ENABLED(CONFIG_PREEMPTION))
+		irqentry_exit_cond_resched();
+}
 
+/**
+ * irqentry_exit_to_kernel_mode_after_preempt - Establish trace, lockdep and RCU state
+ * @regs:	Pointer to current's pt_regs
+ * @state:	Return value from matching call to irqentry_enter_from_kernel_mode()
+ *
+ * This is to be invoked after irqentry_exit_to_kernel_mode_preempt() and before
+ * actually returning to the interrupted context.
+ *
+ * There are no requirements for the CPU state other than being able to complete
+ * the tracing, lockdep and RCU state transitions. After this function returns
+ * the caller must return directly to the interrupted context.
+ */
+static __always_inline void
+irqentry_exit_to_kernel_mode_after_preempt(struct pt_regs *regs, irqentry_state_t state)
+{
 	if (!regs_irqs_disabled(regs)) {
 		/*
 		 * If RCU was not watching on entry this needs to be done
@@ -474,9 +496,6 @@ static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs,
 	}
 
 	instrumentation_begin();
-	if (IS_ENABLED(CONFIG_PREEMPTION))
-		irqentry_exit_cond_resched();
-
 	/* Covers both tracing and lockdep */
 	trace_hardirqs_on();
 	instrumentation_end();
@@ -490,6 +509,32 @@ static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs,
 	}
 }
 
+/**
+ * irqentry_exit_to_kernel_mode - Run preempt checks and establish state after
+ *				  invoking the interrupt handler
+ * @regs:	Pointer to current's pt_regs
+ * @state:	Return value from matching call to irqentry_enter_from_kernel_mode()
+ *
+ * This is the counterpart of irqentry_enter_from_kernel_mode() and combines
+ * the calls to irqentry_exit_to_kernel_mode_preempt() and
+ * irqentry_exit_to_kernel_mode_after_preempt().
+ *
+ * The requirement for the CPU state is that it can schedule. After the function
+ * returns the tracing, lockdep and RCU state transitions are completed and the
+ * caller must return directly to the interrupted context.
+ */
+static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs,
+							 irqentry_state_t state)
+{
+	lockdep_assert_irqs_disabled();
+
+	instrumentation_begin();
+	irqentry_exit_to_kernel_mode_preempt(regs, state);
+	instrumentation_end();
+
+	irqentry_exit_to_kernel_mode_after_preempt(regs, state);
+}
+
 /**
  * irqentry_enter - Handle state tracking on ordinary interrupt entries
  * @regs:	Pointer to pt_regs of interrupted context
