Commit 2a8225e

KAGA-KOKO authored and gregkh committed
sched/cputime: Fix steal time accounting vs. CPU hotplug
commit e9532e69b8d1d1284e8ecf8d2586de34aec61244 upstream.

On CPU hotplug the steal time accounting can keep a stale rq->prev_steal_time value over CPU down and up. So after the CPU comes up again the delta calculation in steal_account_process_tick() wreckages itself due to the unsigned math:

	u64 steal = paravirt_steal_clock(smp_processor_id());

	steal -= this_rq()->prev_steal_time;

So if steal is smaller than rq->prev_steal_time we end up with an insane large value which then gets added to rq->prev_steal_time, resulting in a permanent wreckage of the accounting. As a consequence the per CPU stats in /proc/stat become stale.

Nice trick to tell the world how idle the system is (100%) while the CPU is 100% busy running tasks. Though we prefer realistic numbers.

None of the accounting values which use a previous value to account for fractions is reset at CPU hotplug time. update_rq_clock_task() has a sanity check for prev_irq_time and prev_steal_time_rq, but that sanity check solely deals with clock warps and limits the /proc/stat visible wreckage. The prev_time values are still wrong.

Solution is simple: Reset rq->prev_*_time when the CPU is plugged in again.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: commit 095c0aa "sched: adjust scheduler cpu power for stolen time"
Fixes: commit aa48380 "sched: Remove irq time from available CPU power"
Fixes: commit e6e6685 "KVM guest: Steal time accounting"
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603041539490.3686@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
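The unsigned wraparound described in the commit message can be reproduced outside the kernel. The following is a minimal user-space sketch, not the kernel code: `struct steal_rq` and `steal_delta()` are hypothetical names introduced for illustration, and the steal clock value is passed in as a plain argument instead of being read through paravirt_steal_clock(). It models only the subtraction quoted above.

```c
#include <stdint.h>

/* Hypothetical stand-in for the one struct rq field involved in the bug. */
struct steal_rq {
	uint64_t prev_steal_time;
};

/*
 * Models the subtraction from the commit message:
 *   steal -= this_rq()->prev_steal_time;
 * uint64_t arithmetic is modulo 2^64, so a clock value smaller than
 * prev_steal_time produces a huge positive delta, not a negative one.
 */
static uint64_t steal_delta(const struct steal_rq *rq, uint64_t steal_clock)
{
	uint64_t steal = steal_clock;

	steal -= rq->prev_steal_time;
	return steal;
}
```

With prev_steal_time at 1000, a clock reading of 1500 gives a delta of 500, while a reading of 400 (assumed here to model a clock that reads lower after the CPU comes back up) gives 2^64 - 600.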
1 parent 9835db3 commit 2a8225e

2 files changed

Lines changed: 14 additions & 0 deletions

kernel/sched/core.c

Lines changed: 1 addition & 0 deletions
@@ -5525,6 +5525,7 @@ migration_call(struct notifier_block *nfb, unsigned long action, void *hcpu)
 
 	case CPU_UP_PREPARE:
 		rq->calc_load_update = calc_load_update;
+		account_reset_rq(rq);
 		break;
 
 	case CPU_ONLINE:

kernel/sched/sched.h

Lines changed: 13 additions & 0 deletions
@@ -1770,3 +1770,16 @@ static inline u64 irq_time_read(int cpu)
 }
 #endif /* CONFIG_64BIT */
 #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
+
+static inline void account_reset_rq(struct rq *rq)
+{
+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
+	rq->prev_irq_time = 0;
+#endif
+#ifdef CONFIG_PARAVIRT
+	rq->prev_steal_time = 0;
+#endif
+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
+	rq->prev_steal_time_rq = 0;
+#endif
+}
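
To see why zeroing the prev_* fields at CPU_UP_PREPARE is enough, here is a rough user-space model of the accounting with the reset applied. It is a sketch under stated assumptions, not the kernel implementation: `struct rq_model`, `account_reset_rq_model()` and `account_steal_model()` are hypothetical names, the config guards are dropped, and the tick accounting is reduced to "compute the delta, then add it to prev_steal_time".

```c
#include <stdint.h>

/* Hypothetical model of the three struct rq fields the patch resets. */
struct rq_model {
	uint64_t prev_irq_time;
	uint64_t prev_steal_time;
	uint64_t prev_steal_time_rq;
};

/* Same effect as account_reset_rq() in the patch, minus the #ifdef guards. */
static void account_reset_rq_model(struct rq_model *rq)
{
	rq->prev_irq_time = 0;
	rq->prev_steal_time = 0;
	rq->prev_steal_time_rq = 0;
}

/*
 * Simplified tick accounting (an assumption, not the kernel function):
 * take the unsigned delta against the previous value and fold it back in.
 */
static uint64_t account_steal_model(struct rq_model *rq, uint64_t steal_clock)
{
	uint64_t delta = steal_clock - rq->prev_steal_time;

	rq->prev_steal_time += delta;
	return delta;
}
```

In this model, accounting a clock value of 1000, resetting, and then accounting a lower value of 400 yields a delta of 400. Without the reset, the second delta would wrap around as described in the commit message.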
