Commit 477a235

Gavin Shan authored and gregkh committed
powerpc/eeh: Enable IO path on permanent error
[ Upstream commit 387bbc974f6adf91aa635090f73434ed10edd915 ]

We give up recovery on a permanent error and simply shut down the
affected devices and remove them. If the devices can't be put into a
quiet state, they spew more traffic that is likely to cause another
unexpected EEH error. This was observed on a "p8dtu2u" machine:

   0002:00:00.0 PCI bridge: IBM Device 03dc
   0002:01:00.0 Ethernet controller: Intel Corporation \
                Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
   0002:01:00.1 Ethernet controller: Intel Corporation \
                Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
   0002:01:00.2 Ethernet controller: Intel Corporation \
                Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
   0002:01:00.3 Ethernet controller: Intel Corporation \
                Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)

On the P8 PowerNV platform, the IO path is frozen when shutting down
the devices, meaning the memory registers are inaccessible. This is why
the devices can't be put into a quiet state before removing them. This
fixes the issue by enabling the IO path prior to putting the devices
into a quiet state.

Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1 parent e1db592 commit 477a235

1 file changed

Lines changed: 9 additions & 1 deletion

arch/powerpc/kernel/eeh.c

@@ -304,9 +304,17 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int severity)
 	 *
 	 * For pHyp, we have to enable IO for log retrieval. Otherwise,
 	 * 0xFF's is always returned from PCI config space.
+	 *
+	 * When the @severity is EEH_LOG_PERM, the PE is going to be
+	 * removed. Prior to that, the drivers for devices included in
+	 * the PE will be closed. The drivers rely on working IO path
+	 * to bring the devices to quiet state. Otherwise, PCI traffic
+	 * from those devices after they are removed is like to cause
+	 * another unexpected EEH error.
 	 */
 	if (!(pe->type & EEH_PE_PHB)) {
-		if (eeh_has_flag(EEH_ENABLE_IO_FOR_LOG))
+		if (eeh_has_flag(EEH_ENABLE_IO_FOR_LOG) ||
+		    severity == EEH_LOG_PERM)
 			eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
 
 		/*
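
To make the effect of the changed condition easier to see, here is a minimal user-space sketch of the decision the patched code makes. It is not kernel code: every `sketch_`/`SKETCH_` name and every constant value below is a hypothetical stand-in for the corresponding kernel definition (`struct eeh_pe`, `EEH_PE_PHB`, `EEH_ENABLE_IO_FOR_LOG`, `EEH_LOG_PERM`), and the global-flag lookup done by `eeh_has_flag()` is modelled as a plain integer argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel definitions; values are illustrative only. */
#define SKETCH_PE_PHB            (1 << 1)  /* models EEH_PE_PHB */
#define SKETCH_ENABLE_IO_FOR_LOG (1 << 0)  /* models EEH_ENABLE_IO_FOR_LOG */

enum sketch_severity {
	SKETCH_LOG_TEMP = 1,  /* models a recoverable error */
	SKETCH_LOG_PERM = 2   /* models EEH_LOG_PERM, a permanent error */
};

struct sketch_pe {
	int type;  /* models pe->type */
};

/*
 * Returns true when the patched condition would call
 * eeh_pci_enable(pe, EEH_OPT_THAW_MMIO), i.e. re-enable the IO path.
 */
static bool sketch_should_thaw_mmio(const struct sketch_pe *pe,
				    int global_flags, int severity)
{
	assert(pe != NULL);

	/* The whole block is skipped for PHB PEs. */
	if (pe->type & SKETCH_PE_PHB)
		return false;

	/* Before the patch only the flag test existed; the patch adds the severity test. */
	return (global_flags & SKETCH_ENABLE_IO_FOR_LOG) ||
	       severity == SKETCH_LOG_PERM;
}
```

Under these assumptions, a non-PHB PE with no global flag set is now thawed when the severity is permanent, which is the case the commit message describes for the P8 PowerNV platform. A PHB PE is still left untouched by this block.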